dart-lang / language

Design of the Dart language
Other
2.61k stars 200 forks source link

Augmentations in Part files with imports, "augmentation path property" #3849

Open lrhn opened 1 month ago

lrhn commented 1 month ago

TL;DR: The path requirement on augmentations disallows augmentations in sibling parts. We should consider making that a lint, not a language requirement.

The specification of "Parts with imports" (https://github.com/dart-lang/language/pull/3800) which combines augmentation libraries and part files into on concept, adds a restriction that requires augmenting declarations to be "below" the prior base or augmenting declaration that they are applied after. Here "below" means later in the same file or in the part-file subtree of a part file of the same file.

Because every augmenting declaration has to satisfy that, relative to the previously applied on, the transitive effect is that the files that contain augmentations of the same declaration must be on a single downwards path in the part-file tree of the library, starting at the file of the base declaration. Two "sibling" part files cannot contain augmentations of the same declartion.

That requirement was added beause satisfying it made part directive ordering inside a file unrelated to augmentation application order, because all augmentations would be in a single sub-part. Which is nice.

It has the side-effect that you can't augment a declaration in a sub-part file and apply a macro to the base declaration which also augments the declaration, because the macro generated augmentation is put into a separate sub-part file which will be a sibling of the manual augmentation.

That's probably not a big problem, most people won't be augmenting their own declarations. They can just write the result of the augmentation directly (except for a few potentially useful cases, where you'd write the augmentation right beside the thing it augments, in the same file, and then there is also no problem).

It's still a little worrisome, and it means that macro generated augmentations must be collected into a single part file. If not for this, they could have been in individual files, one per macro per phase, and the only practical problem with that would be showing them when debugging. But with the path-requirement, it's not even allowed.

Proposal

I suggest that we remove the "path requirement", so that the only requirement is that an augmenting declaration must be below the base declaration it augments. (Even that isn't technically necessary, but it makes both reasoning about the code and specifying augmentation easier.)

Then we add a lint which warns if an augmentation is not below the previously applied augmentation. It's just a suggestion, it can be ignored, and the lint ignores augmentations generated by macros (generally shouldn't lint code the user didn't write). If you ignore it, you can't necessarily reorder part directives of a file, but then you probably know what you're doing.

The augmentation specification has to specify an augmentation application order, which is now (a little) more complicated because we can't assume that the augmentations are ordered one below another.

Definition: A declaration, A, in a library is before another declaration B (equivalently B is after A) in the same library if and only if:

This defines a total ordering on declarations of a library. (Proof sketch: Take any two declarations. Induction on tree structure, starting at the library file whose sub-tree must contain both. Continue downwards as long as neither is in the current file, and not in different part sub-trees, because then they must be in the same part.)

The augmentation application ordering of augmentations of the same base declaration is to apply an augmenting declaration after all augmenting declaration that are before it and before all that are after it.

jakemac53 commented 1 month ago

I definitely think it needs to be a requirement that we allow sibling parts to augment the same declarations. It is even a requirement to allow them to augment new declarations from sibling parts which were listed before them. A library macro can augment anything in the library.

lrhn commented 1 month ago

Which means that "path restriction" we semi-agreed to in the language-team meating is too strict if it also applies to macro-generated part files. And if it's a language feature, it does.

I'll update the "parts with imports" PR to remove it as a requirement, and suggest it as a (recommended) lint instead. ... and done!

This will have consequences. If all macro generated code of an entire library is appended as a final part file of the library file, then it comes after all other augmentations. However, if macro generated code is inserted in each file file where the macro application annotation occurs, then that can introduce macro generated code that is earlier in augmentation application order than user code. Example:

library lib1;
part "p1.dart";
part "p2.dart";
// p1.dart
part of "lib1.dart";
// part 'macro-generated+package:mypackage/lib/src/p1.dart'; // <- inserted here?
@magicMacro(#foo)
class C {
  int foo() => 42;
}
// p2.dart
part of "lib1.dart";

// ignore: augmentation_ordering
augment class C {
  augment int foo() {
     if (DateTime.now().millisecondsSinceEpoch.isEven) return augmented();
     return -1; // :P
  }
}

If the magicMacro introduces augmentations of the foo methods of the C class as a part file in the p1.dart file, the later sibling part, p2.dart then gets to augment that declaration (and any agumentations), because it's later in augmentation application order than everything in p1.dart.

While possibly convenient, it implies that introspection by @magicMacro if foo method should show the state after completing augmentations of p1.dart, but not include the augmentations of p2.dart, because the macro's augmentations are not applied to that class.

The alternative, and what I think we do today, is to collect all macro generated code from all parts ("augmentation libraries") in a final include from the library file, and we apply all user-written augmentations before the macro gets to do introspction. That's also consistent.

What must not happen is that the macro introspection sees a different declaration than the one its generated augmentations are actually applied to. That would be a source of subtle bugs.

jakemac53 commented 1 month ago

That is indeed quite an interesting situation.... I would need to review the current augmentation spec in depth to try and understand if it has a consistent design here or not.

But, I tend to think that introspection by @magicMacro if foo method should show the state after completing augmentations of p1.dart, but not include the augmentations of p2.dart, because the macro's augmentations are not applied to that class. is the right answer here?

This gives you a way to augment macro generated code which is nice (put the macro application in an earlier part, then you can augment it in a later part). And it also makes macro code "not special" - a macro in a part file can't do something that hand written code in that same file couldn't do.

jakemac53 commented 1 week ago

Circling back here, I do think it is a requirement that a macro running in p1.dart cannot see anything from p2.dart, if it is going to inject an augmentation into p1.dart and not at the top level of the library. Otherwise, it could try to augment something that it can't actually augment.

The alternative, and what I think we do today, is to collect all macro generated code from all parts ("augmentation libraries") in a final include from the library file, and we apply all user-written augmentations before the macro gets to do introspction. That's also consistent.

I think that doing this is probably the only sensible thing - I don't actually know how we could uphold the property above otherwise, especially in the case of library cycles.

Library cycles require that we complete phase 1 in all libraries, then do phase 2 in all libraries, then phase 3 etc. This would mean we couldn't block a macro in p1.dart from seeing stuff introduced in p2.dart without some really complicated logic to explicitly hide stuff, because it would have already been merged into the compilers model of the declaration.

So, we should only produce a single macro augmentation part for each entire library. Macro applications may appear in any part, but are just collected and ran together essentially if they were all in the main library. That means they can see and augment declarations from any part.

lrhn commented 1 week ago

SGTM. It shouldn't change anything for the model exposed to macros, it's just that the augmented declaration it can reflect on is not (necessarily) the completely augmented declaration.

Also worth remembering that there is no program where the later augmentations are applied directly to the pre-macro declaration. The program is not considered a Dart program until after macros have run. (Well, not except during development before adding the macro annotation, but a program in devlopment can be arbitrarily far from the final result.)

jakemac53 commented 1 week ago

Also worth remembering that there is no program where the later augmentations are applied directly to the pre-macro declaration. The program is not considered a Dart program until after macros have run. (Well, not except during development before adding the macro annotation, but a program in devlopment can be arbitrarily far from the final result.)

Can you elaborate on what you mean by this?

lrhn commented 6 days ago

I ususally say that the only code which can see the result of a partial augmentation application (a base with some, but not all, augmentations applied) is the next augmentation being applied.

That only applies to entire Dart programs, which are the only ones we can actually assign Dart semantics to.

Here we are inserting an augmenting declaration into the program at a point where it may apply to a partial augmentation application, and may be further augmented by already existing later augmenting declarations.

That doesn't mean that we change the program, so that a later augmenting declaration that used to apply to the partial application according to language semantics, will now apply to the partial application plus the new augmenting declaration inserted by the macro.

It's not a change because there is no before. The input to macro processing is not a program that has a Dart semantics. It's something that we (optimisitically) try to derive some properties of that we think might correspond to actual Dart semantics, but it's not real Dart semantics because those only apply to complete programs. There is no program where the later augmentation applied directly to the prior declaration. We could very well have stopped augmentation application at the point where we run a macro, only to continue it afterwards, and the only prior declaration the next augmentation ever sees is the one inserted by the macro.

Because macros see partial programs, it makes sense that the macro source inspection can also see the (preliminarily assumed) result of a partial augmentation application, because for all we know, all augmentations are partial until the macro has been applied. It's not something that happens in the middle of compiling, it's a pre-processing step that turns something not-Dart into Dart. After that, we can treat it as real Dart code where all the nice properties apply. (Well, it might happen during compilation, but that's just us being clever and optimizing and reusing the partial analyses we got during macro execution. From a specification perspective, it makes no sense to assign semantics to something which is not completely valid Dart.)

... That sounds like the way to process macros would be to go through all declarations in augmentation application order, pre-order depth-first, and when all sub-parts have been processed and their macros run, the macros of the current file are run. That ensures that all earlier declarations have been generated when the macros for a file is run, which means that all earlier augmentations can be applied before running macros that may be augmenting on top of them, and there won't be any more earlier declarations added at a later point. So generate macros in depth-first post-order traversal, because the result of a macro is adding a last part to the current file applying after all the earlier declarations.

jakemac53 commented 6 days ago

I definitely see the appeal in wanting to run macros in augmentation application order (pre-order, depth first). However, I am not sure it actually gives the desired result.

Consider the following:

@Equality() // adds ==, in terms of the fields (but not getters)
class A {
  @WithBackingStore() // creates a private `_x` field for whatever reason.
  int get x;
}

Today, the above works fine because we always run the inner macro first. The backing store is created prior to the equality macro running, and everything is fine and dandy. Or, it works because Equality() is implemented to add the definition in the final phase, after which the class is currently guaranteed to be API complete (but sometimes the API itself must be described in terms of the fields, for constructors or copyWith).

Consider that with augmentations, any of these macro applications could be in a different part, and these ordering guarantees about "outer macros run first" no longer apply, opening the door to all kinds of weirdness.

One option could be to just remove those ordering guarantees. This does make copyWith and constructors funky.

At a minimum I think we need to run each phase in sequence across the entire tree though, so for instance apply macros within each phase as a post-order depth first traversal of the augmentations, but fully complete each phase for the entire library before continuing to the next phase.