dart-lang / language

Design of the Dart language
Other
2.61k stars 200 forks source link

augmented modifier for members? #3580

Open johnniwinther opened 5 months ago

johnniwinther commented 5 months ago

When generating macros in three phases, the situation can occur where the second phase adds unimplemented members that are implementated in phase 3. For instance we might have

@AddMemberMacro()
class Class {}

from which the second phase creates

augment class Class {
  @AddMemberBodyMacro()
  int method();
}

with the intent for the third phase to create

augment class Class {
  augment int method() {
    print('Hello');
    return 42;
  }
}

to complete the class.

A problem arises from this because outlines (aka summaries) are created after the second phase, and with that, the checking of the class hierarchy. For the example above this leads to an error being reported that the non-abstract Class has an abstract member method.

Currently macros have to avoid this either by generating a dummy body:

augment class Class {
  @AddMemberBodyMacro()
  int method() => throw UnimplementedError();
}

or mark the member as external:

augment class Class {
  @AddMemberBodyMacro()
  external int method();
}

Both have the problem that if the macro fails to add the intended body, no error is reported. (The support for external members is very much dependent on the backend so we might get an error at compile-time or runtime.)

I suggest that we add an augmented modifier to use in this scenario:

augment class Class {
  @AddMemberBodyMacro()
  augmented int method();
}

In the class hierarchy checking this should be handled as if the method being non-abstract. After phase 3, a check should be made that the member was in fact augmented to have a body.

A suggested alternative is to simply assume the "augmented" modifier on abstract members added in phase 2, but this poses a problem for when the member was intended to be abstract (adding members to interface classes for instance) or when the member was intended to be non-abstract but was added to an abstract class. We wouldn't be able to tell these apart without the explicit modifier.

johnniwinther commented 5 months ago

cc @jakemac53

lrhn commented 5 months ago

So the issue is that macros, and augmentations in general, allows some declarations to be incomplete in ways that would currently be an error, as long as an augmentation of that declaration adds the missing parts.

For augmentations, that's not (necessarily) an issue, because the entire library is assumed to be available at compile-time, and all the augmentations are applied syntactically before we need to check if the declaration is complete.

Macros, with their three-step staging, introduce steps where the program might still not be complete, because not all augmentations are known yet. We can't say whether a program is complete until after the final step, when all augmentations have been generated, and applied.

Here, it introduces an ambiguity because int method(); is a valid complete declaration of an abstract method, but potentially an invalid declaration in a non-abstract class. We can't see that it's incomplete at the point where we'd normally check that.

We could check later, but if we write outlines before step 3, and the outline includes information about whether the member is abstract or not, then that's a commitment that is not guaranteed by the source at the time. But we also don't want to do type inference before writing outlines, because we want writing outlines to be self-contained, which means syntactic. And we don't have the syntax yet.

So the suggestion is to introduce a new modifier, like abstract and external, that can be put on declarations-without-implementation, this one meaning that the body will be supplied by an augmentation. A placeholder which counts as having a body, and which will be an error later if there ends up being no body. Basically, the ability to distinguish an abstract method from an not-yet-augmented-with-a-body method.

And it would then be an error to augment an abstract method with a body, because that would also violate the API written into the outline.

It's possible, but I also worry that it will make augmentation more verbose. Using augmented for it would require writing augmented in the main library, which means making it a built-in identifier too, not just a contextual keyword in bodies of augmentation members in augmentation files.

And it requires adding augmented before abstract declarations that would be invalid by themselves, but not before ones that would override a super-member, and would therefore be valid as abstract. That distinction can be confusing.

Say I write something like

@ValueClass()
class Point {
  final int x;
  final int y;
  augmented JsonMap toJson();
  augmented String toString(); // Not needed, because implemention exist
}

That said, there is something reasonable about allowing augmentations to override existing members as they please, but still distinguishing the situation where the original knows that it will be augmented.

So let's say we do this:

Going back to the original example:

@AddMemberMacro()
class Class {}

from which the second phase creates

augment class Class {
  @AddMemberBodyMacro()
  augmented int method();
}

with the intent for the third phase to create

augment class Class {
  augment int method() {
    print('Hello');
    return 42;
  }
}

The augmented goes away in the second augment int method().

If there was an intermediate 2.5th augmentation of

augment class Class { 
  @mustCallSuper
  augment int method();
}

it would be invalid because the method is not abstract (as declared by augmented int method();), so this declaration would have to either add a body or a prefix augmented to keep the method non-abstract.

Or we can go further back and consider whether we should have a completely different syntax for a non-abstract-no-body-yet constructor, so int foo(); is not ambiguous. Currently that syntax means "Add int foo(); to the interface, not to the class", whereas augmented int foo(); means "Add int foo() to both interface and class, let me get back to you with the body". If the "must be augmented" isn't a signal we want to have independently of this issue (and I'm not sure it is), then just having a syntax for "no body yet" might be cleaner.

Consider, strawman syntax, int foo() ...;. That's a way to write a non-abstract method without a body (yet). That declaration will always be invalid if no augmentation is applied which provides a body, and it declares, up-front, that a body will be supplied. It only makes sense in positions where a body can be written. Probably variables too, final int x = ...; where a later augmentation can provide the initialization expression.

Not as verbose. Fairly readable (IMO). Does use some of our limited syntax, rather than just adding from an almost limitless supply of modifiers.

jakemac53 commented 5 months ago

It's possible, but I also worry that it will make augmentation more verbose.

Yes, this is a definite worry and why this keyword doesn't exist today.

Augmentations can add entirely new declarations, without anything special on the original class. It feels weird if you have to put something special on members, when augmenting those is probably less unexpected if anything (from a user perspective).

It sounds like maybe we have to run phase 3 during summary generation?

scheglov commented 5 months ago

In the analyzer we run all 3 phases during linking summaries, because even the definitions of the macro-generated library augmentation are based on elements, and so the whole code of the augmentation, including bodies, is a part of the linked summary.

So, for the analyzer we don't need this augmented promise, if it is augmented, we will know it.

johnniwinther commented 5 months ago

The intent with the augmented modifier was not to require it to be able to augment but just to allow generating concrete methods without a body with the check that they will eventually have a body. (This is actually how we currently use external in our patching system, but without the later check which we could have benefitted us from time to time). So it would not be needed from regular hand-writing augmentation libraries.

lrhn commented 5 months ago

If this is only a macro issue, it should probably not be a language feature, just a macro code generation API feature. When adding the step 2 augmentation member, it could be required to specify whether there will be a step 3 body or not. I'm not familiar with the API used to add members, but would something like that be possible?

Would it be useful for step 1 as well, having to say whether there will be more in later steps, or if this is it?

jakemac53 commented 5 months ago

@johnniwinther what makes the macro case different from a user writing this?

class Thing {
  @ProvideDefinitionInPhase3()
  void doSomething();
}

We want that to be possible as well, as long as the macro fills in a body.

johnniwinther commented 5 months ago

@jakemac53 Interesting point. It has the same problem if we don't generate the body until phase 3.

Can we tell by the interface of the macro that it is going to generate a body?

jakemac53 commented 5 months ago

Can we tell by the interface of the macro that it is going to generate a body?

Not definitively, no. There are two issues:

So, you could tell whether "potentially" a macro might fill it in, but not for sure whether it will.

lrhn commented 1 week ago

The more I think about this, and talk to @johnniwinther, the more I like the idea of declaring a non-augmenting declaration as "non-abstract", even if it looks abstract.

It's not just for macros. If I split my class into several parts using augmentations, then the top-level declaration of

abstract class Foo {
  State initializeState();
}

may look like initialize is abstract, but I know that it is filled in by an augmentation in a sub-part somewhere.

I can write

  State initializeState() => throw UnimplementedError("Replaced by augmentation");

but if I make a mistake and don't actually augment it, that turns my programming error into a runtime error.

So, how about this completely outrageous strawman:

 State initializeState() TODO;

We make TODO a contextual keyword, which we can use where a method body could otherwise be (where an async or sync* contextual keyword could occur, just not followed by a real body.

This is a method without a body, but it's a non-abstract method without a body, so it's a compile-time error if there is no augmentation in the program that adds a body to the method.

You don't have to write the TODO. An augmentation can still augment an abstract declaration with a body (unless that breaks summaries or something).

Less strawman-y ideas:

  State initializeState() =>;  // Has "no body". The sequence `=>;` is recognized specially, just like `TODO;`.

or

  State initializeState() augment; // Meaning "augment here!"

or

  State initializeState() late; // Meaning "body supplied later!"

The last one is probably a bad(der) idea, since it mixes the concepts of runtime lateness with compile-time lateness.

It's harder to find syntax for variable initializers. Take:

  int? foo;

which is augmented with an initializer later.

 int? foo TODO;

doesn't work, because it's ambiguous for final foo TODO; (the type being optional means we have to be able to determine the parsing by the token after the name. We can't use a context-sensitive keyword or built-in identifier there.)

 int foo =;

looks more promising, like =>; it uses the starting syntax of the thing that's omitted, then omits the content.

(If we do nothing, users can always choose to do Never get TODO => throw UnimplementedError("Not augmented"); and State initializeState() => TODO;, with all the problems that entail, but with some actual documentation effect.)

jakemac53 commented 1 week ago

Could we allow something like a !abstract keyword, to opt out of the implicit abstractness? In general if we need a keyword for this that could be fine. It does seem unfortunate though.

You would end up seeing this a LOT in macro generated code - and then an augmentation of the member immediately below it. Pretty weird stuff.

lrhn commented 1 week ago

It's very, very handy to just write int foo(); and not have to do the abstract int foo(); redundancy. Except that with augmentations it's no longer redundant. :disappointed:

I'd really prefer to not force everybody to write abstract in front, and migrate the world, so having some way to say "this member isn't really abstract, it just looks like it, but an augmentation will come along and fix it" seems useful.

I don't think you should have to use that either, it's mostly for documentation and sanity checking. Non-macro augmentations can always see all the augmentations, so they know whether there is a body or not.

It'd be a marker that makes it a compile-time error to not have a body at the end of augmentation. Like (a langauge feature like) @override makes it an error to not be an override, but if you don't write anything, it's still fine. And then macros can check for the marker, and add them in phase 1 or 2 if they intend to add a body. And we might be able to make it an error for a macro to add a body if it hasn't declared that it wanted to. That's macro API stuff.

The final proof of validity is still that the complete program compiles. If it ends up being valid, even without the needed information, that's nice. (If modular compilation works, obviously.)