dart-lang / language

Design of the Dart language
Other
2.67k stars 205 forks source link

How can we augment an abstract variable? #4017

Open eernstg opened 3 months ago

eernstg commented 3 months ago

tl;dr   Allow augmenting an abstract variable by a concrete getter and/or setter, or another abstract variable   rd;lt

The augmentation specification has the following rules:

An abstract variable declaration is equivalent to an abstract getter declaration, and if not final, also an abstract setter declaration.

... abstract variables cannot augment their initializing expression, since it does not exist ...

It is a compile-time error if:

  • ...
  • An abstract variable is augmented with a non-abstract variable, getter, or setter.

Note that there's nothing new about the first rule, this is also the way abstract variables are specified in the feature specification. So it makes sense to expect that an abstract variable should be treated in exactly the same way as the corresponding abstract getter/setter pair, or at least very similarly.

Does this mean that we cannot augment an abstract variable declaration at all .. other than by another abstract variable, or an abstract getter and/or setter, e.g., in order to add metadata or DartDoc comment?

This seems to be unnecessarily strict: If the abstract variable is indeed equivalent to an abstract getter and possibly an abstract setter then we should be allowed to augment the getter and/or setter just like any other getter and/or setter. We do have this, in the section about augmentation of getters and setters:

The augmenting declarations can themselves be an implicitly induced getter or setter, or an explicitly declared one.

We may even be allowed to augment the abstract variable with another variable declaration:

Augmenting a variable with a variable: ... [not an error] ...

We should be able to use this to add metadata or DartDoc comments.

This is more than we can do with explicitly specified getters and setters:

Augmenting a getter and/or setter with a variable: This is a compile-time error in all cases.

It was mentioned (in commentary) that we cannot augment the initializing expression, but this is consistent with the perspective that the abstract variable is (precisely) an abstract getter and possibly an abstract setter, so it makes sense that we cannot augment the storage part of a (non-existing) variable.

In summary, I believe we should change the compile-time error mentioned above to the following:

It is a *compile-time error if:

  • ...
  • An abstract variable is augmented with a non-abstract variable.

@dart-lang/language-team, WDYT?

jakemac53 commented 3 months ago

I recently updated this section, see https://github.com/dart-lang/language/pull/4000 for the changes.

I think the current status is what we want.

jakemac53 commented 3 months ago

The tldr; is that you are allowed to augment abstract variables with abstract variables, getters, and setters. But not non-abstract ones.

This is a bit annoying (not being able to augment them with a non-abstract getter/setter), but it makes sense because you are not allowed to augment abstract getters/setters in general, so we are treating these implicit ones no differently.

The suggestion for now is to instead mark the variable as external. It isn't a great solution but works 🤷‍♂️ .

eernstg commented 3 months ago

That is surprising to me. I would expect that we can augment an abstract getter by a concrete one and vice versa (which is by the way what the analyzer does, too):

const metadata = 1;

class A {
  int get a;
  augment get a => 1;

  int get b => 1;
  @metadata augment get b;
}

So we can provide the interface (the unimplemented declaration) of the getter first, in order to let some other party (e.g., a macro) provide the implementation. We can also provide the concrete declaration first, and then add in a DartDoc comment, and we can let the augmenting declaration be abstract in this case (which is nice if we don't want to change the implementation). Why would these usages be errors?

An external variable declaration could be very similar to a regular storage-based variable declaration: There is a concrete getter and possibly a concrete setter, and there's nothing that prevents augmenting them with abstract or concrete getters/setters (other than error messages that we don't have to emit ;-).

(Right now, I'm looking for something in the feature spec that says that we can't override an abstract getter by a concrete one...)

eernstg commented 3 months ago

Didn't find anything so far.

I believe we could also omit the following compile-time error:

An external declaration is augmented with an abstract declaration. For variables this also applies to the implicit getter and setter.

const metadata = 1;

external void foo();
@metadata augment void foo();

external int i;
augment get i => augmented + 1;

The analyzer doesn't actually accept the above because it is a syntax error for the library function to have no body. However, this one is accepted:

const metadata = 1;

class A {
  external void foo();
  @metadata augment void foo();
}

class B {
  external int i;
  augment int get i => augmented + 1;
}

It could be argued that we have an overriding principle ("augmenting declarations must restate all modifiers"), but if the starting point is that the external part of the external declaration is actually a representation of the body then it doesn't work like those other modifiers (such as sealed, base, abstract on classes), and we do allow elements of an augmentation stack to have different bodies (or no body).

Similarly, the abstract modifier on a variable declaration means that it is a different notation for two abstract declarations (a getter and a setter, and they don't have the keyword abstract even though they are abstract), so it should be allowed to occur accordingly (that is, if and only if we could have written the getter/setter declarations manually).

eernstg commented 3 months ago

By the way, I think we should maintain that the declarations with no body are 'unimplemented', rather than 'abstract'. The former is true, and the implementation may be omitted entirely (in which case the definition is indeed an abstract member of the enclosing definition), or it is provided by an augmentation (in which case it's concrete), or the implementation is missing and must be present (in which case the definition is an error).

jakemac53 commented 3 months ago

Whenever the augmentation library spec uses the term "abstract" it means, explicitly abstract. Or at least that is how I have always interpreted it.

So yes, absolutely you can have a getter with no body, and augment it with a body. But if that original getter was explicitly marked abstract (with the explicit keyword) then you cannot.

We don't allow changing the meaning of things via augmentation, and if something is explicitly marked abstract that is actually a meaningful property.

jakemac53 commented 3 months ago

We could possibly say that an abstract field introduces a potentially abstract getter and/or setter? Instead of specifying that it must be abstract? And then we could allow augmenting it with getters/setters that do have bodies?

But, this feels a bit weird to me. When somebody writes abstract int x; today, you know this is really just enforcing an interface which must be fulfilled by some subtype. It feels a bit weird if that can introduce members which are not abstract. Also, in that scenario would we allow augmenting it with a non-abstract getter, but not a setter? So it is sort of half implemented?

lrhn commented 3 months ago

If you can write int get x; augment int get x => 42;, then you should also be able to write abstract final int x; augment int get x => 42;.

An abstract (instance) variable declaration is equivalent to an abstract getter declaration and an abstract setter declaration.

But, this feels a bit weird to me. When somebody writes abstract int x; today,

Today we don't have augmentations. If someone writes something, it's what you get. With augmentations, it may or may not be what you get.

you know this is really just enforcing an interface which must be fulfilled by some subtype.

It's introducing a getter and a setter to the interface. No more or no less.

From a specification stand-point, I'd like to treat abstract final int x; exactly the same as int get x;. They both introduce the same abstract getter definition, and such a getter definition can be augmented by an ... augment int get x ....

I'd allow an abstract final int x to be augmented by:

Of these, only the first is not obvious, but it's consistent to treat an augment abstract final int x; the same as an augment int get x;. It's the same thing, with or without augment.

jakemac53 commented 3 months ago

I see that I was also confused above, because I was thinking that abstract int get x; was a thing, which it is not. You cannot explicitly declare a getter as being abstract at all. The class can be abstract which allows a getter or setter to not have a body, and that is it.

So, I do now agree that abstract final int x; should be exactly the same as int get x;. And it should be augmentable with a body. In fact, all abstract getters and setters should be augmentable with a body.

I think that means I need to undo some stuff I landed earlier but that is fine, I can work on it.

leafpetersen commented 3 months ago

In fact, all abstract getters and setters should be augmentable with a body.

Presumably then the same for methods?

jakemac53 commented 3 months ago

Presumably then the same for methods?

Yes

eernstg commented 3 months ago

Here's the conceptual semantic model that I like to use to describe what these augmentations of properties are doing: A property may contain 4 components, namely a getter implementation, a setter implementation, a storage location, and an initializing expression. It also has a type, but this is a separate topic so I'll just focus on the actual run-time entities here.

For the getter implementation, we can have a stack of contributions. Each of them can invoke the previous one using augmented, if there is one. Similarly for the setter implementation.

The storage location is present or not present (we don't want to have several because all but one of them will be a space leak). So every contribution of a storage location will just serve as a constraint that it exists. A storage location is always provided by a declaration that also has an implicitly induced getter, and possibly setter.

The initializing expression is handled similarly: There are zero or more contributions, and if there are multiple then the non-first ones can evaluate the previous one using augmented. If there are no contributions at all then it is an error if the type of the property is potentially non-nullable, and it is implicitly provided as null if the type is nullable.

We then have a matrix where each declaration is a row with 4 elements, representing a stack consisting of one introductory declaration plus zero or more augmenting declarations.

In the following, a setter/getter which is implicitly induced is marked as G (for 'generated'). A setter/getter which is written manually is marked as M. The storage location is marked as '*' if present, and blank space otherwise.

Finally a declaration that contributes an initializing expression is marked by '*', and one that doesn't has a blank space.

// Components                    // Get Set Storage Initializing expression
int x;                           // G   G   *
augment var x = 10;              // G   G   *       *
augment get x => 42 + augmented; // M
augment set x(_) {}              //     M

We're reading the matrix from below, choosing the relevant column. An empty space means "same as above".

This means that the semantics of x is that it has storage, it has a getter implementation that consists of a manually written getter that invokes the previous getter (the implicitly induced one for the given storage). The implicitly induced getter doesn't call the previous one (which is by the way the same one).

The property has a manually written setter that doesn't invoke the previous setter (that is allowed, too).

Finally it has an initializing expression because one was contributed by the first augmentation.

Note that we can also choose to create this property starting with an abstract variable, which means that there is no storage, no getter implementation, and no setter implementation initially (but they are all added by the first augmentation so the end result is the same):

// Case one: General example.

// Components                    // Get Set Storage Initializing expression
abstract int x;                  //
augment var x = 10;              // G   G   *       *
augment get x => 42 + augmented; // M
augment set x(_) {}              //     M

We can also choose to implement the abstract variable by the concrete variable, and stop there:

// Case two: A standard abstract-then-concrete approach.

// Components                    // Get Set Storage Initializing expression
abstract int x;                  //
augment var x = 10;              // G   G   *       *

It should not make any difference if we choose to introduce the getter and setter separately:

// Case two dot five: A standard abstract-then-concrete approach.

// Components                    // Get Set Storage Initializing expression
int get x;                       //
set x(int _);                    //
augment var x = 10;              // G   G   *       *

Similarly, we can choose to implement the abstract variable using a getter and a setter, in which case the resulting property does not have storage and does not have an initializing expression:

// Case three: Another abstract-then-concrete approach.

// Components                    // Get Set Storage Initializing expression
abstract int x;                  //
augment get x => 42 + augmented; // M
augment set x(_) {}              //     M

It makes sense that we can't specify the initializing expression without requiring that the property has storage (because we'd want to store the result of evaluating that expression at that storage location), but this is enforced because we can only specify an initializing expression by means of a concrete variable declaration with an initializer.

Here is a case that @lrhn and I discussed this afternoon, which seems somewhat inconvenient or weird:

// Case four: Weird.

// Components                    // Get Set Storage Initializing expression
abstract int x;                  //
augment get x => 42 + augmented; // M
augment set x(_) {}              //     M
augment var x = 10;              // G   G   *       *

The semantics of this declaration is just like int x = 10;. It has the generated getter and setter (which means that we will never be able to call the manually written ones), and then it has storage and an initializing expression.

The inconvenient part is that we cannot use this to specify the initializing expression without "killing" the manually written getter and setter, unless we are somehow able to change the order of the declarations.

We could allow for an augmenting property declaration to specify an initializing expression and not discard any existing getter and/or setter implementations by specifying that this particular property declaration doesn't have a getter or setter implementation. But that's exactly what we can already do using abstract!

// Case five: Allow "late" contribution of initializing expression.

// Components                    // Get Set Storage Initializing expression
int x;                           // G    G    *
augment get x => 42 + augmented; // M
augment set x(_) {}              //     M
augment abstract var x = 10;     //                 *

OK, that is weird. But it is consistent.

In order to do this we'd need to allow an abstract variable declaration to have an initializing expression, at least when it is an augmenting declaration. This similar to a number of other syntax adjustments that are needed in order to allow a number of relevant and useful combinations of elements in declarations that occur in a stack of augmentations.

However, I expect that the abstract-then-concrete scenarios will be quite common, and it is nice that the abstract property can be implemented by a getter/setter pair or by a variable, that choice is up to the augmenter and not baked in already with the introductory declaration.

I also expect that the "provide or augment an initializing expression after getter-and-or-setter" scenario will be less common, which could serve as an argument why it's a less serious problem that the syntax for that case may be a surprise to some developers.

I know that @lrhn and I do not agree on this interpretation, but I wanted to present one way to look at this semantics that seems consistent and comprehensible to me. Perhaps some other model comes up which is consistent as well, but has other properties, which is fine.

To me the main point is simply that there must be a story that we can tell the world such that everyone can think about this in a way which is reasonably simple. At the same time, the semantics must be useful.

So WDYT?

jakemac53 commented 3 months ago

Note that we can also choose to create this property starting with an abstract variable

We could choose to allow that yes but the question is why? The abstract keyword is totally superfluous, and only serves to confuse IMO. If you want a backing store, don't declare it as abstract.

This also isn't entirely a pedantic choice. There are meaningful differences between getter/setter pairs and variables with backing stores, in particular when it comes to constructors and introspection on programs. Allowing something to change between having a backing store and not via augmentation (and thus during macro execution) means that introspection results would change between macro phases which is an explicit thing we want to avoid.

Consider for example a data class macro, it needs to create a constructor parameter for each non-abstract field (which doesn't have an initializer), so that they can be initialized. It needs to do this in phase 2 (when it can declare the constructor signature). However, in phase 3 if some other macro could come along and augment an abstract field with a non-abstract field (without an initializer), now the constructor is not initializing all the fields.

Even without macros, there are weird cases here, consider the following:

abstract int x;
augment int get x => 1;
augment int x = 2;

Ultimately while sure you can create some consistent meaning for this, the use cases are not compelling. I don't think it leads to a better language.

We also don't need the pure getter/setter vs field symmetry when it comes to augmentations, because augmentations only apply to declarations you own. There is no concern about something from a superclass changing from a field to a getter/setter, and that being a breaking change. Since you can only augment things you declare, you are always in control.

eernstg commented 3 months ago

The most important consideration as I see it is that the mechanism is conceptually accessible. That is, we should have a story about what it means to declare and augment a property which makes sense, and which is easy for developers to remember and use in practice.

It is also important that the resulting feature has a reasonable amount of expressive power and flexibility, and that it handles important cases well.

I think the model I mentioned could work. Surely, there could be other models, and some of those other models might be better in various ways. So, by all means, let's get them on the table.

However, I don't think the current specification makes a clear model available. When I created https://github.com/dart-lang/language/issues/3958, it was actually in response to a general sense of confusion: "How would I know whether this particular sequence of augmentations is allowed? How would I know what it does?"

The model I'm proposing above considers a property to be a 4-tuple containing a getter implementation, a setter implementation, a constraint that says "must have storage" (so we can require it twice and it makes no difference), and an initializing expression. All but storage are provided as a stack of contributions. Different declarations provide different subsets of the 4-tuple (abstract int x; provides none of them, int x = 1; provides all four).

In the end, the semantics of the property is that (1) it has storage if and only if storage is required by any element in the augmentation stack; (2) the getter is built from all contributions based on the textual ordering and the use of augmented (yes, it is possible to ignore the rest of the stack by not invoking augmented); (3) the setter is similar; and (4) the initializing expression is built from the stack of contributions, similarly to the getter and setter.

If there are no contributions at all then the resulting property is abstract (it may be an abstract getter or an abstract getter/setter pair). Otherwise, if there are no contributions to the initializing expression then (1) if the property has storage and the type is nullable then null is implicitly provided as the initializing expression; (2) if the property has storage and the type is potentially non-nullable then an error occurs; (3) if the property has no storage, but it does have an initializing expression, then an error occurs; (4) no action is taken if the property has no storage and no initializing expression, that's fine.

The abstract keyword is totally superfluous

As I see it, the abstract keyword reserves the freedom of the augmentations to choose whether it will be implemented as a getter/setter pair (or as a lone getter, if final and not late final-without-initializer) or as a concrete variable.

As I mentioned before, this is really about being unimplemented by the given declaration, not about being an abstract instance member (in particular, it should be supported for top-level variables and static variables as well). However, it seems reasonable to use the modifier abstract for this purpose because it already has the desired meaning when applied to an instance member.

Given that we'd presumably have a space leak if we omit abstract:

// Components                    // Get Set Storage Initializing expression
int x;                           //
augment get x => 42 + augmented; // M
augment set x(_) {}              //     M

... I don't see how abstract could be considered superfluous. Space leaks do matter in some situations.

If you want a backing store, don't declare it as abstract.

Right, but in this particular case the backing store is a space leak, so we do not actually want it.

Even without macros, there are weird cases here, consider the following:

abstract int x;
augment int get x => 1;
augment int x = 2;

In this case we would have

// Get Set Storage Initializing expression
//                                         The introductory declaration contributes nothing.
// M                                       1st augmentation provides a manually written getter.
// G   G   *       *                       2nd augmentation fills in all slots, ignores M.

I'd recommend that we compile this with no errors, but emit a warning that the manually written getter is ignored (this would be a 'dead code' warning that applies to the entire declaration).

There's nothing inherently wrong with language features that may be combined to have properties like 'dead code', but it makes sense to help developers to avoid such combinations.

On the other hand, it doesn't really make sense to do anything else: We could say something weird like "a concrete variable declaration contributes a getter, a setter (iff not final yadda-yadda), an initializing expression iff syntactically present, and a storage requirement, except that it doesn't contribute a getter if it will augment (that is, 'it will eliminate') a manually written getter, and similarly for the setter, if it exists; if a getter or setter has been omitted based on the latter rule then an implicitly induced getter/setter is implicitly added to the introductory declaration".

I don't know what it would take to make the above declaration do what it is intended to do, if anything. I'd prefer to have the simple rule ("if we contribute a getter in the stack then we contribute a getter, period"), and then we can warn that the manually written getter is dead.

We also don't need the pure getter/setter vs field symmetry when it comes to augmentations, because augmentations only apply to declarations you own.

You don't think it could be useful to

To me that seems equally useful as being able to do the same thing with overriding:

abstract class A { abstract int x; }
class B extends A { int x; B(this.x); }

There is no concern about something from a superclass changing from a field to a getter/setter, and that being a breaking change. Since you can only augment things you declare, you are always in control.

True, the entire augmentation stack is in the same library. But if parts of it is created by a macro then we are not in control after all. So if you want to allow the macro to make choices like this then you may need to be able to specify the property in a way which is similarly flexible as the class A does above.

Admitted, we should ideally say unimplemented rather than abstract, but we probably can't pay such a high price syntactically in order to make that distinction explicit.

jakemac53 commented 3 months ago

To me that seems equally useful as being able to do the same thing with overriding:

abstract class A { abstract int x; }
class B extends A { int x; B(this.x); }

That is useful because it allows subtypes to have different implementations. If B was the only implementation, and defined in the same library, I would ask why x is declared as abstract. It doesn't really make sense imo and I would comment on that PR.

In the case of augmentations, this is always the situation. There is only the one true implementation that is being provided (possibly with many layers, but it is all there as a part of the same library).

Consider this alternate situation:

abstract class  A {
  @SomeMacro() // adds `augment int x = 1;`
  abstract int x;
}

If I am now an author who wants to extend A, how am I supposed to know that I am not actually supposed to implement x? I will end up doing something like:

class B extends A {
  int x = 2;
}

And now there are two fields named x...

I do think abstract as declared is not a property that we should allow you to erase via augmentation, at least when it is explicit like that? You are very explicitly asking for no backing store.

lrhn commented 3 months ago

It doesn't really matter whatever you implement abstract int x as

  augment int x = 42; 

or

  int _x = 42;
  augment get x => _x;
  augment set x(Int value) => _ x = x;

If a subclass thinks it should override x because the declaration looks abstract, it will behave the same in either case. The abstract keyword or presence of storage is a red herring. The confusion comes from the original declaration looking abstract, but getting an implementation, any implementation, from augmentations.

If we should change anything, then it is to mark declarations that will get an implementation differently from those intended to be abstract. A syntax for non-abstract declarations that have no implementation... yet.

Then you also cannot add implementation to an actual abstract declaration.

Maybe we should have an empty body market.

int x = ?;
void foo() {?}
eernstg commented 3 months ago

The confusion comes from the original declaration looking abstract, but getting an implementation, any implementation, from augmentations.

That is equally true for this case:

abstract class A {
  int get x;
}

augment abstract class A {
  augment get x => 1;
}

(I made the introductory declaration of A abstract such that it isn't locally evident that the getter must be implemented by an augmentation.)

We could make that an error, but I haven't heard anyone suggesting that it should be. We should at least be consistent.

Also, I don't see a need to require

  int _x = 42;
  augment get x => _x;
  augment set x(Int value) => _ x = x;

when

  augment int x = 42;

will do just fine.

Maybe we should have an empty body [marker].

That would certainly make sense! I suspect that several things would be less confusing if we do that. It would also be a bigger change to the grammar (I think?), but it might very well be worth it.

lrhn commented 3 months ago

We should be consistent.

I see no reason to distinguish between abstract final int x; and int get x;. If we want one of them to opt-in to getting a concrete implementation, both should. We may have to find two syntaxes for that.

If we allow:

abstract int x;
augment int x = 42;

then we should also allow

int get x;
set x(int _);
augment int x = 42;

because it is the same thing.

And we can decide that you cannot change the abstract-ness of the original declaration. That also means;

int get x;
augment int get x => 42;

is invalid.

But not everything has to be augmentations. Maybe we can allow you to do:

abstract class C {
  int get x;
}
augment abstract class C {
  int get x => 42;
}

That's not a augment of the x getter, it's an augment of the C class which adds a concrete x getter where there was none before. (There was an abstract x getter, but that's just interface, which this one is implementing.)

We don't allow that today, no two non-augmenting declarations with the same name, but we could.