For extension type constructor augmentation, relax the rule about initializing formals?

eernstg commented 3 weeks ago

Thanks to @sgrekhov for bringing up this topic! Consider an augmenting declaration in an extension type declaration:

extension type A(int b) {
  augment A(int b) {
    assert(b > 0);
  }
}

This example is used in the augmentation feature specification, and it is implied that it is correct. However, the existing proposal about primary constructors is used in the reasoning about this example, and it doesn't quite match.

Primary constructors are not yet part of Dart, but it is expected that they will be added, and also that the <representationDeclaration> of an extension type declaration will become part of a primary constructor declaration (that is, (int b) in the example above will become a primary constructor, and in other cases there will be some extra elements, e.g., const before the name, hence 'part of').

As a primary constructor, the declaration above has the following meaning:

extension type A {
  final int b;
  A(this.b);
  augment A(int b) {
    assert(b > 0);
  }
}

This conflicts with the general rule about augmentations that a formal parameter of a constructor must consistently be an initializing formal in the introductory declaration and in each augmenting declaration, or it must consistently not be an initializing formal. (That is, we can't go back and forth, we have to choose either int b or this.b, and stick with that.)

The example could just be changed to use the form which isn't an error according to this reasoning:

extension type A(int b) {
  augment A(this.b) {
    assert(b > 0);
  }
}

// ... which means:
extension type A {
  final int b;
  A(this.b);
  augment A(this.b) {
    assert(b > 0);
  }
}

However, this could also be a good opportunity to discuss the rules about constructor parameters (in general, not just for extension types): We could allow a stack of augmenting declarations of a constructor to use an initializing formal (like this.b or int this.b or {required this.b}, etc.) or a regular formal parameter (e.g., int b or {required int b}) interchangeably.

This would allow us to use the first version of the example, which is arguably more readable.

// Then we'd allow the following:
extension type A(int b) {
  augment A(int b) {
    assert(b > 0);
  }
}

// ... which means:
extension type A {
  final int b;
  A(this.b);
  augment A(int b) { // OK, just say `this.b` once in an augmentation stack.
    assert(b > 0);
  }
}

This would immediately generalize to classes, enums, etc: The fact that a given formal parameter is an initializing formal must be stated at least once in an augmentation chain, the remaining declarations need not be initializing.

The status of being an initializing formal in the definition (the result after merging) of the given constructor would then be determined by the existence of an initializing formal: If the augmentation chain contains at least one initializing formal for that parameter, then it is initializing; otherwise it is a regular formal parameter.
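As a rough illustration of that merging rule, here is a minimal sketch in plain Dart. It only models the rule; it does not use the augmentation feature itself, and the function name isInitializingInMergedDefinition is made up for this example.

```dart
// Hypothetical model: each list entry says whether one declaration in the
// augmentation stack wrote the parameter as an initializing formal.
bool isInitializingInMergedDefinition(List<bool> declaredAsInitializing) =>
    declaredAsInitializing.any((isInitializing) => isInitializing);

void main() {
  // Introductory `A(this.b)`, augmented by `augment A(int b)`.
  print(isInitializingInMergedDefinition([true, false])); // true

  // Every declaration in the stack uses `int b`.
  print(isInitializingInMergedDefinition([false, false])); // false
}
```

Under this reading, the order of the declarations in the stack does not matter, only whether some declaration says this.b.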

An extension type with a <representationDeclaration> would always implicitly introduce a constructor with an initializing formal, which means that we would also have this semantics for extension types (which is necessary because we must initialize the representation variable of an extension type).

In any case, the status of being an initializing formal is a piece of implementation, not an API property, and this seems to justify that each declaration in a stack of augmenting declarations can be allowed to specify or omit that particular implementation property.

@dart-lang/language-team, WDYT?

lrhn commented 3 weeks ago

I agree on the conclusion. :+1:

As a primary constructor, the declaration above has the following meaning:

... and then follows a desugaring. Don't desugar! :wink:

A primary constructor declaration introduces a constructor definition (to use my new terminology) into the class definition of the class declaration. And it introduces a variable definition, here named b with type int, marked final and with no initializer expression.

That constructor definition definitely has a positional parameter typed int, and executing the constructor definitely initializes the storage location for the variable named b of the current object to the value of that argument.

That's all we have to say, and should say. The positional parameter does not have to have a name. It does not have to be an initializing formal. Another valid desugaring is A(int b) : b = b;, and there is no reason to consider one implementation of an implicit constructor more correct than the other, when there is no source code for it. I would choose to say as little as possible. The parameter is positional and has type int. It has no (known) name, and it can/should be treated as having no name. It's not known to be an initializing formal.

And therefore it should never matter outside of the declaration itself whether a parameter is an initializing formal or not.

There should be no visible difference between C(this.x) and C(int x) : this.x = x;. If there is, we're leaking implementation details. What is important is that the constructor initializes x. That it must remember.
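To make the "no visible difference" point concrete, here is a small sketch that runs in current Dart without augmentations. The class names WithInitializingFormal and WithInitializerList are invented for the example.

```dart
// Two hypothetical classes that differ only in how the constructor
// initializes `x`: an initializing formal vs. an initializer list entry.
class WithInitializingFormal {
  final int x;
  WithInitializingFormal(this.x);
}

class WithInitializerList {
  final int x;
  WithInitializerList(int x) : this.x = x;
}

void main() {
  final a = WithInitializingFormal(42);
  final b = WithInitializerList(42);
  print(a.x == b.x); // true: both constructors stored the argument in `x`.

  // The constructor tear-offs take the same parameter list, (int), so a
  // caller cannot tell which form was used.
  final WithInitializingFormal Function(int) makeA = WithInitializingFormal.new;
  final WithInitializerList Function(int) makeB = WithInitializerList.new;
  print(makeA(1).x == makeB(1).x); // true
}
```

From the call site, both constructors accept one positional int and leave x initialized; how that happens stays inside the declaration.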

So for extension types, I'd say that an extension type EName(RType rid) ... declaration has the following effect:

It introduces an extension type definition with name EName and representation type RType (and any superinterfaces declared in the ...).
It introduces into this extension type definition:
- A non-redirecting generative extension-type constructor definition with name EName/EName.new, a single positional parameter with type RType and no name, which is known to initialize the representation value. When executed to initialize a value of type EName with an argument value o, it will initialize the representation value to o.
- An extension type instance getter definition with name rid and return type RType which when invoked with a this bound of o will return the value o.
- (And then it introduces definitions for all the non-augmenting declarations of the body, and updates existing definitions for all augmenting declarations in the body.)

This phrases the execution of a generative extension-type constructor as something you do to initialize a value, and the effect of that constructor is to initialize that value to something of the representation type. We can also call it "initializing the this variable", like filling a storage location the way non-redirecting generative non-extension constructors do for their fields. An initializing formal or initializer list entry of a non-redirecting generative extension-type constructor has to be for the representation identifier, and executing either will initialize the value for the constructor invocation.

You can augment the implicit constructor definition the same as any other constructor definition. (Or we can say that you can't, and you have to add another constructor if you want to augment. I always declare extension types as extension type const Name._(RType _) anyway, and add public constructors explicitly if I want them.)

The only problem here is, as @eernstg says, that the current specification says that you have to repeat initializing formal parameters in augmenting constructors. Maybe we should just not do that. Each declared initializing formal does initialize a field, and you can't initialize the same field twice. Being an initializing formal is an implementation detail.

Augmenting declarations can (and must) declare the parameter as a normal parameter, and access the value normally. They cannot declare it as an initializing formal with the same name, because that would be initializing the same field twice.

And then you should be able to write:

extension type E._(R _) {
  // Implicit: 
  // E._(R _) : <initialize representation value to first argument>

  E(R x) { print("first: $x"); }  // Does not initialize representation value.
  augment E(this._) : assert(true) { print("last"); } // Does now!
}

The fully augmented E constructor definition is:

A non-redirecting generative extension-type constructor definition with name E/E.new (whichever is considered canonical)
parameter list signature (R x)
that is known to initialize the representation value,
which, when executed with argument list A to initialize a value with no existing initialized value, behaves as follows (as defined by applying the augmenting declaration to the prior augmented definition D):
- Perform "bind arguments to formals" for the parameter list (this._) with argument list A, giving parameter scope P and initializer list scope I.
  - This initializes the extension type value to that argument value v.
- Execute initializer list : assert(true) in initializer list scope I.
  - Success.
- Execute the definition D with argument list A with initialized value v.
  - Binds (R x) to A giving parameter scope P2 and initializer list scope I2.
  - Has no initializer list.
  - Is not augmenting, so no prior augmented definition to execute.
  - Is not a class, so no superclass constructor to call (but otherwise it would be here).
  - Executes body {print("first: $x");} in parameter scope P2 with this bound to v.
  - Completes as having initialized value to v.
- Comes back and executes {print("last");} in parameter scope P.
- Completes as having initialized value to v.

More formally:

An augmenting non-redirecting generative constructor declaration must not declare an initializing formal parameter or initializer list entry which initializes a variable (or extension type representation value) that is already initialized by the augmented definition. It may, and must, declare normal parameters for each parameter position of the augmented definition, with the same type. An augmenting declaration may declare a positional parameter with a different name (or no name, using _) than the name of the corresponding parameter in the augmented definition. This new name will be the local variable name for the parameter's argument value in the parameter scope of the augmenting declaration (or in the initializer list scope, if the parameter is an initializing formal). The resulting definition from applying the augmentation will have the same parameter names as the augmented definition (and therefore as the original base declaration). The resulting definition initializes all values that the augmented definition initializes, and all variables that the augmenting declaration has an initializing formal or initializer list entry for.

So a non-redirecting generative constructor definition remembers which variables it has initialized, an augmentation cannot initialize them again, but it can initialize other uninitialized variables. Each augmenting constructor has its own parameter list, with its own local names for positional parameters if it wants to.
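The bookkeeping described here can be sketched in plain Dart as a small model. This is only an illustration of the proposed rule, not how any compiler implements it; ConstructorDefinition and augmentWith are hypothetical names, and parameter lists are left out entirely.

```dart
// Hypothetical model of a constructor definition: it only tracks the set of
// variables the definition is known to initialize.
class ConstructorDefinition {
  final Set<String> initialized;
  const ConstructorDefinition(this.initialized);

  // Applies an augmenting declaration that initializes the variables in
  // `added`. Throws if one of them is already initialized by this definition.
  ConstructorDefinition augmentWith(Set<String> added) {
    final repeated = initialized.intersection(added);
    if (repeated.isNotEmpty) {
      throw StateError('Already initialized: ${repeated.join(', ')}');
    }
    return ConstructorDefinition({...initialized, ...added});
  }
}

void main() {
  // The base declaration initializes `value`.
  final base = ConstructorDefinition({'value'});

  // An augmentation may initialize another, still uninitialized variable.
  final merged = base.augmentWith({'initialValue'});
  print(merged.initialized); // {value, initialValue}

  // Initializing `value` again is rejected.
  try {
    merged.augmentWith({'value'});
  } on StateError catch (e) {
    print(e.message); // Already initialized: value
  }
}
```

The model treats the merged definition as the union of what each declaration initializes, with overlap as the only error.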

And you can, even if you usually shouldn't, use the same positional parameter to initialize more than one field:

@Reset()
class Box<T> {
  T value;
  Box(this.value);
}
// Added by @Reset macro, to reset all mutable fields to their initial value.
augment class Box<T> {
  final T initialValue;
  augment Box(this.initialValue);
  void reset() { value = initialValue; }
}

There is no problem with this declaration, it's indistinguishable from:

class Box<T> {
  T value;
  Box(T value) : value = value;
}
augment class Box<T> {
  final T initialValue;
  augment Box(T value) : initialValue = value;
  void reset() { value = initialValue; }
}

as it should be. Both declarations initialize both variables, so the class is valid. Individual parameter lists are treated individually.

I have no idea how super-parameters and augmentations should interact, though. I guess earlier declarations add super-parameters first, and later augmenting declarations append to that, and you don't repeat the super-parameters. (And you can't change the super-constructor of the initial declaration, unless perhaps it has none, and no super-parameters.)

jakemac53 commented 2 weeks ago

I agree this makes sense - and I think actually today the code that produces augmentation libraries would never write this.<param>, because it doesn't have access to whether it was a super parameter or initializing formal parameter etc, it can only see the type and name.

And you can, even if you usually shouldn't, use the same positional parameter to initialize more than one field:

You can't do this because it changes the name of the parameter - and even for positional parameters we do not allow that.

lrhn commented 2 weeks ago

Unless we do allow changing names. I see no strong reason not to. It's just local variables for the augmenting constructor itself.

jakemac53 commented 2 weeks ago

Unless we do allow changing names. I see no strong reason not to. It's just local variables for the augmenting constructor itself.

I do not see any value in it, I would rather just not allow it. And it makes it potentially harder to allow passing any argument by name in the future.

dart-lang / language

For extension type constructor augmentation, relax the rule about initializing formals? #4047