dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.09k stars 1.56k forks source link

[Augmentations] No error in the analyzer when augmenting enum value whose name is `augmented` #56143

Open sgrekhov opened 2 months ago

sgrekhov commented 2 months ago
enum E {
  augmented;
}

augment enum E {
  augment augmented; // No expected error in the analyzer
//        ^^^^^^^^^
// [analyzer] unspecified
// [cfe] unspecified
}

See for example https://github.com/dart-lang/co19/blob/master/LanguageFeatures/Augmentation-libraries/augmented_expression_A06_t01_lib.dart

lrhn commented 2 months ago

Should there be an error?

The augmented identifier is only reserved inside the function body/expression of an augmenting declaration. That's the {...}, => ... body of a function, setter or getter, or the = ... initializer of a variable.

Anything outside of that should not treat the identifier as special.

I can see the spec says:

  • Augmenting enum values: When augmenting an enum value, augmented has no meaning and is not allowed.

That's misleading and should be fixed. The identifier augmented should just not not be reserved anwwhere in the syntax for augmenting an enum value, so it's just an identifier that means what the identifier otherwise means. There is no "code body" being replaced, so there is no code that can refer to a prior code body.

It should just be:

  • Augmenting enum values: When augmenting an enum value, augmented has no special meaning.

(It is interesting that void foo({Object? augmented = null}) {} is a declaration which cannot be augmented with cod that accesses that parameter. You can write:

augment void foo({Object? augmented = null}) {
  ... cannot access parameter here! ...
}

so you can protect one parameter from being accessed by augmentations. Hackortunity!)

eernstg commented 2 months ago

The error would be based on this rule from the augmentation specification:

A compile-time error occurs if a declaration with the basename augmented occurs in a location where the outermost enclosing declaration is augmenting.

In this rule, we would consider an <enumEntry> to be a declaration whose name is the specified identifier (the one before the period, if there is one), in this case augmented. I think that's reasonable. For instance, the meaning of an enum value is specified in terms of a transformation that gives rise to a static const variable with that name, which is definitely a declaration.

The rationale is that augmented should always have the special meaning that's a built-in property of augmentations when it occurs inside anything that begins with augment, it should not be a regular identifier. So we make it an error to declare anything named augmented because that's an implicit statement along the lines of "augmented is a regular identifier, and here is what it means!".

Similarly, a formal parameter (named or not) would also be a declaration, so we couldn't use the name augmented for that inside an augmenting declaration, either.

It might be possible to change the specification such that some of these occurrences of augmented are allowed, but it's not obvious to me that it would be an improvement.

lrhn commented 2 months ago

A compile-time error occurs if a declaration with the basename augmented occurs in a location where the outermost enclosing declaration is augmenting.

Sounds that be "innermost", not "outermost"? Or something more precise?

Otherwise

class C {
  void augmented() {}
  augment void augmented() {
    print("Yes we can");
  }
}

It's valid because the outermost declaration is C, which is not augmenting.

If this is intended to prevent

augment foo() {
  void augmented() => "helper";
  ...
}

that is, a declaration named augmented inside the body of an augmenting declaration, then it's not needed. Just making augmented a reserved word will do that.

I can't tell what the paragraph is trying to achieve, so I can't tell if it's worded as it should.

I would remove the rule, though, if we can't say what the benefit of it is

eernstg commented 2 months ago

Sounds that be "innermost", not "outermost"?

No, augmented was specified essentially as being a reserved word in a limited syntactic context, namely anywhere in any declaration with the modifier augment.

It is sufficient to check the outermost declaration because no declaration can be augmenting unless it is enclosed by another augmenting declaration, or it is itself outermost.

The augmentation specification never mentions the possibility that a non-augmenting type-introducing declaration can have an augmenting member declaration (which would necessarily come after a non-augmenting declaration of the same member in the same class body). That's probably an oversight. We might want to make it an error, or support it. If we do support it then it is necessary to check all enclosing declarations, and augmented would then denote the augmentation related feature if any enclosing declaration is augmenting.

The stated rule was introduced in order to maintain a reasonable level of readability, avoiding the tricky name binding structures that we would otherwise have in examples like the following:

void augmented() => print('Hello!');

class A {
  void foo() {}
}

augment class A {
  augment void foo() {
    augmented(() => augmented());

    void local() => augmented();
    augmented(local);
  }

  void bar() {
    augmented(() => augmented());
  }
}

The first occurrence of augmented in the body of augment foo should definitely be an invocation of the original declaration of foo in class A. However, a function literal serves as a declaration as well as an expression, so the second occurrence of augmented should invoke the top-level function. Similarly if () => augmented() is replaced by local where local is a local function that calls augmented().

The current specification says that augmented denotes the augmentation related feature everywhere in an augmenting declaration, even inside a nested declaration, which means that said second occurrence of augmented invokes the foo in class A. This seems both useful and unsurprising.

What's the workaround, in the two cases? Assuming that augmented has the augmentation related semantics only immediately inside the augmenting declaration, we can't write an abstraction (a local function or a function literal) that performs the invocation. We could use a late variable with an initializer, but that obviously only yields the correct semantics if it is guaranteed that it is evaluated at most once. (Also, this initializer might count as a nested declaration body, in which case we can't even abstract over augmented in that way.)

augment class A {
  augment void foo() {
    late helper1 = augmented();
    augmented(() => helper1; // Let's hope it is called at most once.

    late helper2 = augmented();
    void local() => helper2;
    augmented(local);
  }
  ...
}

I think it's fair to say that there is no workaround. It's much easier if augmented has the augmentation related semantics, and we want to refer to a top-level declaration named augmented:

void augmented() => print('Hello!');
const renamedAugmented = augmented;

augment class A {
  augment void foo() {
    augmented(() => renamedAugmented());

    void local() => renamedAugmented();
    augmented(local);
  }
  ...
}

In general, I'd like to suggest to developers that they should not have any declarations whose name is augmented. I don't think it's going to be a problem to do that, I see (except for code that is part of the implementation of the augmentation feature itself) just one occurrence in millions of lines of internal code where augmented is a declared name. That's a function literal (augmented) => augmented... which could just as well have been (anyName) => anyName....

In summary, it seems reasonable to generalize the feature to support augmenting member declarations immediately in the same class/mixin/extension/... body that also has the original declaration. But i do not think it's going to help anyone to change the rule such that augmented can be looked up in enclosing scopes from inside an augmenting declaration.

lrhn commented 2 months ago

I still prefer to only make augmented a reserved word inside the body of an augmenting method or variable initializer, which is where the augmented expression can occur, and not outside of that. The same way await and yield were only made reserved words inside an async or similar body.

I see no reason to reserve the word inside a non-augmenting method inside an augmenting class declaration. That will make the first A.bar above invalid, removing the workaround of calling an outside augmented-named member through a helper function.

eernstg commented 2 months ago

I still prefer to only make augmented a reserved word inside the body of an augmenting method or variable initializer

That's definitely a well-defined choice. However, I still think the following is very likely to be misread:

augment class A {
  void foo() => augmented();
}

// ... somewhere ...
void augmented() {}

Note that we're free to allow this usage later on if we make it an error now. If we do the opposite then it would be a breaking change to turn that occurrence of augmented into a reserved word (and, hence, an error).

await and yield were only made reserved words inside an async or similar body.

That's a little bit different because await and yield are concerned with the immediately enclosing function declaration whereas augmented should have the augmentation related semantics even in the case where it occurs more deeply nested.

For readability, I think it would be helpful to say "look at the enclosing declarations; if at least one of them has augment then augmented is the reserved word, and if none of them has it then augmented is a normal identifier".

lrhn commented 2 months ago

That example may be misread, but so would:

class A {
  // fifty other members here.
  void foo() => augmented();
}

// ... somewhere ...
void augmented() {}

It's not a bout whether the surrounding class is augmenting or not, it's about using a word that has a special meaning in some contexts outside of such a context.

Making it a globally reserved word is the best solution for readability. And worst for breakage.

We can choose to make augmented a reserved in all possible new contexts, any code that couldn't exist before the augmentation feature, to make it as close to globally reserved as we can without breaking old code.

That's not necessarily better than a more focused reservation, though. As the example here shows, it makes

class A {
  void foo() => augmented();
}

and

augment class A {
  void foo() => augmented();
}

behave differently, even though the foo method has nothing to do with augmentation.

I think making it a reserved word inside the bodies of augmenting declarations, where the word can actually have its new meaning, is easier to remember and recognize, because the keyword that toggles the meaning is as close by as possible. You can write a non-augmenting method the same way, whether you write it inside an augmenting or non-augmenting class declaration. If you write an augmenting member declaration, then the restriction on augmented applies, because that's where the word actually means something new. (Should consider whether it applies to the parameter list too. I'd say "no", otherwise you can't augment a method with a named parameter named augmented. You still can't access it inside the augmenting declaration. We should embrace publification of private names, so you can call it _augmented then, and have a public parameter name that is augmented and a local variable name which is _augmented.)

Another option is a completely different and previosuly unsupported syntax. I've suggested foo.super when augmenting foo, but that doesn't work well of operators. Or just super, and then you have to write ClassName.super.foo to access the super-class foo member. That's probably confusing when normal super can be used for anything, but augmented super can only be used for one member, and is called in a different way. So no great options here. #all_the_good_syntaxes_are_taken!)