dart-lang / language

Design of the Dart language
Other
2.66k stars 204 forks source link

Type promotion for `this` #1397

Open eernstg opened 3 years ago

eernstg commented 3 years ago

The reserved word this, when defined, denotes the current instance of the enclosing class, and hence it is known that if this is T yields true then it is sound to promote the static type of this as if this had been a local variable, and similarly for all other situations where the local variable would be promoted.

The promotion of this implies that this gets a different static type (when it is written explicitly, and when it is an implicit part of an instance member access).

class A {
  int a;
  void m() {
    if (this is B) {
      B bb = this; // OK because `this` has type `B`.
      a = b; // OK because it means `this.a = this.b;`.
    }
  }
}

class B extends A {
  int b;
}

However, it is also possible to learn more about type arguments:

class A<X> {
  X x;
  A(this.x);
  void m() {
    if (this is A<int>) {
      x.isEven;
      Iterable<num> xs = [x];
    }
  }
}

class Pair<X, Y> {
  X x; Y y;
  Pair(this.x, this.y);
  void m() {
    if (this is Pair<Y, X>) {
      var tmp = x;
      x = y;
      y = tmp;
    }
  }
}

For simplicity, we could make the choice to support promotion of this with no changes to the treatment of the type parameters of the enclosing class, or with support as an option that we may pursue in the future.

Presumably, it is a nearly non-breaking change to add support for promotion of type parameters, because this only adds more information about said type parameters, it doesn't invalidate existing information. (The change is not completely non-breaking because any change of a static type may give rise to inference of type arguments with different values, and they are reified so the change is observable.)

mateusfccp commented 3 years ago

Please, I struggle with this every time I have to implement a sum type.

eernstg commented 3 years ago

@mateusfccp, is it correct to assume that you need this to have the promoted type, but you do not need the derived promotions of type parameters?

mateusfccp commented 3 years ago

Sorry, @eernstg, I couldn't exactly understand what you meant.

Usually when I make a sumtype I have to make a fold function, where I have to check for each subtype. However, I have to use a placeholder for this, as this won't be promoted.

T fold<T>(...) {
  final _this = this;
  if (_this is A) {
     // ... do something with promoted _this
  } else if (_this is B) {
     // ... do something with promoted _this
  }
}

Alternatively, I use this in the comparisons and cast inside the block, similar to what you did in the first example.

eernstg commented 3 years ago

@mateusfccp, the simplest approach to promotion of this would give you exactly the type for this that you currently get with _this, with no connection to type parameters.

munificent commented 3 years ago

I absolutely love this:

class A {
  void m() {
    if (this is B) {
      B b = this; // OK because `this` has type `B`.
    }
  }
}

class B {}

But I think this is very much a bridge too far:

class A {
  void m() {
    if (this is B) {
      b = 1; // "b" is now in scope because of the promoted implicit `this`.
    }
  }
}

class B extends A {
  int b;
}

I don't think type promotion should interact with the scoping of bare identifiers like this. Say you have a program like:

var name = "top level";

class A {
  m() {
    print(name);
  }
}

class B extends A {
  var name = "in B";
}

main() {
  A().m();
}

This prints "top level". If we used the promoted this for resolving bare identifiers and you change the m() to:

  m() {
    if (this is B) print(name);
  }

Now name resolves to this.name and it prints "in B". That feels pretty surprising to me.

In other words, I think it makes sense to promote the identifier this as if it were a local variable, but not the entire notion of the "current receiver".

eernstg commented 3 years ago

@munificent wrote:

... a bridge too far ...

to let a plain identifier id resolve as this.id where id is only present in the interface of this because of the promotion, and continued

This prints "top level". If we used the promoted this for resolving bare identifiers and you change the m() to:

m() {
  if (this is B) print(name);
}

We actually wouldn't ever re-interpret an identifier like name in this manner: The lexical scope always wins, and the ability to consider id as an abbreviated notation for this.id only kicks in when there is no declaration named id in scope at all.

So we maintain the sanity property that we have today: this will never capture a name which is visible in an enclosing scope.

But it would of course still be possible to use an explicit this, and I suppose that's less controversial:

m() {
  if (this is B) print(this.name); // Prints "in B".
}
lrhn commented 2 years ago

(Actually, doing if (this is B) print(this.name); will likely get you yelled at by the unnecessary_this lint. Not one of my favorite lints.)

One of the places where I most often want to promote this is inside extension methods, which is also a place where this behaves more like a local variable anyway. I think we would be better off treating this as a (final, implicit) variable in every way, like we already allow it in interpolations, "$this", even if it's not actually an identifier.

stereotype441 commented 1 year ago

FYI, I'm considering adding support for this promotion to language feature inference-update-2, which currently adds support for promotion of private final fields (https://github.com/dart-lang/language/issues/2020).

eernstg commented 1 year ago

@stereotype441, what's your take on the indirect consequences of promoting this? We have some direct consequences which are exactly the same as what we would have by (1) declaring final self = this;, (2) promoting self as usual, and (3) replacing every explicit this by self in the scope of self.

One consequence is that T t = this; may now succeed where it would otherwise have been a compile-time error, because the static type of this is assignable to T because of the promotion. (With self, we'd have T t = self;, and self would have been promoted just like any other local variable.)

However, we also have rules like this one that give rise to indirect consequences:

When the lexical lookup of id yields nothing, i is treated as the ordinary method invocation this.i.

In this rule i is an 'unqualified invocation' of id, that is, something like id<T1, T2>(some, args). There are similar rules for other syntactic constructs where lexical lookup is used.

The important part is that this rule introduces this implicitly, based on the failure to find a declaration named id, or the fact that id is declared in an enclosing scope, but it is an instance member of the enclosing class. This implies that we'd accept the following:

class A {
  m() {
    if (this is B) print(name);
  }
}

class B extends A {
  var name = "in B";
}

The reason why this is allowed is not that we're introduced some funky scope rules, it is simply that (1) print(name) becomes print(this.name) because there is no name in scope, and (2) this has type B, and the interface of B contains a member named name.

Another interesting situation arises because a promotion of this makes one or more extensions applicable. This could enable the invocation of an extension member that wasn't otherwise applicable, or it could introduce a conflict among a set of multiple extension members where we had just one before the promotion. Again, we could see this phenomenon with an explicit this, and it would be very similar to a promotion of a local variable; but we could also see it with an implicitly introduced this, in which case it might be considered less comprehensible.

I also mentioned the consequences for type variables: In a class C<X> ..., if this is promoted to C<int> then we could conclude that X <: int and allow List<int> xs = <X>[];, etc. Presumably, we could simply leave the type variables unchanged even in the case where a promotion of this has implications for their value.

lrhn commented 1 year ago

I'd be fine with an unqualified name being looked up in the surrounding class, not the static type of this. I think that's what's already specified, it just didn't matter before. We're resolving identifiers, that should happen before type promotion.

It means that if the surrounding class has no member, but the promoted this type does, then a plain foo is an error, but this.foo works. I'm OK with that, but it may be confusing to some users which do not distinguish the two syntaxes.

For extension declarations, I'd do the same. An unqualified identifier which resolves to a surrounding instance member declaration, is involved on the same extension as the current invocation. That means the same extension, same type arguments and the same receiver. It does not use this for anything (other than possibly referencing the receiver). That is:

extension E<T extends num> on T {
  T max(T other) {
    if (this is double) {
      if (this.isNaN) return this;
      if (other is! double) return max(other.toDouble());
    } else if (other is double) {
      if (other.isNaN) return other;
      return (this.toDouble() as T).max(other);
    }
    return this < other ? other : this;
  }
}

Here we promote this to double, then call the unqualified max in scope. That's specified to be equivalent to

  E<T>(this).max(...)

and not to this.max(...).

Again I think that's fine, and we should just keep it

eernstg commented 1 year ago

I think that's what's already specified, it just didn't matter before.

That is not quite true, as I mentioned here. But it is true that it did not matter before.

We're resolving identifiers, that should happen before type promotion.

Arguably, we aren't resolving plain identifiers (or operators) that end up denoting instance members that aren't declared in a lexically enclosing scope. We are detecting that those identifiers/operators are not in scope, and then we transform the construct to make it an explicit invocation on this (actually also when they are in scope), and then we proceed to run static analysis on that. This might yield an instance member invocation, or an extension method invocation, or a compile-time error.

If we wish to stop specifying that m() means this.m() then we'd need some other specification which ensures that the invocation m() resolves to a declaration that isn't lexically in scope (it may or may not be), and at run time it will have a suitable binding of this.

Surely it's doable, but it's not a no-op.

stereotype441 commented 1 year ago

@eernstg asked:

@stereotype441, what's your take on the indirect consequences of promoting this? We have some direct consequences which are exactly the same as what we would have by (1) declaring final self = this;, (2) promoting self as usual, and (3) replacing every explicit this by self in the scope of self.

One consequence is that T t = this; may now succeed where it would otherwise have been a compile-time error, because the static type of this is assignable to T because of the promotion. (With self, we'd have T t = self;, and self would have been promoted just like any other local variable.)

This one seems fine to me.

However, we also have rules like this one that give rise to indirect consequences:

When the lexical lookup of id yields nothing, i is treated as the ordinary method invocation this.i.

In this rule i is an 'unqualified invocation' of id, that is, something like id<T1, T2>(some, args). There are similar rules for other syntactic constructs where lexical lookup is used.

The important part is that this rule introduces this implicitly, based on the failure to find a declaration named id, or the fact that id is declared in an enclosing scope, but it is an instance member of the enclosing class. This implies that we'd accept the following:

class A {
  m() {
    if (this is B) print(name);
  }
}

class B extends A {
  var name = "in B";
}

The reason why this is allowed is not that we're introduced some funky scope rules, it is simply that (1) print(name) becomes print(this.name) because there is no name in scope, and (2) this has type B, and the interface of B contains a member named name.

IMHO this would be reasonable.

Another interesting situation arises because a promotion of this makes one or more extensions applicable. This could enable the invocation of an extension member that wasn't otherwise applicable, or it could introduce a conflict among a set of multiple extension members where we had just one before the promotion. Again, we could see this phenomenon with an explicit this, and it would be very similar to a promotion of a local variable; but we could also see it with an implicitly introduced this, in which case it might be considered less comprehensible.

I'm ok with promotion of this making an extension applicable, and I'm even ok with promotion of this leading to a conflict among extension members. In my mind it's better to make implicit and explicit this. behave as similarly as possible, even if this leads to an error in this rare circumstance.

I also mentioned the consequences for type variables: In a class C<X> ..., if this is promoted to C<int> then we could conclude that X <: int and allow List<int> xs = <X>[];, etc. Presumably, we could simply leave the type variables unchanged even in the case where a promotion of this has implications for their value.

I would love for if (this is C<int>) { List<int> xs = <X>[]; } to work someday. But I suspect that the situation arises pretty rarely, so for pragmatic reasons I don't want to try to add support for it now. The analyzer and CFE don't track type variables in the same way they track local variables, and they don't currently have any hooks to allow flow analysis to modify their semantics when this is promoted. So IMHO, making this work would be a lot of extra effort for only marginal benefit.

@lrhn said:

I'd be fine with an unqualified name being looked up in the surrounding class, not the static type of this. I think that's what's already specified, it just didn't matter before. We're resolving identifiers, that should happen before type promotion.

That doesn't match my memory of what's currently specified. My understanding is that an unqualified name is first looked up in the lexical scope. Then:

For example:

var x3 = 'top level x3';
class B {
  get x3 => 'B.x3';
  get x4 => 'B.x4';
}
class C extends B {
  get x1 => 'C.x1';
  get x2 => 'C.x2';
  f() {
    var x1 = 'local variable x1';
    print(x1); // prints 'local variable x1', due to lexical lookup
    print(x2); // lexical lookup finds prints 'C.x2'; this is virtually dispatched, so it prints 'D.x2'
    print(x3); // prints 'top level x3', due to lexical lookup
    print(x4); // lexical lookup fails, hence equivalent to `print(this.x4);`, hence prints 'B.x4'
  }
}
class D extends C {
  get x2 => 'D.x2';
}
main() {
  D().f();
}

So when you say "I'd be fine with an unqualified name being looked up in the surrounding class, not the static type of this", if you mean that you'd want print(x4) to become a compile error, I think that would be too breaking.

It means that if the surrounding class has no member, but the promoted this type does, then a plain foo is an error, but this.foo works. I'm OK with that, but it may be confusing to some users which do not distinguish the two syntaxes.

For extension declarations, I'd do the same. An unqualified identifier which resolves to a surrounding instance member declaration, is involved on the same extension as the current invocation. That means the same extension, same type arguments and the same receiver. It does not use this for anything (other than possibly referencing the receiver). That is:

extension E<T extends num> on T {
  T max(T other) {
    if (this is double) {
      if (this.isNaN) return this;
      if (other is! double) return max(other.toDouble());
    } else if (other is double) {
      if (other.isNaN) return other;
      return (this.toDouble() as T).max(other);
    }
    return this < other ? other : this;
  }
}

Here we promote this to double, then call the unqualified max in scope. That's specified to be equivalent to

  E<T>(this).max(...)

and not to this.max(...).

Again I think that's fine, and we should just keep it

Agreed that this max extension method should still work. I think if we stick with the rule that lexical scope takes precedence, this will work out fine.

Speaking for myself, I want to make implicit and explicit this. behave as similarly as possible. Partly because think it matches how I intuitively think of Dart. Partly because we have a lint suggesting that users remove unnecessary this., and I would hate to complicate the notion of when this. is "unnecessary". So I would advocate for something like this:

eernstg commented 1 year ago

So I would advocate for something like this:

That would be "no changes", with this considered to have the promoted type in the resulting member access. I would be happy about that, too. ;-)

lrhn commented 1 year ago

Ack, I was misremembering what we do for an identifier that it's not in the lexical scope. I thought we first checked whether the surrounding class has a member with that name (not class declaration, so including inherited members), before deciding whether to turn it into this.member. We don't, if it's an instance method (or other this-accessible scope we assume it must mean that, which is also why extension methods can be called with an implicit this.

In which case keeping that behavior is probably safest.

And an identifier resolving to an extension member of the current extension has a potentially different meaning than prefixing with this. Likely only allowing this.foo to target another extension when this is promoted.

Maybe we should just say that if an extension is applicable to an invocation inside itself, it's always most specific (based on location). You can always use an explicit invocation if you mean something else. (Maybe extensions defined in the same library should always be more specific.)

eernstg commented 8 months ago

@mkustermann mentioned the situation where this occurs in an extension declaration in this issue.

Moreover, as @lrhn mentioned here, this is quite similar to a formal parameter or a local variable when it occurs in an extension declaration.

If this in an extension declaration turns out to be easier to handle than this in a class/mixin/enum declaration then we could choose to deal with those two cases separately.

Also note that we already have a similar treatment of the representation variable in an extension type when its name is private (which is relevant because the representation variable in an extension type plays a role which is quite similar to the role played by this in an extension):

extension type E(num _n) {
  void foo() {
    if (_n is int) {
      _n.isEven; // OK, `_n` has type `int` here.
    }
  }
}
lrhn commented 8 months ago

To summarize, also for my own benefit:

Classes and extension types behave mostly the same:

The issues are pretty much the same. Extensions have one case where they're unaffected by this-type, because a call on an instance member of itself is not a this... invocation, but an Ext<T>(this)... invocation - to ensure that it calls the member the name lexically resolved to. We could change the invocation to Ext(this)... and infer the type variables again for a promoted this, that would be consistent with the behavior inside type declarations.

The biggest issue, IMO, is that it just doesn't work for generics. That's a pretty big caveat too.

We can't promote a generic var self = this; any better than we can this. One has to upcast first, getting rid of the T before promoting.

class C<T> {
  T? get value => null;
  void foo() {
    var self = this;
    if (self is C<num>) {
      num? v = self.value; // Not allowed. Not promoted to `C<num>` or `C<T&Num>`.
    }
    C<Object?> self2 = this;
    if (self is C<num>) {
      num? v = self2.value; // OK, did promote.
    }
  }
}

Most of the other issues are not new. They happen just the same if promoting variables. For the implicit this. insertion, we could choose to insert (this as D<T>) instead, the self-reference at its unpromoted type, if we are afraid of the promotion changing typing of implicit self calls. But I'd be worried that that's more confusing in the long run.

eernstg commented 8 months ago

That's a great summary!

@lrhn wrote:

Promoting "T itself" to be (known to be) bounded by int ("type variable promotion") would probably be sound (this is D<int> implies D<T> <: D<int>, implies T <: int), but that'd be a completely new feature.

Certainly, that's the main reason why I created this issue in the first place. It's interesting. ;-)

It's worth noting that we can already write code where a type variable is promoted, but it takes a couple of steps:

[Edit: Now returning a value whose type depends on X.]

// ignore_for_file: non_constant_identifier_names, unused_local_variable

extension _AIntExtension<X extends int> on A<X> {
  List<X> fooAInt(String s) {
    print('Receiver is $s: Stuff which is only applicable when `X <: int`.');
    A<int> v1 = this; // OK.
    A<X> v2 = this; // OK.
    int v3 = x; // OK.
    return [x];
  }
}

class A<X> {
  X x;
  A(this.x);

  List<X> foo(String s) {
    late List<X> result;
    print('Receiver is $s: Stuff that works for all values of `X`.');
    if (this is A<int>) {
      // We need a dynamic invocation to make the type variable bounds
      // check occur at run-time. NB: It is guaranteed to succeed, but the
      // type checker doesn't understand that.
      void typeVariableCaster(List<Y> Function<Y extends int>(A<Y> self) g) {
        result = (g as dynamic)<X>(this);
      }
      typeVariableCaster(
          <Z extends int>(A<Z> self) => _AIntExtension<Z>(self).fooAInt(s));
    } else {
      result = [];
    }
    return result;
  }
}

String typeName(Object? o) => o.runtimeType.toString();

void main() {
  A<int> a1 = A(42);
  A<String> a2 = A('Hello!');
  print(a1.foo(typeName(a1)));
  print(a2.foo(typeName(a2)));
}
lrhn commented 8 months ago

That's a clever way to get the current binding of a type variable bound to a type variable with a more restrictive upper-bound, but it's not what I'd call type variable promotion. It's introducing a second type variable, which is not assignable to the first. If fooAInt returned (its) X, then that return value would not be assignable to the X of class A<X>.

Actual type variable promotion, as I use the term, would mean something like:

class C<X extends num> {
  void foo(X value) {
     if (this is C<int>) { // Assume this promotes `X` from `extends num` to `extends int`.
       int v2 = value; // `X` is now a subtype of `int`!
     }
}

That's what this is C<int> needs to do in order to be able to promote this at all, because C<int> is not a subtype of C<X>, so can never promote C<X> to C<int>. But if we can promote X to X extends int, then we also promote C<X> to be a subtype of C<int>, making this have type C<int> while still having type C<X>.

eernstg commented 8 months ago

it's not what I'd call type variable promotion

So true, I was just confirming that we can achieve the desired typing situation where X has the precise value (like in the body of class A<X> ...), where X <: int is known, and where this has type A<X>.

In particular, we have to say (g as dynamic)<X>(this) because the type checker won't recognize that there is a connection between the type parameter of A and the type parameter of AIntExtension.

I've adjusted the example such that foo has a return type that depends on X. However, it doesn't matter much: We just change the dynamic invocation slightly, and it now has one more dynamic type check (namely: that the returned object is a List<X>). It's all caused by the fact that the type checker doesn't understand that those two type variables have the same value.

I'm sure we can find other reasons why this emulation isn't entirely as powerful as a real type parameter promotion, but that's just one more argument why we might consider having the real thing.

Conversely, you could also say that the emulation technique is worth having in mind, if it is needed once in a blue Moon. ;-)