dart-lang / language

Design of the Dart language
Other
2.65k stars 201 forks source link

Bug: Analyzer (and maybe runtime?) not properly narrowing when using extensions #2879

Open tcf909 opened 1 year ago

tcf909 commented 1 year ago

Summary: I'm expecting to be able to narrow generics when using extensions. We are having to use extensions to properly identify a self return type of T1, but that needs to also come with the ability to narrow the generic within the Extension implementation.

I'm sure there is a valid reason why this is not occuring, but the allowed syntax (narrowing) and the implementation resulting from the syntax currently diverges. Either the syntax needs to be disallowed (which I hope is not the course of action) or the extension needs to correctly use the narrowed generic.

Version:

$dart --version
Dart SDK version: 2.18.5 (stable) (Tue Nov 22 15:47:29 2022 +0000) on "macos_arm64"

Reproduce:

main(){

  final otherClass = OtherClass();

  final test = TestClass();

  test.method(otherClass); // !!! This should be an analyzer + runtime error
}

class OtherClass {}

class OtherClassChild extends OtherClass {}

mixin TestMixin<T extends OtherClass> {}

class TestClass with TestMixin<OtherClassChild>{}

extension TestMixinExtension<T1 extends TestMixin<T2>, T2 extends OtherClass> on T1 {

  T1 method(final T2 should_be_narrowed_to_other_class_child){
    return this;
  }
}
lrhn commented 1 year ago

Having a type parameter which is not linked to any input type, is usually not going to give you anything useful. That's the problem here. The T2 type parameter is not restricted by the receiver type of the expression.

Dart extension resolution does not look at the parameters, so we get no clue to T2 from the argument of otherClass. The extension type parameters have to be solved entirely from the receiver's type.

The test.method(...) call checks whether the receiver is valid for the extension by trying to check whether TestClass is a valid subtype of the extension's on type T1. It tries to solve for TestClass <: T1.

The extension introduces the constraints: T1 <: TestMixin<T2> and T2 <: OtherClass, which must be satisfied by the solution.

The OtherClass <: T2 is not constrainted by that, and it occurs only covariantly in T1's bound, so we can safely choose OtherClass for `T2.

TestClass implements TestMixin<OtherClassChild>, so the first constraint reduces to TestMixin<OtherClassChild> <: TestMixin<T2>, and therefore OtherClassChild <: T2. Since we chose OtherClass for T2, that's satisfied, so we're done. Success, T1 = TestClass and T2 = OtherClass is a solution, and everyone's happy.

You wanted T2 to be OtherClassChild. It doesn't have to be, and the way it's currently implemented, it won't be. Choosing OtherClass is a valid solution.

Maybe we should choose the most precise type which can work, which is OtherClassChild here, but that requires changing the type inference algorithm. (Which is a language issue, not an implementation issue, if we assume the current implementation is working as intended.)

This behavior is not restricted to extension. You see the same for:

void foo<T extends List<S>, S extends Object?>(T list) {
  print(S);
}
void main() {
  foo(<int>[]);  // Prints "Object?"
}

It's still two type parameters where only one is directly restricted by the input value, and the other is indirectly restricted, and where choosing the bound is valid.

Compare it to:

void foo<T extends void Function(S), S extends Object?>(T f) {
  print(S);
}

void main() {
  void bar(int x) {}
  foo(bar);
}

where we completely fail to infer a solution (most likely because it doesn't infer a useful type for S, but our error messages are lacking).

@stereotype441 Would it be possible to do an ordering of the type variable inferencing, like we do for parameters, so parameters depending on other parameters are inferred after their dependencies, and can take the solution for those into consideration?

eernstg commented 1 year ago

We've discussed a particular way to provide the ability to decompose a type argument a number of times. This is an application of type patterns, dart-lang/language#170. We would then be able to write the following:

void foo<X extends Iterable<final Y>>(X iterable) => print(Y);

void main() => foo(<int>[]);  // Prints "int".

The basic idea is that we can introduce a type variable of a generic entity by having the syntax final <identifier> in a bound in a type variable declaration, like final Y above.

When the generic entity is used and actual type arguments are passed, a type variable like Y above is bound to the value which is determined by taking the actual type argument T, finding the superinterface of the form List<S> that T must implement, and binding Y to S.

One of the design parameters of this feature would be the computation of S: Does it come from compile-time information only, or is it looked up at run time based on the actual value of the type argument X? The approach where S is looked up at run time is obviously more powerful, but that needs to be weighed up against the cost (especially, if tree shaking is affected, or some other optimization may no longer be valid).

stereotype441 commented 1 year ago

@stereotype441 Would it be possible to do an ordering of the type variable inferencing, like we do for parameters, so parameters depending on other parameters are inferred after their dependencies, and can take the solution for those into consideration?

Yes, that would certainly be possible. Probably not a terribly easy undertaking, though--the code involved is fairly old and somewhat difficult to reason about.

I did some digging today and it looks to me like just adding an ordering to type variable inferencing wouldn't be enough. Something appears to be getting in the way of the code successfully performing this reasoning step:

TestClass implements TestMixin<OtherClassChild>, so the first constraint reduces to TestMixin<OtherClassChild> <: TestMixin, and therefore OtherClassChild <: T2

I'm not sure what the hang-up was. So part of the work involved in fixing this issue would be to figure out what's going wrong there 😃

tcf909 commented 1 year ago

Something definitely feels like a gap whether its language / implementation I can't really speak to it.

A better example of the analyzer not catching what will be a runtime issue:

main(){

  final otherClass = OtherClass();

  final test = TestClass();

  test.shared_data_set(otherClass); // !!! This should be an analyzer + runtime error

  final otherClassChild = test.shared_data_get();

}

class OtherClass {}

class OtherClassChild extends OtherClass {}

mixin TestMixin<T extends OtherClass> {

  T? shared_data;

}

class TestClass with TestMixin<OtherClassChild>{

  OtherClassChild? shared_data_get(){
    return shared_data;
  }

}

extension TestMixinExtension<T1 extends TestMixin<T2>, T2 extends OtherClass> on T1 {

  void shared_data_set(final T2 should_be_narrowed_to_other_class_child){
    shared_data = should_be_narrowed_to_other_class_child; // This throws
  }
}
tcf909 commented 1 year ago

And a bit of comic irony, the "```dart" function in the markdown does not like to color the thing that seems to be the problem:

image

eernstg commented 1 year ago

@tcf909, the core reason why we can have a type error at run time is the dynamically checked covariance. Here is a version of the example program where we use the feature that adds statically checked declaration-site variance to Dart (dart-lang/language#524):

// Using `--enable-experiment=variance`.

main() {
  final otherClass = OtherClass();
  final test = TestClass();

  /*
     test.sharedDataSet(
         otherClass); // Compile-time error: No such (extension) method.
     TestMixinExtension<TestClass, OtherClassChild>(test).sharedDataSet(
         otherClass); // Compile-time error: `OtherClassChild` expected.
     TestMixinExtension<TestClass, OtherClass>(test).sharedDataSet(
         otherClass); // Compile-time error: Type arguments violate bounds.
  */
  TestMixinExtension<TestClass, OtherClassChild>(test).sharedDataSet(
      OtherClassChild()); // OK.

  final otherClassChild = test.sharedDataGet();
}

class OtherClass {}

class OtherClassChild extends OtherClass {}

mixin TestMixin<inout T extends OtherClass> { // <-- Use declaration-site variance.
  T? sharedData;
}

class TestClass with TestMixin<OtherClassChild> {
  OtherClassChild? sharedDataGet() {
    return sharedData;
  }
}

extension TestMixinExtension<T1 extends TestMixin<T2>, T2 extends OtherClass>
    on T1 {
  void sharedDataSet(final T2 shouldBeNarrowed) =>
      sharedData = shouldBeNarrowed;
}

This illustrates several things:

This means that when we've concluded that the core of the type error problem is the dynamically checked covariance, the missing type inference is the main remaining issue.

Consider the example where we remove the non-covariant usage of T in the first place. We do not use the variance feature now, because there is no unsound covariance (OK, more precisely: unsound covariance is now only used with a private member, and we can do that safely because we can inspect all usages manually, or it's in another library and then they can't access the private member).

main() {
  final otherClass = OtherClass();
  final test = TestClass();

  test.sharedDataSet(
      otherClass); // OK.
  // TestMixinExtension<TestClass, OtherClassChild>(test).sharedDataSet(
  //     otherClass); // Compile-time error: `OtherClassChild` expected.
  TestMixinExtension<TestClass, OtherClass>(test).sharedDataSet(
      otherClass); // OK.
  TestMixinExtension<TestClass, OtherClassChild>(test).sharedDataSet(
      OtherClassChild()); // OK.

  final otherClassChild = test.sharedDataGet();
}

class OtherClass {}

class OtherClassChild extends OtherClass {}

mixin TestMixin<T extends OtherClass> {
  T? _sharedData;
  T? get sharedData => _sharedData;
  set sharedData(OtherClass? value) {
    if (value is T?) _sharedData = value;
    // Just ignore 'bad' values.
  }
}

class TestClass with TestMixin<OtherClassChild> {
  OtherClassChild? sharedDataGet() {
    return sharedData;
  }
}

extension TestMixinExtension<T1 extends TestMixin<T2>, T2 extends OtherClass>
    on T1 {
  void sharedDataSet(final T2 shouldBeNarrowed) =>
      sharedData = shouldBeNarrowed;
}

Without the variance feature, type inference comes up with the type arguments <TestClass, OtherClass>, and they are accepted (it is true that TestClass <: TestMixin<OtherClassChild> <: TestMixin<OtherClass>).

When variance is used, type inference cannot find the type arguments <TestClass, OtherClassChild> even in the case where the invocation is test.sharedDataGet(OtherClassChild()). This is basically because the type arguments associated with receivers of member invocations have no context.

C<X> whatever<X>() {
  print(X);
  return C<int>() as C<X>;
}

class C<X> {
  X get g => 1 as X;
}

void main() {
  int i = whatever().g; // Prints 'dynamic'.
}

We might assume that the context type int makes whatever().g infer as whatever<int>().g. However, that's not true, there is no context type for the receiver of a member invocation, and this means that we get whatever<dynamic>().g, and that causes the print(X) to print 'dynamic', and this is the same phenomenon that makes type inference attempt to use <TestClass, OtherClass> with an invocation like test.sharedDataSet(otherClass), and then there is no applicable extension method.

There is a proposal about providing some kind of context to receivers, but that proposal may still change, and it might not be adopted, and it hasn't been implemented at all (as far as I know).

In summary, we know how to make this safe, but we don't quite know how to make the type inference step succeed when it is done in a safe manner.

tcf909 commented 1 year ago

@eernstg I really appreciate your in depth explanation and examples.

tcf909 commented 1 year ago

I know I'm being redundant, but I continue to run in to the same issue over and over in implementing some libraries.

In general, I want to return the class itself in a type safe manner. The only real way I've found to do that is using extension unless you pass the self type down a chain in generics (which is troublesome for other reasons). In using an extension without narrowing I loose the ability for the implementing code to provide typesafe values passed via generics against the mixin class.

As I relook at the code examples I provided, I have to imagine the static analyzer has enough static information to narrow things. Isn't this a much simpler problem that we are making it?

 final test = TestClass(); // At this point I statically know the makeup of TestClass, the mixins it uses, the generics of those mixins, etc.

 test.shared_data_set(otherClass);  // At this point I statically know enough that I'm applying an extension to a class. I statically know which class, the makeup of the class, and the generic constraints of the class I'm applying the extension to.

If I know the makeup of TestClass statically, and I can apply an extension statically, isn't it just a matter of the static analyzer code going one level deeper and applying the constraints it discovers as it walks what I would assume is some sort of ast like tree containing the static information and apply the remaining code it is analyzing.

I'm not a PL guy, but it seems we would have enough static information to do the appropriate narrowing.

Kind of like how:

var whatever = 'abc';

whatever as int; // statically narrows

var runtimeErrorButCompiles = whatever.modPow();

Example of self return which is just another variation of the problem above:

main(){

  final otherClass = OtherClass();

  final test = TestClass().self_return(otherClass); // !!! This should be an analyzer + runtime error

}

class OtherClass {}

class OtherClassChild extends OtherClass {}

mixin TestMixin<T2 extends OtherClass> {

  void whatever(final T2 value){
    print(value);
  }
}

class TestClass with TestMixin<OtherClassChild>{

}

extension TestMixinExtension<T1 extends TestMixin<T2>, T2 extends OtherClass> on T1 {

  T1 self_return(final T2 should_be_narrowed_to_other_class_child){
    this.whatever(should_be_narrowed_to_other_class_child); // This ends up throwing a runtime error because TestClass T2 is OtherClassChild
    return this;
  }
}
lrhn commented 1 year ago

what I would assume is some sort of ast like tree containing the static information and apply the remaining code it is analyzing

That's now how type inference works. It's based on constraints solving, not trying to dig a solution out of the existing types. If your code allows multiple solutions, which is usually the case if you have type parameters that are not directly constrained by the context, it finds a solution, but not necessarily the solution you'd prefer.

  final otherClass = OtherClass();
  final test = TestClass().self_return(otherClass);

Here the satic type of otherClass is OtherClass, with superinterface Object. The static type of TestClass() is TestClass, which implements the superinterfaces TestMixin<OtherClassChild> and Object.

The invocation of self_return tries to match the receiver type TestClass against T1 with the constraints T1 extends TestMixin<T2> and T2 extends OtherClass

The only upper bound constraint on T2 is extends OtherClass. The constraints on T1 are TestClass <: T1 and T1 <: TestMixin<T2>.

Since T2 only occurs covariantly in T1 <: TestMixin<T2>, choosing the maximally valid valid for T2 is a good guess. So we let T2 be OtherClass. Choosing any subtype would only make it harder to find a solution to the constraints.

Then TestClass <: T1 and T1 <: TestMixin<OtherClass> can be solved in several ways. It chooses the minimal possible type here TestClass.

The solution is correct, TestClass <: TestClass, TestClass <: TestMixin<OtherClass> and OtherClass <: OtherClass.

It's not the one you want, but the constraints are underconstrained, and doesn't guarantee the solution you want, and the algorithm cannot guess which solution you want, so it just does its best to find a solution.

More precisely: You want to narrow T2 to a more precise type, but if one looks at the API of the extension method, T2 occurs contravariantly, so by not narrowing, it makes self_return more permissive. That could be a boon in other cases. (Not that the algorithm looks at the API at all, but it's an argument that the algorithm isn't necessarily doing something that's less optimal in all cases. It just doesn't work well in this case.)