dart-lang / language

Design of the Dart language
Other
2.67k stars 205 forks source link

Type Patterns #170

Open eernstg opened 5 years ago

eernstg commented 5 years ago

In response to #169, this is a proposal for a mechanism called type patterns. A type pattern P may contain certain syntactic elements that introduce a new type variable (look for var), and it supports a check for whether a given type matches the pattern P. It is needed for various kinds of class extensions / extension methods under consideration.

The following spells out what a type pattern is and how it works, to a level of detail which is intended to support the assumption that the concept can be given a precise definition and that usages of this concept can be sound.

Examples

The basic construct defined in this proposal is a type patterns clause, which is a comma separated list of type patterns. Here are some examples, one per line, along with some comments giving hints about the meaning of the given construct.

// Match a single, fixed type with a type pattern.
int
List<int>
void Function()

// Match several fixed types with a type patterns clause.
A, B
Iterable<int>, List<dynamic>

// Match any type, and bind `X` to it.
var X

// Match any type `T` which satisfies `T <: C<T>`, and bind `X` to it.
var X extends C<X>

// Match any type of the form `Iterable<T>` and bind `X` to `T`;
// also, e.g., match `List<T>` and bind `X` to `T`.
Iterable<var X>

// Match any type of the form `T Function()` and bind `X` to `T`;
// also, e.g., match `T function([int])` and bind `X` to `T`.
var X Function()

// Match any type of the form `T Function(S, [int])`,
// binding `X` to `T` and `Y` to `S`;
// also, e.g., match `num Function(double, [num, num])`,
// binding `X` to `num` and `Y` to `double`.
var X Function(var Y, [int])

// Match `List<int>`, binding `X` to `int`; do not match `List<String>`.
Iterable<var X extends num>

// Error: cannot introduce two type variables with the same name
// in one type patterns clause.
var X, List<var X>

// Match `Map<num, num>` and bind `X` to `num`;
// match `Map<num, int>` and bind `X` to `num`;
// do not match `Map<int, num>`.
Map<var X, X>

// Error if subtype robustness required (e.g., for extension methods):
// Cannot declare a match bound in a contravariant position.
extension A on void Function(var X extends num) { ... }

First note that every type is also a type pattern, so it is always possible to use any particular type as a pattern. The intuition behind this is that it matches the specified type.

Next, when a type is a subtype of a type pattern which is also derivable from <type>, it also matches. So int matches num, and all types match dynamic.

Finally, a type patterns clause containing primitive type patterns (that is, where var occurs) matches the types as described in the comments above, which generally means that T matches P when it is possible to extract a value for each type variable introduced by a primitive type pattern in P which satisfies the bounds, if any, and then T matches the result.

Syntax

This proposal extends the Dart grammar with the following rules:

<typePatterns> ::= <typePattern> (',' <typePattern>)*

<typePattern> ::= <functionTypePatternTails>
  | <typePatternNotFunction> <functionTypePatternTails>
  | <typePatternNotFunction>

<functionTypePatternTails> ::= <functionTypePatternTail> <functionTypePatternTails>
  | <functionTypePatternTail>

<typePatternNotFunction> ::= <typePatternNotVoidNotFunction>
  | 'void'

<functionTypePatternTail> ::=
  'Function' <typePatternParameters>? <parameterTypePatternList>

<typePatternNotVoidNotFunction> ::= <typeName> <typePatternArguments>?
  | 'Function'
  | 'var' <typeIdentifier> ('extends' <type>)?

<typePatternParameters> ::=_T<sub>j</sub>
  '<' <typePatternParameter> (',' <typePatternParameter>)* '>'

<parameterTypePatternList> ::= '(' ')'
  | '(' <normalParameterTypePatterns> ',' <optionalParameterTypePatterns> ')'
  | '(' <normalParameterTypePatterns> ','? ')'
  | '(' <optionalParameterTypePatterns> ')'

<typePatternArguments> ::= '<' <typePatternList> '>'

<typePatternParameter> ::=
  <metadata> <typeIdentifier> ('extends' <typePatternNotVoid>)?

<normalParameterTypePatterns> ::=
  <normalParameterTypePattern> (',' <normalParameterTypePattern>)*

<optionalParameterTypePatterns> ::= <optionalPositionalParameterTypePatterns>
  | <namedParameterTypePatterns>

<typePatternList> ::= <typePattern> (',' <typePattern>)*

<typePatternNotVoid> ::= <functionTypePattern>
  | <typePatternNotVoidNotFunction>

<normalParameterTypePattern> ::= <patternTypedIdentifier>
  | <typePattern>

<optionalPositionalParameterTypePatterns> ::=
  '[' <normalParameterTypePatterns> ','? ']'

<namedParameterTypePatterns> ::=
  '{' <patternTypedIdentifier> (',' <patternTypedIdentifier>)* ','? '}'

<functionTypePattern> ::= <functionTypePatternTails>
  | <typePatternNotFunction> <functionTypePatternTails>

<patternTypedIdentifier> ::= <typePattern> <identifier>

We use the phrase type patterns clause for terms derived from <typePatterns>, and type pattern for terms derived from <typePattern>.

Note that there is currently no way in the Dart grammar to use a type pattern or a type patterns clause; it is up to other extensions of Dart to introduce syntactic locations where these constructs can occur.

Static Analysis

Every occurrence of a type alias F<T1, .. Tk> in a pattern is replaced by the expansion [T1/X1, .. Tk/Xk]B, where B is the body of F and Xj, 1 <= j <= k are the type parameters of F. A non-generic type alias is covered by the case k = 0.

Let Ps be a type patterns clause. For any <typeIdentifier> X, it is a compile-time error if Ps contains two or more primitive type patterns introducing X. Otherwise we say that Ps introduces the set of type variables that are introduced by primitive type patterns in Ps.

The variance of a position in a pattern is declared in the same way as for positions in a type. It is a compile-time error if a primitive type pattern with a bound occurs in a contravariant position. It is a compile-time error if a primitive type pattern occurs in an invariant position. (We could make the latter a syntax error, but then the expansion of type aliases could introduce syntax errors, so it's simpler to allow the syntax and make it a non-syntax compile-time error.)

Let Ps of the form P1, .. Pk be a type patterns clause that introduces the type variable X in Pj for some j in 1..k; it may also introduce other type variables in any of its type patterns. Let T be a type. The type S is the type that corresponds to X in Ps if and only if S is the type that corresponds to X in Pj.

The notion of a type that corresponds to a type variable introduced by a type pattern is defined in terms of the rules stated below. We need the special placeholders Up and Down which will be used during the computation of corresponding types, but which will never occur in the resulting bindings.

Consider the situation where we wish to determine whether a type pattern P matches a given type T. The first step taken is to substitute Down for every covariant occurrence of Null in T, and Up for every contravariant occurrence of Null in T, yielding the term t. In the following we will consider Up and Down as names of types. (So Iterable<Up> is a superinterface of List<Up>, etc.) Subsequent steps are specified in the following rules:

Let P be a type pattern of the form var X or var X extends S. In this case X corresponds to Object with respect to (P, Up) (note that the bound cannot exist in this case because P occurs in a contravariant position), and X corresponds to Null with respect to (P, Down). Let P be a type pattern of the form var X or var X extends S, and let t be a term different from Up and Down. In this case X corresponds to [Null/Up, Null/Down]t with respect to (P, t).

Let P be a type pattern of the form C<P1, .., Pj, .., Pn> where Pj is a type pattern that introduces X. In this case X corresponds to U with respect to (P, Up) if X corresponds to U with respect to (Pj, Up); X corresponds to U with respect to (P, Down) if X corresponds to U with respect to (Pj, Down).

Let P be a type pattern of the form C<P1, .., Pj, .., Pn> where Pj is a type pattern that introduces X, and let t be a term which has a direct or indirect superinterface of the form C<t1, .., tj, .. tn> (note that t then cannot be Null, Up, or Down). In this case X corresponds to U with respect to (P, t) if X corresponds to U with respect to (Pj, tj).

Note that the match cannot be with respect to (P, Null), because this implies that the original pattern had an invariant occurrence of a primitive pattern introducing X, and that's an error. The same argument applies in the following cases, which is the reason why matching with respect to (P, Null) is not mentioned.

Let P be a type pattern of the form P0 Function<...>(...) where P0 is a type pattern that introduces X. In this case X corresponds to U with respect to (P, Up) if X corresponds to U with respect to (P0, Up), and X corresponds to U with respect to (P, Down) if X corresponds to U with respect to (P0, Down). Let P be a type pattern of the form P0 Function<...>(...) where P0 is a type pattern that introduces X, and let t be a term of the form s Function<...>(...). In this case X corresponds to U with respect to (P, t) if X corresponds to U with respect to (P0, s). In all cases, matching has failed if the corresponding type has a free occurrence of one of Y1..Ys.

Note that non-generic function types are handled by the special case where <...> declares zero type parameters, in which case it is omitted. We do not require that the elided parts of the function types have any particular relationships with each other, because we will apply a subtype check afterwards in order to ensure that matching only succeeds when the types are appropriately related. Similar considerations apply for similar situations below.

Let P be a type pattern of the form P0 Function<Y1 extends B1, .. Ys extends Bs>(..., Pj, ...) where Pj is a type pattern that introduces X. In this case X corresponds to U with respect to (P, Up) if X corresponds to U with respect to (Pj, Down), and X corresponds to U with respect to (P, Down) if X corresponds to U with respect to (Pj, Up). In all cases, matching has failed if the corresponding type has a free occurrence of one of Y1..Ys.

Let P be a type pattern of the form P0 Function<Y1 extends B1, .. Ys extends Bs>(..., Pj, ...) where Pj is a type pattern that introduces X, and let t be a term of the form t0 Function<Y1 extends B1, .. Ys extends Bs>(..., tj, ...). In this case X corresponds to U with respect to (P, t) if X corresponds to U with respect to (Pj, tj). In all cases, matching has failed if the corresponding type has a free occurrence of one of Y1..Ys.

(TODO: Spell out treatment of optional parameters for all forms of function type. Double-check the approach to generic function types.)

Let Ps of the form P1, .. Pk be a type patterns clause introducing type variables X1 .. Xs, and T a type. T matches Ps if and only if there exist types T1 .. Ts such that, for each j in 1..s, Tj satisfies its bound, if any, and Xj corresponds to Tj with respect to (Ps, T), and T is a subtype of [T1/X1 .. Ts/Xs]Pj for each j in 1..k.

Note that "exist types" does not imply that a costly search must be performed: Each step in the matching algorithm proceeds with a subterm of the given types being matched up (except for the super-interface step, where we may need to traverse the whole superinterface graph in order to find a type which is sufficiently similar to the pattern that it is being matched against), and then type variables are bound basically by being "looked up". If that process succeeds then we have performed an amount of work that is similar to a subtype check. If it fails then such types do not exist.

Subtyping Property

We say that a type patterns clause Ps is subtype robust when none of the type variables introduced by Ps occur except in the type pattern that introduces it, no primitive type pattern occurring in a contravariant position has a bound, and no primitive type pattern occurs as the return type or as a parameter type of a generic function type.

For example, var X and Map<var X, var Y> are subtype robust. But Map<var X, X> is not, because X occurs twice; this destroys subtype robustness because we could match statically with Map<num, num>, binding X to num statically, and the run-time type could then be Map<int, num>, which is a subtype of Map<num, num> but which fails to match Map<var X, X>.

Similarly, A<var X>, B<var Y extends X> is not subtype robust because X occurs twice; this destroys subtype robustness because the run-time type could implement A<int> and B<num> (which means that matching will fail), but the static type could still have A<num> and B<num> as superinterfaces.

Finally, Function(var X extends num) is not subtype robust because X has a bound and occurs contravariantly. This breaks subtype robustness because the static type could be Function(num) and the run-time type could be Function(Object), and this fails to match because X <: num would be violated.

We consider the following property to be likely provable: Assume that Ps is a subtype robust type patterns clause and T is a type such that Ps matches T, binding its type variables X1 .. Xs to types T1 .. Ts; assume that S is a subtype of T; then Ps matches S, binding its type variables to S1 .. Ss, and for each j it is guaranteed that if Xj occurs covariantly in Ps then Sj <: Tj, if Xj occurs contravariantly in Ps then Tj <: Sj. (It is a compile-time error of Ps if Xj occurs invariantly.)

Discussion

Subtype robust type patterns clauses are particularly useful with static extension methods, because we want to ensure that a subtype will successfully match a type patterns clause whenever a supertype is known to do so, because this means that we can safely match against the dynamic type in order to obtain access to the actual values of type arguments on the receiver type.

Type patterns that are not subtype robust are particularly attractive with dynamic tests. For instance, a test that x is Map<var X, var Y extends X> will only match when the dynamic type of the value of x is a subtype of Map<exactly S, exactly T> such that T <: S. This means, for instance, that we can use values from this map as keys in the same map. The ability to establish such type relationships is a significant expansion of the expressive power of Dart.

The special treatment of Null (and, in the future, an explicit bottom type would get a similar treatment) ensures that subtypes containing Null will have well-defined bindings for all type variables introduced by a type patterns clause. Otherwise, the pattern List<List<var X>> could soundly bind X to any type when matched with List<Null>.

In some sense that choice may seem to be unimportant, because any actual elements of the list well be the null object, so there is no way to disprove that "if this element had been a list, it's type argument would be T", for any T, but the current specification ensures that the chosen binding of X in the example will be such that the matched type (List<Null>) is "as close as possible" to the matching type (List<List<Null>>, based on the pattern and the binding of X to Null), which means that whenever a type S matches the pattern and is a supertype of the matched type, it is also a supertype of the matching type. This means that an extension method may be invoked on a receiver of type List<Null> whose static type is List<List<T>> for some T, and that method may create and return a List<List<Null>> based on the binding of X, and the returned list will then actually have the type List<List<T>>, no matter which T it is.

The notion of a 'corresponding type' for a type variable with respect to a pattern and a type has a simple intuitive interpretation: We simply "look up the actual type argument at the same position as the given type variable in the pattern".

The example where the pattern Map<var X, X> will match Map<num, int> and bind X to num shows that this allows for a certain flexibility. But the pattern Map<X, var X> with the same type illustrates that it is possible to select a value for X (namely num) such that the given type Map<num, int> is a subtype of the type Map<num, num> produced by the match. Still, the match fails because it only considers binding X to int. This example illustrates why it makes sense to think about the occurrence of X which does not have the marker var as a "constraint".

It would also be possible to use a pure constraint based approach where all occurrences of X are treated as potential sources of information about possible values of X (as constraints on X). This would be more powerful than the approach that we have actually chosen, but also considerably more complex, presumably both in the implementation and when reasoning about what it does.

We believe that the chosen approach is a useful trade-off between expressive power and complexity, because it helps keeping the complexity of the matching operation low, and it keeps the results more predictable, especially in the case where an interface type has a type argument which is used in a contravariant location in a member signature.

eernstg commented 5 years ago

Yes, Map<var X, X> will only bind X, but Map<var X, var Y extends X> will bind both X and Y, so only the latter will give access to the 2nd type argument of the dynamic type of the target object at Map. For instance:

main() {
  Object map = ...; // Assume non-empty.
  if (map is Map<var X, X>) {
    X x = map.values.first; // Safe (upcast).
    map[x /*safe (same type)*/] = x; // Dynamic check on passing `x`.
  }
  if (map is Map<var X, var Y extends X>) {
    Y y = map.values.first; // Safe (same type).
    X x = y; // Safe (upcast).
    map[y /*safe (upcast)*/] = y; // Safe to pass `y`.
  }
}

But they are similar in that they will both match exactly those types S where Map<S1, S2> is a superinterface of S and S2 <: S1.

lrhn commented 5 years ago

The runtime type can be num, for example:

Map<var X, X> map = <num, int>{1: 1};

This assignment should succeed, implying that the type pattern match was successful. (You can't access the X variable anywhere, but the constraint that it introduced was enforced).

eernstg commented 5 years ago

@lrhn already mentioned an example where we'd allow type patterns as variable type annotations (which is not a case that I've considered, but we could do that and then put X in the current scope, and note that map has static type Map<invariant X, X> such that certain dynamic checks in the subsequent code can be omitted).

Just to make it fully explicit: Yes, var X is intended to facilitate dynamic binding of a type variable to the value which is obtained by inspecting the dynamic type of a given instance, as part of the matching process.

However, it's also relevant to perform matching for a given static type with a given pattern. In particular, that's the way we'll decide whether a given extension methods is applicable or not with a given receiver expression.

The notion of being 'subtype robust' is needed with extension methods, because we wouldn't want a matching process to fail at run time. That is, we must establish the guarantee that if the static type matches then the dynamic type will also match (and we rely on the general soundness property that the dynamic type of any given instance is always a subtype of any static type we could encounter for that instance).

is it true that Map<var X extends Y, var Y extends X> will match iff X is exactly the same as Y (that is, <num, num> will match, but <num, int> won't?

Yes, that is true. But this is not a subtype robust pattern, so you would not be able to use it to specify applicable receiver types for some extension methods. If we allow type patterns in type tests then you could use it like this:

void foo<U, V>(Map<U, V> map) {
  if (map is Map<var X extends Y, var Y extends X>) {
    // Known: `map is Map<invariant Z, invariant Z>` for some `Z`.
    // We also know `Z == X == Y`, but the type system only knows that
    // `X <: Y` and `Y <: X`, so we may need to use `X` and `Y` carefully,
    // unless we get support for invariance such that we can say it directly:
    if (map is Map<invariant X, invariant X>) {
      // It's possible that the type system won't ever be smart enough to know
      // that there is no need to generate code for the above test: It's guaranteed
      // to be true, but it will probably be checked.
      ...
    }
  }
}
eernstg commented 5 years ago

use [type patterns] in the "on" clause of extension methods.

For that, it's required that the type pattern is subtype robust.

extension Foo<X,Y>
    on Map<var X extends Y, var Y extends X>, X extends num, Y extends num { ... }

This is a compile-time error because the given type pattern is not subtype robust: X and Y both occur 3 times, and anything above 1 eliminates subtype robustness.

The problem is exactly what you say: We can check whether there is a match with the statically known type of any given receiver, but that doesn't provide a guarantee that a match on the dynamic type will succeed. Such a guarantee only exists with subtype robust patterns.

But for usages where we do not insist that the match must be guaranteed to succeed at run time, there is no problem:

bool foo(Map map) =>
    map is Map<var X extends Y, var Y extends X>, X extends num, Y extends num);

main() {
  foo(<num, num> {1: 1}); // True.
  foo(<int, int> {1: 1}); // True, also if we cast it to `Map<num, int>`.
  foo(<num, int>{}); // False.
}
eernstg commented 5 years ago
extension Foo<X,Y>
    on Map<var X extends Y, var Y extends X>, X extends num, Y extends num {
  void foo() {} // No need to use `X` or `Y`, any old method will do.
}

main() {
  Map<num, num> map = <num, int>{};
  map.foo(); // Static match succeeds, dynamic match fails.
}

So the problem is simply that we have a "no such extension method" event (similar to noSuchMethod for regular instance method invocations), and that should never be possible for a statically checked invocation of an extension method (which means: any invocation of an extension method, because they simply cannot be invoked on a receiver of type dynamic).

the "on" clause gets satisfied only if the compiler knows the exact types X and Y,

We could get this kind of knowledge if Dart is extended with invariance (#229, #214), but otherwise it would basically destroy the usefulness of such an extension (because Dart today has only few kinds of expressions whose type is exact, and it is not even specified). Because of this, I'm proposing that extensions just cannot have type patterns that aren't subtype robust.

eernstg commented 5 years ago

Yes, I do want to maintain a distinction here.

When a method parameter is covariant there will be a dynamic type check, and that's an unavoidable consequence of having (1) covariance for generic classes, and (2) the ability to use a type variable of the enclosing class in a contravariant position in the signature of a class member. We will probably have some sort of invariance which will make it possible to eliminate this kind of dynamic check (gradually). The invocation of operator []= that you mention has just such a check on its 2nd argument.

A member lookup is technically a different thing than a dynamic check on the type of an instance, and I think it makes sense to consider it different conceptually as well. We already have a clear and sound model for this area: A dynamic member lookup is inherently capable of failing; every other member lookup is statically safe. So if we promise that e has a foo at compile-time then e will have a foo.

If we accepted e.bar(42) at compile-time because extension Bar has a suitable bar method and the static type of e matches Bar, but at run time the extension Bar cannot be used after all (because the dynamic type of the instance obtained by evaluation of e does not match Bar's requirements), then it is as if e had a bar method statically, but it does not exist dynamically. I think the consistent treatment of this situation is to make it a compile-time error. This is different from passing an ill-typed argument to bar, because there is no bar and we never get far enough to even consider which parameter types bar could have declared, nor whether any given parameters will fit.

eernstg commented 5 years ago

"I acknowledge that calling extension method "foo" can fail in runtime, but I take all blame upon myself".

Can do! (.. if we allow type patterns for promotion):

main() {
  var map = ...;
  if (x is Map<var X extends Y, var Y extends X>, X extends num, Y extends num) {
    map.foo();
  }
}

You can't really use a "testAndPromoteToNonNull" ! like operator to specify that an extension method invocation must be attempted, because we need to know at compile-time which static extension to select (at least, I certainly wouldn't want a construct which will just search all the extensions in scope and see if one of them works).

It might be useful to introduce an explicit matching test, which would specify a static extension by name and evaluate to true iff the given object matches the type pattern in the given extension:

main() {
  var map = ...;
  if (x matches Foo) map.foo();
  // Or maybe even..
  (x as Foo).foo();
}

But that's an extension that we can add on later, if it turns out to be sufficiently useful in practice.

noSuchMethod

I'm not convinced that it would be sufficiently useful. ;-)

eernstg commented 5 years ago

(Static overloading is the Oxycontin of programming. ;-)

eernstg commented 5 years ago

Scoped class extensions are intended to use type patterns and the associated matching to handle generics (that was one of the goals for having type patterns in the first place).

I worked on the proposal for doing the same thing with scoped static extension methods first, because they are a bit simpler, but the idea would be exactly the same.

extension ShortToString on Iterable<var X> {
  // Similar to `toString`, but abbreviates the elements to an ellipsis.
  String shortToString => "<$X>(...)";
}
on List<var X> {
  String shortToString => "<$X>[...]";
}
on Set<var X> {
  String shortToString => "<$X>{...}";
}
eernstg commented 5 years ago

I haven't worked my way through all the implications, so there might be some corner cases that we need to handle, but I expect arbitrary type patterns to be applicable to scoped class extensions (#177) as long as they are subtype robust. The example you mention should not create any problems:

extension ShortToString on Iterable<var X extends num> {
  // Similar to `toString`, but abbreviates the elements to an ellipsis.
  String shortToString => "<$X>(...)";
}
on List<var X extends num> {
  String shortToString => "<$X>[...]";
}
on Set<var X extends num> {
  String shortToString => "<$X>{...}";
}

As an example of something that would violate subtype robustness, you would not be allowed to introduce stronger constraints on subtypes:

extension Whatever on Iterable<var X> {
  bar() { ... }
}
on List<var X extends num> { // Compile-time error! See `main` to see why.
  bar() { ... }
}

main() {
  Iterable<Object> it = <String>[];
  it.bar(); // Dynamic match failure: A `List<T>` only matches when `T <: num`!
}

This is a plain subtype robustness violation, because the dynamic type List<String> fails to match, even though there is a static match (it is an Iterable<Object> statically, and that matches Iterable<var X>).

eernstg commented 5 years ago

we might not need multiple "on" variants when we have a common interface for all variants

What you don't get, then, is all the usual sanity checks on methods: If you have such a manually written chain of ifs, how do you obtain the normal inheritance and override relationships for methods?

extension E on A {
  foo() {} // Inherited
  num bar(int i) => 2.0; // Overridden
}
on B /*subclass of A*/ {
  int bar([num n]) { foo(); return 3; }
}

If you are going to express a bunch of methods like this, some of them inherited and some of them overridden here and there in the target hierarchy, some of them changing signature in the override, how would you then be able to get the "proper" return type (if you call bar on a receiver whose static type is B then the returned value should be given type int, not num)? How would you get your override relationships checked properly? In particular, bar on a B accepts a num, optionally, and maybe you got it wrong that this is an OK override for a required int parameter, but the compiler won't know that you intend those ifs to model a bunch of methods with different signatures. Also, how would you enable a call site where the receiver has static type B to call bar without an argument? Etc. etc.

So the point is that you can use ifs and type tests to run the piece of code that you want, but you can't give that piece of code the same detailed sanity checks and typing that you can with regular methods.

Of course, it would also be possible for a compiler to compile the invocation of foo in B.bar to a direct jump, because it is a baked-in property of scoped class extensions that they are sealed. That level of performance would also be difficult to achieve with a bunch of static extension methods containing chains of if statements.

eernstg commented 5 years ago

more than the parity

The goal is not to avoid expressive power, the goal is to have a mode of thinking which will allow developers to understand Dart as a whole with a minimal amount of accidental complexity. For instance, if it will inform the developer about almost everything which is relevant to think "an extension method is a method" then that's a good way to avoid having a lot of unnecessary complexity built into the notion of an extension method in the first place.

Then there may be a need to think about the fact that an extension can't have other things than instance methods (e.g., no state), and hence some programming idioms won't work. But that's a constraint, it is not an additional amount of complexity being piled on the semantics of an extension method as such.

On top of that we have one new concept: At the call site, the actual type arguments of the receiver at the specified type pattern are extracted as specified in the type pattern of the extension (and that's a genuinely new mechanism). The ability of a method in an extension to use the type variables which are declared in enclosing scopes is again the same as usual: The type variables can be used just like other type variables, and they are known to have an actual value which satisfies the declared bound.

the parity in question is not feasible

It wasn't a goal, so we shouldn't work ourselves to death to get it. The important point is that comprehension of the mechanism should not be hindered by accidental complexity.

variants should include the case "none of the above"

It is not just "none of the above", because a third-party class could implement any of the classes that we do have a case for. The idea is that we will run the code for A on an instance of a third-party class that is a subtype of A and not a subtype of any of the other cases; and the code for B on an instance that is a subtype of B (hence also A, but none of the others), etc.

If you claim to "be a B" then the code for B ought to work for you!

Of course, this is one thing that developers of extensions will have to have in mind, and it is something that third-party classes may not be built for.

But this is a matter of culture: If we're working with extensions of an important type hierarchy, and third-parties provide new nodes to that type hierarchy (new subtypes) which are not just slight variants of the ones that we already have a case for (the code "ought to work for" a slight variant!), then it may be necessary to extend the extension.

I showed one way to do that: Create a separate interface EvalThirdParty and allow those third-party classes to implement that interface, and then use a type test (if (o is EvalThirdParty) return o.evalThirdParty()) to delegate the work for an extension method to a regular instance method.

The culture part would then be to put pressure on those third parties to implement that interface. Alternatively, we could possibly integrate some of those third-party classes into the main subtype hierarchy, and add some cases to the extension to handle them.

This is a more powerful mechanism than regular instance methods, because it will extend a subtype hierarchy with methods supporting late binding (aka OO dispatch), and they are automatically "inherited" by all subtypes. So if you're working on methods where the rationale is applicable ("you claim to be a B, so we will run the B code for you, even if you're an instance of B_ThirdParty that we never heard about") then this will do things that regular instance methods can't do.

I think the trick will be to use regular instance methods whenever that is possible, and then switch to extension methods (static or dynamic, depending on your needs) when needed, and at all times be aware of the rationale. For instance,

eernstg commented 5 years ago

There's nothing like VTABLE that can be used to look up the actual method in runtime

When a scoped class extension (#177) has a set of target types which are distinct classes (that is, stuff like extension on A .. on B<int> .. on C<var X, var Y> is included, but extension on List<A> .. on List<B> is not, because List occurs more than once), we can use a mapping directly from the class to the relevant extension object (there is one extension object which will do everything we need for an A, and another one for B, etc). That map takes a class and returns an entity, which is a task that we can solve using a table per extension. We could use a hash map, or we could work a little harder and use a fixed, statically known index for the given class. With the latter, it would be precisely like a vtable lookup.

However, the returned result is not a pointer to code, it is an extension object. The next step is to invoke the extension method on that extension object, and that's a completely normal method invocation.

So it's a double-dispatching mechanism (the cost is about the same as two method invocations), which shouldn't be too surprising, given that we have a bunch of classes and a bunch of extensions, and we are calling the right extension method for that class.

A lot of explaining is necessary

Trying. ;-)

var x = [1, 2, 3];
List<num> y = x;
var sum = y.sum();

I can't find any methods named sum, but if we are invoking an extension method sum on a receiver of type List<num> then the basic (unoptimized) approach would be to check every case of the extension from the end and backwards, and running the code for the first case where the run-time type of the receiver matches the given type pattern (say List<var X>). "Running the code" means obtaining the extension object o and executing o.sum(y) (or o.sum<num>(y), if the case has specified that it needs access to the type argument).

If the cases have distinct classes (and the compiler generates "vtable" entries for all classes that it knows about at the location where the extension is declared, that is, including the ones that the platform can return) then we can expect to look up the correct extension object o in a single step (from the "vtable" or from a hash table, or whatever the compiler has chosen to use).

So we can certainly have some (worst) cases where the selection of the correct extension object is linear in the number of cases in the extension, and we can have other cases where it is O(1). It is possible that we can always optimize to some extent, but it will certainly never have to be worse than a linear search in the list of cases (which is what we would have with a chain of if statements).

Then how "var X" comes into play

If we have cases that target the same class and only differ in type arguments, using patterns that differ in subtle ways, then it is possible that we can't optimize anything: We will need to perform the matching operation on the more specific pattern, and continue with the next more general one if it fails. Each matching operation will bind type variables along the way, and if this is a violation of the declared bound then the match fails.

eernstg commented 5 years ago

[Assume that] we write 3 variants of the method "sum": one for Iterable<num>, another for Iterable<int>, third - for Iterable<double>.

Ah, I overlooked that, then we would have the following:

extension E on Iterable<num> { // Extension object: eoE1.
  sum() => e1;
}
on Iterable<int> { // Extension object: eoE2.
  sum() => e2;
}
on Iterable<double> { // Extension object: eoE3.
  sum() => e3;
}

Again, unoptimized dispatch on y.sum() would amount to

y is Iterable<double>? eoE3.sum(y) : y is Iterable<int>? eoE2.sum(y) : eoE1.sum(y)

where we would in general use something like a let to ensure that the receiver is only evaluated once (but here it doesn't matter, y is a local variable).

An optimization strategy which could be applied here is to note that we have a "lifted" set of distinct class types, so we could just extract the actual type argument of the receiver at Iterable, and perform a lookup from that class to the correct extension object (eoEj, for some j), and then the unoptimized path would only be used for other types (here: it could only be Null, so we could actually just include that into the table).

eernstg commented 5 years ago

@tatumizer wrote:

Important: the bundle is identified based on static type only.

Exactly. (I say 'an extension' where you say 'an extension bundle', and I say 'a case of an extension' where you say 'an extension'; I'll stick to the terminology that I've used so far).

This decision process is not trivial, but it is not very complex, either: A match between a type and a pattern produces a 'matched type', Basically, it's the pattern with the chosen value for each primitive type pattern inserted; so when List<int> matches Iterable<var X extends num>, the matched type is Iterable<int>. And then we take the usual step of selecting the most specific matched type and raising an error if there is no such type (that is, there are zero matching extensions, or there are multiple, and none of them is most specific).

What to do if the type is neither exactly Literal nor exactly Sum?

The description after that sentence is exactly correct. It is always correct to search backwards through all cases, and we have a guarantee that there will be a match, but for an exact match (like "the dynamic type is Literal) we can optimize.

the notion of "decision tree"

Given that disambiguation is handled by forcing a linear specification (so the author of the extension must decide which is more specific when two types are not subtypes of each other), it is actually not necessary to consider a tree, a simple sequence will do. It might be possible to optimize various special cases using some search trees, but I don't have a final overview of that.

counterexample

In the given example the static type Iterable<num> matches the initial case in the extension (and we assume that there are no conflicts, e.g., because there are no other extensions). At run time, the first matching case (from the bottom and backwards) is Iterable<int>, so that's the one that runs.

We don't have any soundness issues, because there is no assumption (based on the initial case) which is not satisfied by the run-time choice.

We could have a slightly more tricky situation like this one:

extension Foo on Iterable<num> {
  foo()=> e1;
} on List<num> {
  foo() => e2;
} on Iterable<int> {
  foo()  => e3;
}

main() {
   List<num> list = <int>[1, 2, 3];
   list.foo();
}

In this situation the statically matched case is List<num> and the dynamically matched case is Iterable<int>. You could say that we aren't running the code that we expected statically, but that should not be so much of a surprise in an OO language. However, it does mean that we have to require that foo on the 3rd case of Foo is a correct override of foo on the 2nd case of Foo.

Given that the dynamic type of the value o of an expression e will always be a subtype of the static type of e, we will never have the situation where the statically most specific matching extension case is later (nearer to the end) than the dynamically matching case, so the requirement "later case must be correct override of earlier case" is sufficient to ensure soundness of invocations.

eernstg commented 5 years ago

@tatumizer wrote:

Thanks for all explanations!

Thanks!

The user has to put the variant deemed "more specific" later than "less specific" in the sequence of "on" clauses.

Exactly, and this applies to all cases, even when we have cases which are unrelated classes (say, B and C which are not subtypes of each other), because the actual receiver could always be a subtype of both (class D implements B, C {...}), so it does matter whether one or the other is given priority. Developers can make that choice and write the code accordingly, but a compiler should not.

so we are able to specify more than one "root"

That's a compile-time error, the first one must be a supertype of them all. But the whole set of cases in an extension must form a tree according to the subtype relation, so the situation where one case is the "root" of a tree of cases further down is present everywhere (noting that the tree may be just the root, because we have reached a leaf of that subtree).

It might be a good idea to show syntactically that the first case is required to be a supertype of them all ... but since each case further down may have a similar subtree it doesn't seem like a very special property of the first case. It is already a very visible property to be at the front and have the name. ;-)

eernstg commented 5 years ago

B is a supertype of A - static error?

Yes (search the proposal for D1 <: D2).

all decisions (I think) are based on static types only

For static overloading? Yes.

But it is definitely not true for extension methods: The main point of the whole 'scoped class extension' mechanism is that the extension object is selected dynamically, such that you invoke an extension method which is suitable for the dynamic type of the receiver, not just the one that you know about statically. The static properties are needed in order to ensure that this will always be possible.

eernstg commented 5 years ago

I think they might just stay separate statically:

extension SumNum on Iterable<num> { ... }
on Iterable<int> { ... }
on Iterable<double> {
  // All three cases would have this implementation, just changing the type,
  // such that a compiler can generate optimal code for each type.
  double sum() {
    double result = 0.0;
    for (var d in this) result += d;
    return result;
  }
} 

extension SumBigInt on Iterable<BigInt> { ... }

main() {
  var xs = [1.2, 3.4, 5.6];
  double sum = xs.sum(); // No downcast, statically a `double`.
  List<num> ys = xs;
  double dsum = ys.sum(); // Downcast `num` to `double`, but runs the same code.

  // `BigInt` stuff is similar, but with no subtype relation to `num`.
}
eernstg commented 5 years ago

@tatumizer wrote:

allowing an extension to implement some interface

That's actually a property that we get for free with the proposal #309 where extension methods are generalized to be based on wrapper objects (which may then be compiled away as long as we stay within the subset which is similar to C# extension methods).

TzviPM commented 1 year ago

Can someone provide an example for where this var X syntax would be necessary other than a dynamic check like foo is Map<var X, var Y extends X>? It seems to me that the extension methods could work just as well without this added syntax:

extension ShortToString on Iterable<X> {
  // Similar to `toString`, but abbreviates the elements to an ellipsis.
  String shortToString => "(...)";
}
on List<X> {
  String shortToString => "[...]";
}
on Set<X> {
  String shortToString => "{...}";
}

The example above with using the static type information to change the implementation looks like it's more a matter of the Type variable being static, is that correct?

extension ShortToString on Iterable<X> {
  // Similar to `toString`, but abbreviates the elements to an ellipsis.
  String shortToString => "<$X>(...)";
}
on List<X> {
  String shortToString => "<$X>[...]";
}
on Set<X> {
  String shortToString => "<$X>{...}";
}

Or would this work dynamically as well? It just seems weird to me that we'd need the keyword var when X here is already a type variable. What am I missing?

eernstg commented 1 year ago

First a bit of notation: Recent discussions about features in this area have used final rather than var, so you'll see final X rather than var X in many issue comments. I'll use that here, too.

The fundamental notion of type patterns allows for a set of different mechanisms. One of the crucial distinctions is whether the semantics of a type pattern is based on the run-time values of type variables, or it is based on static information.

In the case where the semantics is based on the run-time value the mechanism amounts to a kind of existential open:

X id<X>(X x) => x;

const elements = [1, true, 'Hello', <int>{}, id, 2.5];

void f(Object o) {
  if (o is List<final X>) {
    for (var element in elements) {
      if (element is X) o.add(element); // (1)
    }
  }
}

void main() {
  f(<num>[]);
}

At (1), an element of type X is added to o. This operation is statically safe because o has been tested by the condition of the if-statement, and X was bound to the actual value of the type argument of the run-time type of o at List (the run-time type could be Int8List or some other type that has no type arguments or different type arguments, but we're asking about the type arguments "at List", which means that we're finding the superinterface of the run-time type which is a generic instance of List, and that type has a specific type argument, and that type argument is exactly the kind of elements that the list is able to contain).

This means that the loop will put exactly the objects into the list that are type correct elements of the list, and there will not be any run-time type errors.


However, the semantics which is based on the statically known types can also be useful in some cases:

Map<X, String> f(List<final X> list) {
  X x = list.first;
  return <X, String>{first: '$first'};
}

void main() {
  Int8List xs = ...;
  var map = f(xs); // `map` has type `Map<int, String>`.
}

In this case we assume the static semantics. This implies that list in the body of f is known to have a run-time type R such that R <: List<X>, and in particular, the type argument of R at List is a type S such that S <: X. This again ensures that it is a safe operation to create a binding with key first and value '$first' in a map of type Map<X, String> (OK, assuming that the list isn't empty ;-).

Note that it would not be safe to insert an element of type X into the given list, this might throw at run time because the object might not have the required type.

Another shortcoming with the static semantics is that there is no hope of getting a useful result from a type pattern matching operation where the existing static type does not embody the relevant information:

void main() {
  Object o = <int>[];
  if (o is List<final X>) {
    // `X == dynamic` or `X == Object?`, and we don't know anything new.
  }
  Iterable<num> iter = <int>[];
  if (iter is List<final X>) {
    // This should allow us to get `X == num`. But we might as well
    // have used `iter is List<num>` in the first place.
  }
}