Parametric type parameters?

dart-lang / language

Design of the Dart language

Other

2.64k stars 198 forks source link

Parametric type parameters? #3620

Open eernstg opened 6 months ago

eernstg commented 6 months ago

We could support a modifier on formal type parameters indicating that it is never needed at run time, that is, no operations are performed where the value of the actual type argument passed to that formal type parameter is needed in order to obtain the correct semantics. For example:

X id<X>(X x) => x; // `X` is parametric.
bool foo<X>(Object o) => o is X; // `X` is not parametric.

A parametric type parameter does not have to have a representation at run time. This implies that it could be useful from a performance perspective to mark a given type parameter as parametric, if this is possible.

It isn't always possible, of course: It is a compile-time error to use a parametric type parameter in a way that has a run-time semantics. So if X is parametric then we can't do e is X for any e, or e as List<X>, we can't evaluate a list literal like <X>[], can't do print(X), etc. It is also a compile-time error to use a parametric type parameter as an actual type argument (or as a subterm of an actual type argument) which is passed to a type parameter that isn't parametric.

// Strawman syntax, let's just use a verbose keyword.

X id<parametric X>(X x) => x; // OK.
bool foo<parametric X>(Object o) => o is X; // Compile-time error at `is X`.
void bar<parametric X>() => id<X>; // OK.

It is more complicated, but it should also be possible to mark a type parameter of a class, mixin, enum, or mixin class as parametric, and maintain the same strict checking that it is indeed not needed at run time.

For dynamic invocations, we would need to eliminate any actual arguments that are passed to parametric type parameters as part of the dynamic invocation semantics itself. I don't know if this is going to be difficult, but it seems reasonable to assume that it is at least possible.

Note that a traditional motivation for having parametric type parameters is that they are associated with the ability to prove certain statements about a program. See Theorems for free for more about this topic. Dart will probably not have any strict and formal provability properties like this, but it is certainly true that the actual type argument cannot influence the behavior of a function invocation if that type argument does not have a run-time representation at all.

If we introduce parametric type parameters then we could also use this to enhance other kinds of checking: We might want to lint every location where a non-transparent extension type is used as an actual type argument for a formal type parameter that isn't parametric.

Another indirect benefit could be that it might be possible to eliminate a larger set of type parameters during compilation if we introduce parametric type parameters: We can safely eliminate all the parametric type parameters during compilation, but this could make it provable that some other formal type parameters can also be eliminated, possibly using a fixed point iteration to get rid of as many type parameters as possible.

@dart-lang/language-team, WDYT?

@mraleph, @mkustermann, @sigmundch, @rakudrama, how important do you think it would be to enhance the support for elimination of type parameters at run time? Does it actually reduce the time/space consumption by large programs in a way that matters?

lrhn commented 6 months ago

Strawman syntax: <static T> is a parametric/static only type parameter.

A type is static-only if a parametric type variable occurs anywhere in it.

A static-only type cannot be used as:

a type literal.
a runtime type check (is, as, try/on, pattern match required type, or occurring covariantly in a parameter type, or contravariantly in a return type.
as a type argument to a non-parametric type parameter.

You can use a parametric type parameter in a type alias, an extension or extension type. Or a class. But if it's a class, then you cannot check an instance against a concrete instantiation. Let's assume we could make Future be parametric. After all, if all the code is statically sound, then the result value will always have the correct type. Then you cannot do if (f is Future<int>)..., not even if you already statically know the the value is a FutureOr<int>. We might need some language assistance, like allowing asking is Future, and getting promotion to Future<int>, or allowing is Future<int> when int is statically known to be an upper bound on the parametric type parameter's actual value (no runtime type checking needed).

If you need to know whether a type parameter is parametric at the call point, it means instance members and function values can be restricted.

A function type R Function<static X>(P) is a proper subtype of R Function<X>(P). The former can be called with all type arguments, the latter only with non-static-only types. That subtyping also applies to virtual overriding. A static type parameter of s function must still be accepted at runtime, for dynamic invocations, even if it's immediately ignored. And the function type of a generic function should remember whether it's your parameters are static, so a function requiring an actual type argument won't get cast into a position where it's omitted.

With so these restrictions, it's there anything useful left? Since List, Set and Map cannot be parametric (add is covariant), any class that needs to store it's values in a container still also can't be parametric. It's probably mainly going to be function type parameters, not class type parameters.

If we get variance annotations, we can perhaps require parametric class parameters to be covariant or invariant.

eernstg commented 5 months ago

Very nice analysis!

Strawman syntax: <static T>

I like that! It lines up well with phrases like "static type" or "static analysis", suggesting something which is only a compile-time phenomenon.

A function type ...

I suspect we'll have to exclude function types: "It is a compile-time error for a function type to have a return type or a formal parameter type which is a static-only type."

The reason would be, loosely, that the signature types of a function object are inherently reified. For example, what would we do in the case where a T Function() object o is assigned to a variable of type dynamic, Object, Function, or Object? Function() whereTis or contains a static type variable? A subsequent test like o is Object? Function() would presumably yield true, but how about o is int Function(), in a situation where the value of T would have been int if it were reified?

The idea is that it cannot be observed that there is no reification of a static type parameter, because no code which would be able to see the difference is allowed to compile and run, and I don't think we can do that if some functions have static type parameters in their signature.

With so these restrictions, it's there anything useful left?

It should be possible to express all the designs using static type parameters that other programming languages are expressing if they have sound typing and do not reify type parameters. (For Java, for example, we would then restrict ourselves to programs that do not give rise to a warning about an unchecked exception.) That's not nothing...

lrhn commented 5 months ago

If T is a parametric type variable, then you can't create a function value with type T Function(), just like you can't create a list instance of type List<T>. You can't reify the type into a value.

That doesn't mean you can't have a function type T Function(), it just means you can't easily create a function to assign to it, at least not one which doesn't return Never. But if you already have one, known to have that type, then whatever actual runtime type it has should work, parametrically, with the way the code was invoked.

Example:

R apply<static out R, static in P>(R Function(P) f, P arg) {
   R Function(P) function = f; // Valid assignment, no runtime check needed.
   return function(arg);
}
int i = apply<int, String>((String s) => s.length, "arglebargle");

This should work. Every parametric type is statically sound, and there are no runtime type checks needed. You can use the function type, you just can't check it.

The problem isn't with parametric types occurring in function types, it's creating a new object with a runtime type where the type occurs, because that's a reified non-parametric occurrence.

I guess a parametric type can't be used as an upper bound of another non-parametric type parameter.

S foo<static T, S extends T>(T value1, S value2) => (value1 is S) ? value1 : value2;

If we invoke this function dynamically, (foo as dynamic)<double, int>(1.5, 2), then the implicit S \<: T check is a dynamic check. But so is:

T id<static T>(T value) => value;
(value as dynamic)<int>("a");

and if that id function cannot be parametric, then nothing can. That's, like, the most parametric function ever!

Maybe we just have to keep parametric type parameters for a second, while invoking a function dynamically. A function like id above can completely ignore its type argument, if it can assume that the invocation was sound. If not, it does need to check that value is T. Then it can throw away the type parameter. So most likely, we're going to make dynamic invocations more special, and allow functions like the ones above, including the S extends T type parameter. If the call is sound, we don't need to know what T is.

munificent commented 5 months ago

We could support a modifier on formal type parameters indicating that it is never needed at run time, that is, no operations are performed where the value of the actual type argument passed to that formal type parameter is needed in order to obtain the correct semantics.

I think a key problem here is that "is never needed" doesn't specify by whom. Consider:

class C<T> {
  // Nothing...
}

It seems like this class could make T parametric because it definitely doesn't care about T's reified runtime value. But for all C's author knows, there could be code in the wild that really really wants to do:

test<T>(C<T> someC) {
  if (someC is C<int>) print('A C of int!');
}

I suspect in practice that it would be hard for library authors to know when it's a good idea to opt out of runtime reification of type parameters not just for themselves, but for all users of their API.

If you really do want to drop a type argument on the floor, you can kind of do so today like:

class C<T> {
  factory C() => _ErasedC();
}

class _ErasedC implements C<Never> {}

lrhn commented 5 months ago

Maybe it's safer to only allow parametric type parameters on functions, not classes.

If we have class C<static T>{}, then you cannot ask is C<int>, that's not allowed. If your clients want to do that, then this is not the class for them. There can't be any contravariant occurrence of T in the class member signatures, but downcasting is precisely what you want for a covariant type, so I can see why someone would want it.

A parametric type parameter is really a way of saying that the class itself doesn't know what this type is, just it's the same type everywhere it occurs, and that any values of that type are valid. So you can't ask the object about the type, because it doesn't know.

But you also can't use a parametrization using Never if the type actually does occur covariantly. A

class Choice<static T> { 
  final T _true, _false;
  const Choice(T ifTrue, T ifFalse)
      : _true = ifTrue, _false = ifFalse;
  T choose(bool choice) =>
    choice ? _true : _false;
}

has the type it has, and can be safely upcast, but it doesn't know the type itself, so it can never be downcast. That's not a fit for every class, but it might be for some. I don't see a technical need to disallow it, but I'll also admit I don't see the big use-case for allowing it either.

Since the status quo is to do nothing, we may need some use-cases. (Performance is not a use-case.)

sigmundch commented 5 months ago

cc @fishythefish

Very interesting discussion - hard to say about the value we'd get for runtimes.

Dart2js already has a pass that removes type-parameters if they are not used at runtime. So chances are that if you start using parametric type parameters, we were already optimizing those away.

That said, dart2js can be helpful to determine the value of this feature. We have evidence that there are some savings from this, but the evidence changed over time (in fact, @fishythefish was going to analyze whether we should remove this optimization). Also, I am not sure that today's benefit are from cases we can statically enforce. For example, dart2js does remove the T in C<T> in the example above if nowhere in the program there is a x is C<Y>. In small programs with -O3 (that removes covariant checks), we can even remove T from List :open_mouth:

If it's valuable to the discussion here, we could look at what are the type-parameters eliminated today by dart2js on some sample apps and see whether they would be iteresting candidates for parametric type parameters.

rubenferreira97 commented 5 months ago

removes type-parameters if they are not used at runtime.

I was going to ask if removing the type information that is not being queried in the source code couldn't be a compiler optimization step, or if it would be too slow/complex to analyze for the entire program. However, @sigmundch somewhat answered my question by stating that Dart already does that in some runtimes.

As a language user, I'm almost certain that people would favor having these optimizations handled by the compiler rather than burdening users with the addition of a new keyword, which may even impose restrictions on consumers.

mkustermann commented 5 months ago

I suspect we'll have to exclude function types: "It is a compile-time error for a function type to have a return type or a formal parameter type which is a static-only type."

If T is a parametric type variable, then you can't create a function value with type T Function(),

Does that imply generic functions with static type parameters (or non-generic functions that use enclosing static type parameters)

cannot be partially instantiated
cannot be torn off
cannot be local closures (or: only closures that are named local function declarations and only allow calling them but not using them as objects) ?

Also the compiler may perform this optimization in more places - so having having this in the language doesn't eliminate the compiler work.

fishythefish commented 5 months ago

As a user of ML-derived languages, I'm generally a fan of parametric polymorphism, but I'm also unsure how much actual benefit we'll get from making this a Dart language feature. (Of course, if we make all type parameters parametric, that's another story. 😁)

I started drafting a longer reply yesterday, then ended up deleting it when I realized it was basically a +1 to everything Siggi said. Having had some more time to digest this thread, I have some additional thoughts:

The "RTI-need" computation in dart2js (the optimization that Siggi mentioned) has to construct and traverse a graph to determine which type parameter declarations are reachable from operations which would reify them. Being able to skip static T would make this a little cheaper, but AFAIK it's not a very expensive optimization to begin with. The debate over removing it is more driven by its maintenance burden than by its resource consumption.
I don't think this language feature will tip the balance when we decide whether to remove RTI-need. List alone is common enough that being able to erase its type parameter is significant, and --omit-implicit-checks will affect other type parameters too. Not to mention that many type parameters will never be made parametric even if they could be - though a lint could help.
What is the default .runtimeType of a value whose type contains a parametric type parameter? We can't simply say "you can't do that" based on the static type because the value might be assigned to a variable of type Object/dynamic/Function/etc. (Side note: the introduction of even a single dynamic or Function is often what makes RTI-need quickly lose value as programs grow.) Similar question for any other operations that would require reification.
It might be worth revisiting this feature if/when generics aren't all covariant and parameter checks are less prevalent.

A more general concern I have is that this seems like a language feature that asks users to make the effort to opt in, but primarily benefits the compiler. I suspect that such a feature is unlikely to be used extensively unless we can also extol the benefit to users (and more tangibly than free theorems).

I suspect in practice that it would be hard for library authors to know when it's a good idea to opt out of runtime reification of type parameters not just for themselves, but for all users of their API.

I'm not so sure about this. We take reification, type tests, etc. for granted in Dart, but there are very successful languages that deliberately lack these features. IMO, code that requires these features often suffers from poor data representation, similar to code that needs to use dynamic.

I think what makes this a hard decision for library authors is the lack of tangible benefit in exchange for imposing the restriction (in contrast with, say, class modifiers).

Since the status quo is to do nothing, we may need some use-cases. (Performance is not a use-case.)

Big +1. Are most users just going to perceive this as extra typing for no clear benefit?