josefs / Gradualizer

A Gradual type system for Erlang
MIT License
613 stars 35 forks source link

Bounded quantification #122

Open zuiderkwast opened 5 years ago

zuiderkwast commented 5 years ago

Picking up this topic from #93. I think we do already agree about these points:

But I think there are other cases where bounded quantification does make sense. I don't agree about this point:

[bounded quantification] is way too strict for variables occurring more than once

How so?

Would you really expect the function -spec foo([Y]) -> Y when Y :: number() to only type check when returning one of the numbers already in the list? No adding them up or multiplying them?

Yes, I do think it makes sense that foo/1 above must return one of the elements in the list. For many of the lists functions, this is what you want. Here are some examples (with only-once used variables unfolded):

-spec lists:any(fun((T) -> boolean()), [T]) -> boolean().
-spec lists:filter(fun((T) -> boolean()), [T]) -> [T].
-spec lists:map(fun((A) -> B), [A]) -> [B].
-spec lists:sort([T]) -> [T].

These often have when T :: term() in the spec, which I think shouldn't make a difference.

If we just unfold, we get [term()] or term() as the return type of many of these functions, which gives rise to a type error where e.g. integer() is expected. Here are examples of applying these functions:

-spec integer_sort(integer()) -> integer().
integer_sort(Xs) -> lists:sort(Xs).

-spec filter_even([integer()]) -> [integer()].
filter_even(Xs) -> lists:filter(fun (X) -> X rem 2 == 0 end, Xs).

I have a feeling that many of the remaining errors reported when gradualizing the gradualizer are from these, the errors involving term(). (Some specs are just wrong though, e.g. that of lists:foldl/3 which always returns term(). These we can override.)

So, I suggest we go for bounded quantification where it makes sense, in a pragmatic way. What do you think?

UlfNorell commented 5 years ago

All of these examples are truly polymorphic, i.e. there are no bounds on the type variables. What would make sense is to drop T :: term() annotations and treat T as a polymorphic variable.

zuiderkwast commented 5 years ago

True, dropping T :: term() is another approach. Then, I can't see any real world examples of bounded polymorphic variables. I can only come up with more unlikely ones like this one:

%% Sorts the points by their distance to origo
-spec numeric_pair_sort([T]) -> [T] when T :: {number(), number()}.

Is truly polymorphic variables is easier to handle?

How do you implement polymorphism? Do we simply store type variable bindings within the Env which is passed around?

UlfNorell commented 5 years ago

I don't think bounded polymorphism is much more difficult to implement, but I do think it's not what people want/expect.

To implement polymorphism we need to distinguish rigid and flexible type variables.

Handling rigid variables is straightforward, but I suspect we need to do some serious refactoring to deal with flexible variables. We're relying on being able to inspect types in a lot of places, so we either need a mechanism to postpone checks until we know the value of a flexible variable, or turn everything into constraints (which I suppose is one way of postponing things).

josefs commented 5 years ago

A few thoughts on handling flexible variables: I have already tried to pave the way for dealing with flexible variables.

The difficult part is to solve the constraints to eliminate the flexible variables. This has to be done after we've traversed the whole function.

Solving the constraints would be simple if it wasn't for gradual types. Normally subtyping is transitive and if we have the case that T1 <: A <: T2 (where T1 and T2 are concrete types and A is a flexible variable) then we can pick A to be a type between T1 and T2. (There is sometimes a choice as to what type to pick but it's a solvable problem that's not relevant to the current discussion.)

In the presence of the any() then subtyping is not transitive anymore (because we don't want integer() <: any() and any() <: boolean() to imply integer() <: boolean()). So we cannot use transitivity anymore to solve constraints. That's a real bummer. What to do? I've been thinking that we should just go ahead and use transitivity anyway despite it not being correct. What would be the consequence of that? It would mean that we sometimes generate an error message when the code is actually well-typed. That is somewhat suboptimal but at least it is a way forward and I don't have any other suggestion how to do it. If you guys have a better idea I'm all ears.

zuiderkwast commented 5 years ago

If we make sure we don't have any constraints with any(), we can use transitivity, right?

Either we never store any constraints with any() or (equivalently I think) we convert all any() to term() within the upper bounds and to none() within the lower bounds. What do you think?

zuiderkwast commented 1 year ago

Hello old thread. OTP functions where bounded type variables could make sense are lists:keydelete/3 and friends, operating on lists of tuples. Don't we want a spec like this for it?

-spec keydelete(Key, N, [A]) -> [A] when A :: tuple().

It came up in #490. @UlfNorell @josefs WDYT?

erszcz commented 1 year ago

Don't we want a spec like this for it?

-spec keydelete(Key, N, [A]) -> [A] when A :: tuple().

Which would specifically make sense as #r{} :: tuple() holds for any record #r{}.

Another mechanism which could "preserve" the type of list elements from input to output would be an intersection A & tuple(), also found in literature. Given there's no Erlang spec syntax for intersections, I think this interpretation of A :: tuple() could also be practical 🤔

zuiderkwast commented 1 year ago

A & tuple() :+1: :+1: :+1: