Meaning of annotated types

gomoripeti commented 5 years ago

Existing documentation:

The erlang documentation for types and specs does not mention annotated types at all http://erlang.org/doc/reference_manual/typespec.html
The Erlang documentation of the abstract format has the following to say (http://erlang.org/doc/apps/erts/absform.html#types)

If T is an annotated type A :: T_0, where A is a variable, then Rep(T) = {ann_type,LINE,[Rep(A),Rep(T_0)]}.
The Elixir documentation states (https://hexdocs.pm/elixir/typespecs.html#defining-a-specification)

You can also name your arguments in a typespec using arg_name :: arg_type syntax.

From all this I deduct that in an annotated type A :: T A is not a type variable or type, and :: does not says that "A is a subtype of T", but rather "A is a name or variable name, the value of which has the type T". It is common practice in Erlang and Elixir code that A is the same as the variable name in the head of the function implementation.

Although Gradualizer can give any meaning to existing type/spec constructs but I think the recent tendency was more to support the existing usage patterns and to be able to support as much existing code as possible. (A good example is the practical usage of what looks like bounded polymorphic types)

Therefore I suggest to treat the annotation of annotated types as shear documentation and ignore it as soon in the process as possible. @josefs @zuiderkwast @Zalastax @UlfNorell please share your views in this thread!

zuiderkwast commented 5 years ago

Another possibility would be to treat them as introducing a type alias, so that

-spec foo(A :: string(), A) -> ok.

means the same as

-spec foo(A, A) -> ok when A :: string().

I've never seen any code using them like this though, so I think just dropping them makes more sense.

What does dialyzer do with a spec such as the first one above? Is it even valid?

Zalastax commented 5 years ago

@zuiderkwast annotated types sure look a lot like type variables used in specs but I'm not certain they are the same. It makes sense for them to be the same so that type variables can be expressed in fun types, but fun types don't support conjunctions so we can't expect feature parity.

Type variables are weird so your spec actually says that foo accepts two strings that have to be the same...

(from typespec):

Type variables can be used in specifications to specify relations for the input and output arguments of a function. For example, the following specification defines the type of a polymorphic identity function:

  -spec id(X) -> X.

Notice that the above specification does not restrict the input and output type in any way. These types can be constrained by guard-like subtype constraints and provide bounded quantification:

  -spec id(X) -> X when X :: tuple().

Currently, the :: constraint (read as «is a subtype of») is the only guard constraint that can be used in the when part of a -spec attribute.

Note

The above function specification uses multiple occurrences of the same type variable. That provides more type information than the following function specification, where the type variables are missing:

  -spec id(tuple()) -> tuple().

The latter specification says that the function takes some tuple and returns some tuple. The specification with the X type variable specifies that the function takes a tuple and returns the same tuple.

However, it is up to the tools that process the specifications to choose whether to take this extra information into account or not.

zuiderkwast commented 5 years ago

Type variables are weird so your spec actually says that foo accepts two strings that have to be the same...

@Zalastax: I'm assuming we're talking about -spec foo(A, A) -> ok when A :: string(). here and not the annotated types variant. (Other readers, please skip reading if you're only interested in annotated types.)

Doesn't it mean "for every subtype A of string()"? For a type where singleton subtypes exist (e.g. A :: integer()), I agree that both values have to be exactly the same. For a A :: float(), we don't have any subtypes (other than none()), so isn't enough then that both are floats? For A :: string(), it should be enough that both strings contain the same characters. The most exact type of "foo" is [$f | $o], right?

Zalastax commented 5 years ago

@zuiderkwast you might be right, but in that case the note really tricked me up.

The behavior @gomoripeti proposes makes sense to me. The first spec by @zuiderkwast should be rejected since it uses A for two incompatible concepts.

It's really bad that :: is reused. I've seen a lot of whens which should have been annotated types instead. when is similar to type variable constraints in Haskell and therefore it also makes sense for :: to mean "is subtype of" in the context of when. Unfortunately, we have to deal with the fact that a lot of the code that exist out there is wrong somehow.

josefs commented 5 years ago

Thanks for starting this discussion @gomoripeti!

From all this I deduct that in an annotated type A :: T A is not a type variable or type, and :: does not says that "A is a subtype of T", but rather "A is a name or variable name, the value of which has the type T". It is common practice in Erlang and Elixir code that A is the same as the variable name in the head of the function implementation.

Yeah, I've always thought that the description of type annotations was very weird because reading it literally actually means that the type system supports dependent types. But rather useless dependent types. It is possible to introduce variables which range over elements of types but there is nothing useful we can do with them because we don't have any form of computation in the type system. So it is actually completely pointless to treat these variables as ranging over values. So I agree with your reading of the documentation @gomoripeti but I think the documentation is bad.

Instead of treating annotated types as useless dependent types I think it makes much more sense to put them to good use. As @zuiderkwast proposes, I think we should treat every type annotation as a constraint so that -spec foo(A :: string()) -> A. is treated the same as -spec foo(A) -> A when A :: string(). I think that is the most sensible thing to do. Note that if we have multiple annotations of the same variable the interpretation by @UlfNorell would kick in and we would replace the variable with the glb of the types.

I would personally be very surprised if I were to write A :: string() somewhere and the find that gradualizer would not complain if I instantiated A with the type integer(). That is, I don't think we should throw these constraints away.

@Zalastax, you say:

Unfortunately, we have to deal with the fact that a lot of the code that exist out there is wrong somehow.

Can you give an example of some code which would fail if we chose to interpret annotations as constraints?

I apologize, @gomoripeti, for barking at you in you recent PR about removing annotations. We already have a lot of code which removes the annotations and that is my fault for being lazy and not handling these cases appropriately. I should have communicated my intentions better beforehand.

Zalastax commented 5 years ago

@josefs the only place where it matters is for return types since subtype constraints are the same as just using the type directly (since we always have subtyping for parameters). @UlfNorell gave a good example somewhere. It was similar to this:

-spec const_int() -> N :: integer()

which doesn't have any correct implementations using constraints but it does under @gomoripeti's interpretation.

Zalastax commented 5 years ago

One of the refactorings I did at work recently was to replace when with annotated types, like so:

-spec bar(L, Family, Name) -> {reply, ok, L} when
      L      :: l_data(),
      Family :: string() | atom(),
      Name   :: string() | atom().
%% Replaced with
-spec bar(
        L      :: l_data(),
        Family :: string() | atom(),
        Name   :: string() | atom()) -> {reply, ok, L}.

When doing so I

Expected the specs to mean the same thing
thought that writing L in the return type means the same thing as writing l_data(), i.e. there is no polymorphism.

I think we should have a normalization pass for both annotated types and type variables bound by when. This pass should just replace any usage of the "type variable" with the specified type. E.g.

-spec apa(A :: integer()) -> A.
%% Same as
-spec apa(A) -> A when A :: integer().
%% Same as
-spec apa(integer()) -> integer().

Additionally, the pass should report an error if a "type variable" has been constrained multiple times, e.g. as in

-spec bepa(A :: integer()) -> A when A :: boolean().

zuiderkwast commented 5 years ago

Additionally, the pass should report an error if a "type variable" has been constrained multiple times

@Zalastax, constraining a type multiple times can be useful, e.g. when you want to refine a large union type. GLB is what I'd like to see for these. I've done it in gradualizer_bin:

-spec compute_type(ExprOrPat) -> gradualizer_type:abstract_type()
        when ExprOrPat :: {bin, _, _},
             ExprOrPat :: erl_parse:abstract_expr().

Zalastax commented 5 years ago

@zuiderkwast, that's neat! What does dialyzer do with a construction like that?

Even though the example is a neat trick, I don't want that to be a pattern we promote. Instead, I think we should work towards extending the type system with conjunctions.

zuiderkwast commented 5 years ago

FYI, compiler doesn't like this form:

-spec foo(A :: string()) -> A.
foo(X) -> X.

It's rejected with this error:

$ erlc src/foo.erl 
src/foo.erl:4: type variable 'A' is only used once (is unbound)

Zalastax commented 5 years ago

@zuiderkwast that's true 😮 Your example behaves the same:

-spec foo(A :: string(), A) -> ok.
%% type variable 'A' is only used once (is unbound)

This means that annotated types shouldn't even introduce an alias unless we get OTP to change.

gomoripeti commented 5 years ago

@zuiderkwast that's a good point. So the compiler keeps names in annotation and type variable names in different namespace or scope. In this case I think we should also handle them separately.

Also we can ask the question (before we intend the OTP team to eliminate that compiler warning): what can we express with annotated types that we cannot with constraints?

Zalastax commented 5 years ago

@gomoripeti I'd like to turn the question around: why do we need constraints when we have types? @zuiderkwast showed that constraints can emulate conjunctions but I'd prefer to have conjunctions as a core concept instead. For parameters, constraints and types behave the same. For returns, constraints are very problematic.

zuiderkwast commented 5 years ago

Let's keep the topics clear. We're talking about the annotation syntax and whether it should be equivalent to the when syntax or not.

@gomoripeti AFAIK, there's nothing that can be expressed with annotated types that cannot be expressed with when.

The semantics of the when part is another topic. @Zalastax, are you saying when shouldn't introduce constraints? We're already substituting the variables with the types as defined after when, so the constraints are already gone, aren't they?

And by conjunctions, do you mean intersection types? If we want to write when ExprOrPat :: {bin, _, _} & erl_parse:abstract_expr(), maybe it's better to use & rather than ; to avoid problems with operator precedence. (Use & for types and ; to separate the clauses in a spec. Parentheses can't be used among the types, can they?) The problem with extending the type syntax though is that the programs won't compile. Maybe we can do in gradualizer 2.0 or so, after convincing OTP to add & to the type language........

Zalastax commented 5 years ago

@zuiderkwast yes, the constraints are already gone in the current implementation. My point was that I don't want us to change the implementation to introduce constraints instead because I don't believe they are the right approach given that we only have subtype constraints. As a "counterpoint", you showed one very neat use of constraints which gives you intersection types (which I called conjunctions) by using multiple constraints. I'd like to add intersection types as a native concept instead.

I propose this action list:

Drop the annotation, they are equivalent to comments
Add documentation in OTP that type annotations are equivalent to comments
Convince OTP to add intersections to the type language
Convince OTP that when should not introduce constraints
Convince OTP that types and specs need to be aligned and able to express the exact same thing

gomoripeti commented 5 years ago

I think we have a consensus about annotated types (being comments or similar)

...so I dare to diverge from the topic

we can add intersection types internally first without adding it to the annotation langauge (convincing OTP team) that's good
"when should not introduce constraints" - what do you mean by this? in my eyes when is for nothing else but constraints, they are even called constraints in the AST. If you mean that it should not introduce intersection, because we will have explicit annotation for that, I think constraints can come from different locations (I mean gradualizer constraints, not spec constraints) and merging two of them can result in such implicit intersection/glb, so why not allow this situation in the spec annotation too?
I couldn't agree more with all other points

Zalastax commented 5 years ago

@gomoripeti great!

My opinion is that when introducing constraints is confusing for most. It's a misfeature which I'd like to fix. The value it brings is triumphed by the confusion it causes and the extra concepts only it uses. I want to change the documentation to instead express what when does to the arguments type. Doing so would allow us to fix the weirdness of subtype constraints and return types. It would also allow us to say e.g. when A :: ty1(), A :: ty2() is an alias for when A :: ty1() & ty2().

The alternative is to push for extending the constraints language to be able to express more things and letting fun types express constraints too.

zuiderkwast commented 5 years ago

I think we have a consensus about annotated types (being comments or similar)

So, does this mean we can accept PR #178, which is where this discussion started? @josefs

zuiderkwast commented 4 years ago

Revisiting this. @josefs, do you still think that

-spec foo(A :: string()) -> A.

should be treated the same as

-spec foo(A) -> A when A :: string().

even though the first one is rejected by the compiler with the following error message:

src/foo.erl:4: type variable 'A' is only used once (is unbound)

When I discovered that the compiler rejects this, I completely changed my mind about these annotations. I think they should be simply discarded. They are documentation, nothing more.

The current Gradualizer behaviour for annotated types is quite useless, since it reports a lot of type errors which are not actual type errors.

josefs commented 4 years ago

@zuiderkwast, that's a very compelling argument. Crap! I hate it when others make good arguments and I have to change my mind! </sarcasm>

Alright, let's just go ahead and remove the annotations.

@Zalastax, do you want to keep this issue to track you action list or can we close it?

Zalastax commented 4 years ago

I hereby declare that we have resolved the meaning of annotated types.

zuiderkwast commented 4 years ago

Merged #228.

Remaining work is to remove unnecessary clauses matching ann_type in various places.

josefs commented 4 years ago

I've filed #247 to track removing the unnecessary clauses.

josefs / Gradualizer

Meaning of annotated types #180

Note