ollef / Bidirectional

Haskell implementation of Dunfield and Krishnaswami's "Complete and easy bidirectional typechecking for higher-rank polymorphism"

How to extend with type applications #5

Open atennapel opened 6 years ago

atennapel commented 6 years ago

Do you have any idea how to extend the system with type applications? I don't know how to extend the subtyping relation, because in general I don't know whether to compare the type applications covariantly or contravariantly. Is the only solution to fall back to unification (unifying foralls using alpha equality)?

ollef commented 6 years ago

Do you mean type applications on the type level as in List Int (and not https://ghc.haskell.org/trac/ghc/wiki/TypeApplication)? As far as I know the common approach is to fall back to unification for exactly the reason you state. :)

ollef commented 6 years ago

It would be interesting to cook up some scheme to get more programs to typecheck, e.g. treating the arguments of Functor and Contravariant instances covariantly and contravariantly, respectively, but I'd only do that if I knew I needed it.

atennapel commented 6 years ago

Yeah, that's what I thought. I wonder if you could look at how type variables appear in the ADT and tag the type variables with their variance (invariant, covariant, contravariant, or bivariant). Then you would know how the subtyping rule works for that type. Higher-kinded type variables make this harder or impossible, though, since you don't know the variance until the HKT variable is instantiated...
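A rough Haskell sketch of that tagging (all names here are made up; a real version would look up the variance of known head constructors instead of being conservative for every application):

data Variance = Bivariant | Covariant | Contravariant | Invariant
  deriving (Eq, Show)

data Ty = TVar String | TCon String | TArr Ty Ty | TApp Ty Ty

-- Combine two occurrences of the same variable.
joinV :: Variance -> Variance -> Variance
joinV Bivariant v = v
joinV v Bivariant = v
joinV v w
  | v == w    = v
  | otherwise = Invariant

-- Going to the left of an arrow flips the variance.
flipV :: Variance -> Variance
flipV Covariant     = Contravariant
flipV Contravariant = Covariant
flipV v             = v

-- Variance of parameter a in one constructor field.  The TApp case is
-- where higher-kinded variables bite: without knowing the head's
-- variance we can only be conservative.
varianceOf :: String -> Ty -> Variance
varianceOf a (TVar b) | a == b = Covariant
varianceOf _ (TVar _)          = Bivariant
varianceOf _ (TCon _)          = Bivariant
varianceOf a (TArr dom cod)    =
  joinV (flipV (varianceOf a dom)) (varianceOf a cod)
varianceOf a (TApp f x)        =
  joinV (varianceOf a f)
        (case varianceOf a x of
           Bivariant -> Bivariant
           _         -> Invariant)

For a field like a -> Int this yields Contravariant, and for a field like f a with f a higher-kinded variable it collapses to Invariant, which is exactly the problem above.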

ollef commented 6 years ago

Yeah, that might work. :)

If you haven't already it might be useful to look into how GHC does the Coercible class with "roles" and role inference for type parameters. It's not done for exactly the same purpose, but I believe it's quite similar to what you want to do.

ollef commented 6 years ago

For higher-kinded types, e.g. f a where f is unknown, I think the best you can do is to treat it invariantly (since f might be instantiated to both co- and contravariant types) unless you have more info about f, e.g. due to constraints on f.

atennapel commented 6 years ago

Thanks, I will look into Coercible. Does that mean that in the case of f a <: f b (where we have no information about f) you do both a <: b and b <: a? Or do you use unification for a and b?

ollef commented 6 years ago

I mean unification.

Tangent:

I quite like the approach of elaborating to a core language, e.g. System F_omega (like GHC does), during type-checking, because that'll guide the implementation towards something where the types work out.

With elaboration, subtyping a <: b returns a function s of type a -> b, which you can apply during typechecking to your term of type a to get the b that you wanted. In the case of f a <: f b where f is unknown, you don't know how to apply s : a -> b to f a to get f b. If you know that f is a Functor, it could however be fmap s (or, similarly but more efficiently, coerce, which exploits facts about the runtime representation of your terms to ensure that the coercion is a no-op at runtime).

With elaboration, unification doesn't return a coercion function, but has the side-effect of instantiating unification variables to make a and b equal.

For me this makes it clearer when subtyping applies and when you have to fall back to unification, for instance.
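A minimal sketch of what that elaboration might look like (the names TC, Ty, Term, subtype, and unify are made up for illustration, not from this repo):

-- Hypothetical elaborating checker: subtyping returns a coercion,
-- unification returns nothing.
type TC a = Either String a

data Ty
  = TVar String
  | Ty :-> Ty
  | TApp Ty Ty
  | Forall String Ty
  deriving (Eq, Show)

data Term = Var String | Lam String Term | App Term Term

-- subtype a b returns a coercion: something that turns a term of
-- type a into a term of type b.
subtype :: Ty -> Ty -> TC (Term -> Term)
subtype a b | a == b = pure id             -- identity coercion
subtype (a1 :-> a2) (b1 :-> b2) = do
  argCo <- subtype b1 a1                   -- contravariant in the argument
  resCo <- subtype a2 b2                   -- covariant in the result
  -- a real version would pick a fresh variable name here
  pure (\f -> Lam "x" (resCo (App f (argCo (Var "x")))))
subtype (TApp f a) (TApp g b) = do
  unify f g                                -- unknown head: fall back to
  unify a b                                --   unification, no coercion
  pure id
subtype a b = unify a b >> pure id         -- everything else: unify

-- Unification returns no coercion; its only job (in a real
-- implementation) is to instantiate unification variables so that the
-- two types become equal.  Stubbed out here.
unify :: Ty -> Ty -> TC ()
unify _ _ = pure ()

The f a <: f b case just calls unify and returns the identity coercion; a Functor constraint on f would instead let it return fmap applied to the coercion, as described above.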

atennapel commented 6 years ago

Thanks! I hadn't thought of it like that, that does make it clearer for me as well :)

andrewthad commented 4 years ago

Just dumping some thoughts here since this is the only real discussion of this question that I can find. What about sidestepping the variance issue entirely by only allowing type constructors that are covariant in all arguments? This would mean that the function arrow would not be considered a type constructor like it is in Haskell.

data Foo a = Foo a a          -- Allowed
data Foo a = Foo              -- Allowed
data Foo a = Foo (a -> Int)   -- Not allowed

With this restriction in place, is there a natural and easy extension of the system described in the paper? It would be nice to have even this important subset of type constructors without jumping through all the extra hoops that the follow-up paper, Sound and Complete Bidirectional Typechecking for Higher-Rank Polymorphism with Existentials and Indexed Types, goes through. What would be needed? I think that Figure 2 of the original paper would need kinds (maybe polymorphic kinds can be left out) as well as type applications:

Kinds K ::= * | * ~> * // This is new
Types A,B,C ::= 1 | α | ∀α. A | A→B
Monotypes and Constructors τ,σ ::= 1 | α | τ→σ | τ σ // Application is new, no quantifiers inside application
Contexts Ψ ::= · | Ψ,α | Ψ,x:A

I'm not sure if that should be added to types or to monotypes. A new subtyping rule (in Figure 3) is needed:

Ψ ⊢ A1 ≤ B1    Ψ ⊢ A2 ≤ B2
---------------------------
     Ψ ⊢ A1 A2 ≤ B1 B2

Maybe Figure 4 (the declarative typing system) doesn't change at all? I'm stuck now; I need to read more, and I might come back to this thought later.

atennapel commented 4 years ago

I think Agda only allows covariant type constructors. In Haskell, though, non-covariant datatypes are used (for example Fix), so the restriction might be very limiting. I'm pretty sure just adding that new subtyping rule you showed is enough.

Personally, I don't like the use of subsumption in "Sound and Complete..." anymore. Haskell is also moving away from it (with Quick Look polymorphism) and making function arrows invariant (instead of using "deep skolemization"). This makes type inference simpler, and I think it also makes inference of impredicative polymorphism easier. I prefer to have some support for impredicative polymorphism over the subtyping from "Sound and Complete...". The Quick Look paper also analyzed the impact of moving to invariant arrows, and it turns out that only about 3% of the 2332 packages they checked had issues, most of them solved by some eta-abstraction.

Anyway, sorry for going off-topic. I just think that subsumption may not be worth the trouble. Sticking to just unification probably works just fine.

andrewthad commented 4 years ago

> I'm pretty sure just adding that new subtyping rule you showed is enough.

I don't think that just the subtyping rule is enough, because I didn't add anything that actually performs kind checking. Somewhere, there has to be something that checks, when the application A B shows up, that A has kind k -> * and B has kind k. That probably has to show up somewhere in the declarative typing system, and then I think it would impact the algorithmic system as well. But I don't see exactly how, and it seems like type checking and kind checking would need to be interleaved.
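As a rough illustration of the kind of check that would have to be interleaved (a sketch with made-up names, using general arrow kinds rather than just * ~> *):

data Kind = Star | KArr Kind Kind
  deriving (Eq, Show)

data Ty
  = TUnit               -- 1
  | TVar String         -- α
  | TArr Ty Ty          -- A → B
  | TForall String Ty   -- ∀α. A
  | TApp Ty Ty          -- A B, the new form

-- The context maps type variables to their kinds.
type Ctx = [(String, Kind)]

inferKind :: Ctx -> Ty -> Either String Kind
inferKind _   TUnit         = Right Star
inferKind ctx (TVar a)      =
  maybe (Left ("unbound type variable " ++ a)) Right (lookup a ctx)
inferKind ctx (TArr a b)    = do
  checkKind ctx a Star
  checkKind ctx b Star
  Right Star
inferKind ctx (TForall a t) = do
  checkKind ((a, Star) : ctx) t Star   -- assuming quantification over kind * only
  Right Star
inferKind ctx (TApp a b)    = do
  ka <- inferKind ctx a
  case ka of
    KArr k k' -> checkKind ctx b k >> Right k'   -- A : k ~> k', B : k
    Star      -> Left "expected a type constructor"

checkKind :: Ctx -> Ty -> Kind -> Either String ()
checkKind ctx t k = do
  k' <- inferKind ctx t
  if k == k' then Right () else Left "kind mismatch"

The TApp case is exactly the "A has kind k -> * and B has kind k" check described above.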

Concerning the variance of function arrows, even Dunfield and Krishnaswami kind of gave up on the clever treatment of arrows in the next paper:

In logical terms, functions and guarded types are negative; sums, products and assertion types are positive. We could potentially operate on these types in the negative and positive subtype relations, respectively. Leaving out (for example) function subtyping means that we will have to do some η-expansions to get programs to typecheck; we omit these rules to keep the implementation complexity low.

Are there other papers you would recommend about bidirectional checking for type-scheme-flavored System F using unification? I've done type checking of vanilla System F before, but not type inference (where it seems like everyone uses type schemes instead of type assignment).

atennapel commented 4 years ago

Oh sorry, yes, you are right. You need to perform kind checking. There already is a well-formedness check for types in the system, so that check could also do kind checking. For example, in the type-annotation rule:

C |- t WF (check that t is well-formed and well-kinded)
C |- e <= t
---
C |- (e : t) => t

You can assume that any types in the context are well-kinded, and checking against a type is also fine. Only in synthesis rules where the user supplied a type do you have to check well-formedness and well-kindedness.

I think if you use "Sound and Complete..." and swap subsumption for unification, you'll have a nice algorithm. Unification is a bit simpler: you never instantiate foralls, only skolemize them. An added bonus is that the function arrow can just be another type constructor (although for efficiency reasons you might not want to do that). I'm not really aware of papers on this, sorry.
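A compressed sketch of such a unification, with made-up names, just to show the forall and arrow cases (a real version also needs occurs checks and the scoping discipline discussed below):

import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (StateT, get, put, modify)

data Ty
  = TUnit
  | TVar String            -- rigid (skolem) variable
  | TMeta Int              -- unification variable
  | TArr Ty Ty
  | TForall String Ty
  deriving (Eq, Show)

-- State: a fresh-name counter and the meta solutions found so far.
type TC = StateT (Int, [(Int, Ty)]) (Either String)

freshName :: TC String
freshName = do
  (n, s) <- get
  put (n + 1, s)
  pure ("sk" ++ show n)

substTy :: String -> Ty -> Ty -> Ty
substTy a t = go
  where
    go (TVar b) | a == b = t
    go (TArr x y)        = TArr (go x) (go y)
    go (TForall b body)
      | a == b           = TForall b body
      | otherwise        = TForall b (go body)
    go other             = other

unify :: Ty -> Ty -> TC ()
unify (TForall a t1) (TForall b t2) = do
  sk <- freshName                     -- skolemize both sides with the
  unify (substTy a (TVar sk) t1)      --   same fresh rigid variable;
        (substTy b (TVar sk) t2)      --   never instantiate a forall
unify (TArr a1 b1) (TArr a2 b2) =
  unify a1 a2 >> unify b1 b2          -- the arrow is just another constructor
unify (TMeta m) t = solveMeta m t
unify t (TMeta m) = solveMeta m t
unify t1 t2
  | t1 == t2  = pure ()
  | otherwise = lift (Left ("cannot unify " ++ show t1 ++ " with " ++ show t2))

-- A real implementation would also do an occurs check, consult any
-- existing solution for m, and apply the scope check described below.
solveMeta :: Int -> Ty -> TC ()
solveMeta m t = modify (\(n, s) -> (n, (m, t) : s))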

One way to simplify the algorithm of "Sound and Complete..." (implementation-wise) is to have meta-variables carry around a list of the type variables that are in scope. Then you are only allowed to unify a meta-variable with a type (skolem) variable if that variable appears in the list of the meta-variable. Now you don't need to use the context anymore to correctly scope the meta and type variables. Though I'm not sure if this causes other issues. This is similar to how it is done for higher-order unification in dependently typed languages. The context can then become immutable, and you just store the meta-variables in a mutable map. I have an old implementation of this idea in JavaScript here: https://github.com/atennapel/bidirectional.js/blob/master/scoped.fc.js

For example:

|- \x. (\y. y) x <= forall t. t -> t (introduce skolem for t)
t |- \x. (\y. y) x <= t -> t (checking lambda against function type)
t, x : t |- (\y. y) x <= t (synthesize function)

t, x : t |- \y. y => ?a[t] -> ?a[t] (introduce meta ?a, which may only refer to t)
t, x : t, y : ?a[t] |- y => ?a[t]

t, x : t |- (?a[t] -> ?a[t]) @ x => t (synthapp rule)
t, x : t |- x <= ?a[t]

t ~ ?a[t] (unify)
?a[t] := t (it's allowed to assign t to ?a, because t is in [t])
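In code, the essential check behind that last step might look roughly like this (a sketch with a made-up representation; the linked scoped.fc.js is the actual reference):

import qualified Data.Map as Map

data Ty = TSkolem String | TMeta Int | TArr Ty Ty

-- Each unsolved meta remembers the skolems in scope when it was created.
data MetaEntry = Unsolved [String] | Solved Ty

type Metas = Map.Map Int MetaEntry

freeSkolems :: Ty -> [String]
freeSkolems (TSkolem a) = [a]
freeSkolems (TMeta _)   = []
freeSkolems (TArr a b)  = freeSkolems a ++ freeSkolems b

-- ?m := t is only allowed if every skolem free in t was in scope for ?m;
-- the ordered context is then no longer needed for scoping.
-- (A real version would also do an occurs check.)
solve :: Int -> Ty -> Metas -> Either String Metas
solve m t metas =
  case Map.lookup m metas of
    Just (Unsolved scopeVars)
      | all (`elem` scopeVars) (freeSkolems t) ->
          Right (Map.insert m (Solved t) metas)
      | otherwise -> Left "skolem escapes the scope of the metavariable"
    _ -> Left "metavariable missing or already solved"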

That's just an idea I had to make the implementation of type inference for System F simpler. I kind of moved on to dependently typed languages now.

Anyway, I'm rambling a bit now. Feel free to ask me whatever, although I'm not an expert on the matter.

andrewthad commented 4 years ago

This is very much a non sequitur, but I started thinking about this again yesterday and started wondering about how the covariant-type-constructors-only world implied by a rule like

Ψ ⊢ A1 ≤ B1    Ψ ⊢ A2 ≤ B2
---------------------------  TYCONAPP
     Ψ ⊢ A1 A2 ≤ B1 B2

would interact with mutable references. Normally, in systems with subtyping, mutable references have to treat their type parameter invariantly. For example, with some kind of refinement typing, if you had:

type Ref : * -> *
newRef : a -> IO (Ref a)        -- monadic interface to effectful computations, like in Haskell
myRef : Ref {i : Int | i < 5}

You cannot use myRef as a reference of type Ref {i : Int | i < 4} or of type Ref {i : Int | i < 6} because it leads to unsoundness. However, in Sound and Complete + TYCONAPP (defined above), it seems like Ref could treat its argument covariantly, and everything would be sound. You end up with these subtyping relationships:

∀a. Ref a ≤ Ref Int
∀a. Ref a ≤ Ref Char

That seems like it would imply unsoundness, since the same reference could be used as both an integer reference and a character reference (begetting unsafeCoerce), but we may be saved by the fact that ∀a. Ref a is not actually inhabited. In GHC lore, there's an old trick where unsafeCoerce can be implemented using undefined and unsafePerformIO. (In a strict language, I don't think this trick would even work.) I think this is related.

Anyway, the question here is "is it sound to treat the type argument of Ref covariantly?" I think it is, but I'm curious to know if unsoundness arises from doing this.
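For reference, the old trick is (I believe) the "polymorphic reference" one; a sketch of it, relying on unsafePerformIO to conjure an inhabitant of ∀a. IORef a:

import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- A "polymorphic reference": unsafePerformIO lets us pretend that
-- forall a. IORef a is inhabited.
badRef :: IORef a
badRef = unsafePerformIO (newIORef undefined)
{-# NOINLINE badRef #-}  -- needed so the reference is actually shared

-- Writing at one type and reading at another gives unsafeCoerce.
unsafeCast :: a -> b
unsafeCast x = unsafePerformIO (writeIORef badRef x >> readIORef badRef)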

andrewthad commented 3 years ago

I'm writing this down mostly for my own sake so that I don't forget it. From an earlier comment:

One way to simplify the algorithm of "Sound and Complete..." (implementation-wise) is to have meta-variables carry around a list of type variables that are in scope. Then you are only allowed to unify a meta-variable with a type (skolem) variable if that variable appears in the list of the meta-variable. Now you don't need to use the context anymore to correctly scope the meta and type variables. Though I'm not sure if this causes other issues.

This is clever, although it doesn't totally work, and only after trying something similar do I now fully appreciate it. The situations where it works are things like:

Γ, ^α, ^β, x : ^α  |-  e ⇐ ^β  -|  ∆, x : ^α, Θ
------------------------------------------------  →I⇒
        Γ  |-  λx.e ⇒ ^α → ^β  -|  ∆

The paper doesn't really list any restrictions on the relationship between the input context Γ and the output context ∆. But what is the relationship? The big thing is that unsolved existentials may have been solved. But what about its general shape? Should that stay the same or not? The only shape changes could come from articulation (^αApp), which introduces more variables. Articulation makes this scheme incompatible with an implementation in which metavariables carry around a list of what they could be solved to: more metavariables might get added to the context by articulation, and they would be missing from the lists of everything that would have been to their right in the ordered context.

One way to remedy this is to make functions not first class. If you cannot instantiate a metavariable with a function, then there is no ^αApp rule. Similarly, InstRArr and InstLArr would go away, and I think those are the only rules that cause contexts to be manipulated in the middle rather than at the tail.
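To make the incompatibility concrete, here is a rough sketch (made-up representation) of what articulation does to the ordered context:

data Ty = Meta String | Ty :-> Ty
  deriving Show

data Entry
  = Unsolved String        -- ^α
  | Solved String Ty       -- ^α = τ
  deriving Show

type Ctx = [Entry]         -- ordered, oldest first

-- Solving ^α as a function inserts fresh metas in the *middle* of the
-- ordered context, at ^α's position.
articulate :: String -> String -> String -> Ctx -> Ctx
articulate alpha a1 a2 = concatMap step
  where
    step (Unsolved b)
      | b == alpha =
          [ Unsolved a1
          , Unsolved a2
          , Solved alpha (Meta a1 :-> Meta a2) ]
    step e = [e]

-- articulate "a" "a1" "a2" [Unsolved "a", Unsolved "b"]
--   = [Unsolved "a1", Unsolved "a2", Solved "a" (Meta "a1" :-> Meta "a2"), Unsolved "b"]
-- The fresh ^α1/^α2 land to the left of ^α, but anything that was to the
-- right of ^α never recorded them in its scope list.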