swarm-game / swarm

Resource gathering + programming game
Other
838 stars 52 forks source link

Make type variables scoped #2168

Closed xsebek closed 3 weeks ago

xsebek commented 1 month ago

Is your feature request related to a problem? Please describe.

When writing a complex function, it is helpful to use local type annotations. But this breaks when I use local variables:

def id : forall x. x -> x = \x.
 let f : x = x in f  // From context, expected `x` to have type `s2`, but it actually has type `s1`
end

This is the same thing as Haskell without ScopedTypeVariables:

idx :: forall x. x -> x
idx x = let y :: x
            y = x
        in y  -- Couldn't match expected type ‘x1’ with actual type ‘x’ ...

Describe the solution you'd like

I would like to have scoped type variables by default.

Describe alternatives you've considered

It's possible that this would cause problems for later definitions if they were technically inside the scope. An alternative would be using a type annotation inside the body, like the Haskell example:

f :: [a] -> [a]
f (xs :: [aa]) = xs ++ ys
  where
    ys :: [aa]
    ys = reverse xs

Additional context

I discovered this issue while writing #2161. The only workaround is to not specify the local definition signature or pass parameters with the local types.

byorgey commented 1 month ago

This seems like a reasonable idea. Off the top of my head, to implement this I guess we should do something like:

  1. Keep track of an additional context while typechecking, to remember which type variable names are in scope. (Edited to add: I think this also needs to map in-scope type variable names to the skolem variables that were generated for them.)
  2. Stick type variables into this context when checking a let expression (which includes def as syntax sugar) with a type annotation.
    • Note that in Haskell, ScopedTypeVariables is only triggered with an explicit use of forall; if we wanted that it would require reworking our AST + parser a bit, since at the moment parsing a -> a and forall a. a -> a both yield exactly the same AST.
  3. When checking anything else with a polymorphic type signature, don't implicitly generalize over any type variables which are already in scope. (But explicit generalization, as in forall a. a -> a would shadow in-scope type variables.)

In Haskell, of course (1) ScopedTypeVariables is off by default, (2) even when turned on it is only triggered by use of explicit forall, and (3) these days I get the sense that it is generally looked down upon in favor of more newfangled approaches with explicit type application and so on. However, I don't know of any good reasons for (1) and (2), and (3) doesn't really apply in our case since we don't compile down to some kind of polymorphic lambda calculus with explicit type passing like Haskell does. In addition, in our case type signatures always have to go right next to definitions; in Haskell they can be far apart which complicates the issue.

See also #2102 which wouldn't solve the issue but is relevant here.

xsebek commented 1 month ago

The problem I run into is the following limitation of records:

In the record projection rn.x, can't infer whether the LHS has a record type. Try adding a type annotation.

So I have to provide a type annotation, but I also can't write it:

tydef RBTree k v = rec b. Unit + [c: Bool, k: k, v: v, l: b, r: b] end
tydef RBNode k v = rec n. [c: Bool, k: k, v: v, l: Unit + n, r: Unit + n] end

def balance : RBNode k v -> RBTree k v = \t.
  let baseCase /* : RBTree k v */ = inr t in
  let balanceRR /* : RBNode k v -> RBTree k v */ = \rn.
    if true {baseCase} {undefined rn.x}  // can't infer r.x
  in undefined
end

@byorgey the only workaround I found was passing the t original parameter around so that the types would match. This effectively prevents me from using any local variables that are records.

xsebek commented 1 month ago

@byorgey btw. ScopedTypeVariables is on by default in GHC2024 and GHC2021. 🙂

I think it is a very beginner-friendly extension that "does what I meant". 😄

byorgey commented 1 month ago

The problem I run into is the following limitation of records:

That is a very good motivating use case!

@byorgey btw. ScopedTypeVariables is on by default in GHC2024 and GHC2021. 🙂

I didn't know that! That makes me feel better, that there aren't lurking downsides we're not thinking of.

byorgey commented 1 month ago

So here's what I'm thinking in terms of a detailed design. Comments welcome!

So, for example, this will work:

def balance : RBNode k v -> RBTree k v = \t.
  let baseCase : RBTree k v = inr t in
  let balanceRR : RBNode k v -> RBTree k v = ...
  in undefined
end

The type signature on balance will bring k and v into scope in its body (even though there is not an explicit forall written); then the k and v in the type signatures on baseCase and balanceRR will both refer to the k and v bound in the type of balance. In particular, this means that inr t is a valid definition for baseCase since their types match. On the other hand, if one were to write

def balance : RBNode k v -> RBTree k v = \t.
  let baseCase : forall k v. RBTree k v = inr t in
  undefined
end

then the k and v in the type of baseCase would be new type variables which are different than the k and v in the type of balance (which would now be shadowed within the definition of baseCase). This would now be a type error (which is what happens currently), since baseCase is required to be an RBTree k v for any types k and v, but inr t only has the specific type that was given as an input to balance.

Finally, if one wrote

def balance : RBNode k v -> RBTree k v = \t.
  let baseCase : forall k. RBTree k v = inr t in
  undefined
end

then only k would be a new type variable, whereas v would refer to the v from the type of balance. (This would still be a type error, of course.)

A plan for actually implementing this:

xsebek commented 1 month ago

@byorgey Thanks for the detailed breakdown. 👍

Adding type annotations for local functions is so rare that I would not even mind having to always write explicit forall. 🙂

byorgey commented 1 month ago

I would not even mind having to always write explicit forall.

Fair enough. Requiring explicit forall wouldn't really make the implementation any easier, though (actually it might be the opposite), so unless you know of good reasons that we should require explicit forall like Haskell does, let's not worry about it.

I have an implementation which is almost working, just a bug or two to track down yet...