Closed xsebek closed 3 weeks ago
This seems like a reasonable idea. Off the top of my head, to implement this I guess we should do something like:
let
expression (which includes def
as syntax sugar) with a type annotation.
ScopedTypeVariables
is only triggered with an explicit use of forall
; if we wanted that it would require reworking our AST + parser a bit, since at the moment parsing a -> a
and forall a. a -> a
both yield exactly the same AST.forall a. a -> a
would shadow in-scope type variables.)In Haskell, of course (1) ScopedTypeVariables
is off by default, (2) even when turned on it is only triggered by use of explicit forall
, and (3) these days I get the sense that it is generally looked down upon in favor of more newfangled approaches with explicit type application and so on. However, I don't know of any good reasons for (1) and (2), and (3) doesn't really apply in our case since we don't compile down to some kind of polymorphic lambda calculus with explicit type passing like Haskell does. In addition, in our case type signatures always have to go right next to definitions; in Haskell they can be far apart which complicates the issue.
See also #2102 which wouldn't solve the issue but is relevant here.
The problem I run into is the following limitation of records:
In the record projection
rn.x
, can't infer whether the LHS has a record type. Try adding a type annotation.
So I have to provide a type annotation, but I also can't write it:
tydef RBTree k v = rec b. Unit + [c: Bool, k: k, v: v, l: b, r: b] end
tydef RBNode k v = rec n. [c: Bool, k: k, v: v, l: Unit + n, r: Unit + n] end
def balance : RBNode k v -> RBTree k v = \t.
let baseCase /* : RBTree k v */ = inr t in
let balanceRR /* : RBNode k v -> RBTree k v */ = \rn.
if true {baseCase} {undefined rn.x} // can't infer r.x
in undefined
end
@byorgey the only workaround I found was passing the t
original parameter around so that the types would match.
This effectively prevents me from using any local variables that are records.
@byorgey btw. ScopedTypeVariables
is on by default in GHC2024 and GHC2021. 🙂
I think it is a very beginner-friendly extension that "does what I meant". 😄
The problem I run into is the following limitation of records:
That is a very good motivating use case!
@byorgey btw.
ScopedTypeVariables
is on by default in GHC2024 and GHC2021. 🙂
I didn't know that! That makes me feel better, that there aren't lurking downsides we're not thinking of.
So here's what I'm thinking in terms of a detailed design. Comments welcome!
forall
or not, brings all its variables into scope in the body of its definition.forall
are introduced as new variables, and in particular they will shadow any type variables already in scope with the same names.forall
which are not already in scope will be implicitly quantified. Any such type variable names which are in scope will simply refer to the name in the enclosing scope.forall
we require an "all or nothing" approach --- if you use an explicit forall
, then you must list all the type variables in the type, or else you get an error. But given the above about shadowing etc., I think it makes more sense to get rid of this restriction. Variables listed in an explicit forall
are bound fresh; variables not listed will either be implicitly quantified (if they are not already in scope) or will just refer to the enclosing scope (if they are).
So, for example, this will work:
def balance : RBNode k v -> RBTree k v = \t.
let baseCase : RBTree k v = inr t in
let balanceRR : RBNode k v -> RBTree k v = ...
in undefined
end
The type signature on balance
will bring k
and v
into scope in its body (even though there is not an explicit forall
written); then the k
and v
in the type signatures on baseCase
and balanceRR
will both refer to the k
and v
bound in the type of balance
. In particular, this means that inr t
is a valid definition for baseCase
since their types match. On the other hand, if one were to write
def balance : RBNode k v -> RBTree k v = \t.
let baseCase : forall k v. RBTree k v = inr t in
undefined
end
then the k
and v
in the type of baseCase
would be new type variables which are different than the k
and v
in the type of balance
(which would now be shadowed within the definition of baseCase
). This would now be a type error (which is what happens currently), since baseCase
is required to be an RBTree k v
for any types k
and v
, but inr t
only has the specific type that was given as an input to balance
.
Finally, if one wrote
def balance : RBNode k v -> RBTree k v = \t.
let baseCase : forall k. RBTree k v = inr t in
undefined
end
then only k
would be a new type variable, whereas v
would refer to the v
from the type of balance
. (This would still be a type error, of course.)
A plan for actually implementing this:
Ctx Var
to the typechecker which maps type variable names to the skolem variable names chosen for them. This is used to track not only which variables are in scope, but also which skolem variable name corresponds to each.@byorgey Thanks for the detailed breakdown. 👍
Adding type annotations for local functions is so rare that I would not even mind having to always write explicit forall. 🙂
I would not even mind having to always write explicit forall.
Fair enough. Requiring explicit forall wouldn't really make the implementation any easier, though (actually it might be the opposite), so unless you know of good reasons that we should require explicit forall
like Haskell does, let's not worry about it.
I have an implementation which is almost working, just a bug or two to track down yet...
Is your feature request related to a problem? Please describe.
When writing a complex function, it is helpful to use local type annotations. But this breaks when I use local variables:
This is the same thing as Haskell without
ScopedTypeVariables
:Describe the solution you'd like
I would like to have scoped type variables by default.
Describe alternatives you've considered
It's possible that this would cause problems for later definitions if they were technically inside the scope. An alternative would be using a type annotation inside the body, like the Haskell example:
Additional context
I discovered this issue while writing #2161. The only workaround is to not specify the local definition signature or pass parameters with the local types.