gelisam / klister

an implementation of stuck macros
BSD 3-Clause "New" or "Revised" License
130 stars 11 forks source link

The kind-checker allows types to be applied to too many arguments #194

Open gelisam opened 1 year ago

gelisam commented 1 year ago

addDatatypePrimitive currently fails with a syntax error instead of a kind error when a datatype is used with more type arguments than needed, e.g. (the (Maybe Unit Unit) (just unit)). We would like to remove the check which causes a syntax error and rely on the kind checker to catch this issue, but unfortunately the kind checker does not catch this issue. It should.

Originally discussed here.

david-christiansen commented 1 year ago

Excellent!

It looks to me like the check is happening as it should for the ETypeVar case in expandOneForm, with the calls to equateKinds going on, and that it just needs to be added to the EPrimTypeMacro case. The way I'd probably do it is to add a kind to the EPrimTypeMacro constructor, which would allow a single centralized kind check rather than sticking it in each implT implementation, which would be annoying.

david-christiansen commented 1 year ago

The downside of doing it this way is that each type primitive loses the ability to have completely custom syntax. Perhaps that's a price we're willing to pay :-) The implT would then take a list of type destinations as arguments instead of a syntax object, because expandOneForm would have already set up the arguments as part of the kind checking process. This seems like a good compromise to simplify the code and prevent errors to me.

david-christiansen commented 1 year ago

(though it will no longer work if we ever get primitive type syntax that introduces bindings, like an explicit forall or pi or whatever)

gelisam commented 1 year ago

I would definitely like explicit foralls! Currently we cannot even express the Monad dictionary, we have the equivalent of

class Monad m a b where
  (>>=) :: m a -> (a -> m b) -> m b

instead of

class Monad m where
  (>>=) :: forall a b. m a -> (a -> m b) -> m b
david-christiansen commented 1 year ago

I would love to have that, but it would mean going beyond Hindley-Milner. Which isn't a bad thing - it would increase the compellingness of the Klister argument - but I don't know how to do the higher-rank stuff in as canonical of a way here. I'd probably do something like the Dunfield-Krishnaswami approach, but that would be a fairly major re-architecting of the system.

gelisam commented 1 year ago

How about System F style polymorphism, with explicit big Lambda and type applications to clarify where the forall is allowed to be instantiated? This should make things much easier to implement. And with enough macro magic, most of the convenience of Hindley-Milner can be recovered!

More precisely, I propose that:

  1. New expression primitives, type-lambda and type-apply, are added to the kernel, with the usual System F semantics.
  2. In the kernel, top-level binders like define should fail with an "ambiguous type" error if the inferred type contains a unification variable.
  3. In the kernel, there is no let-generalization, so

    (let [id (lambda (x) x)]
      (let [_ (id 42)]
        id))

    is inferred to have type Integer -> Integer, not ∀(α : *). (α → α).

  4. A new expression primitive is added, generalize-type, which does the equivalent of let generalization. An explicit type-apply is needed in order to instantiate the unification variables which have been generalized to a forall, which is unusual because they have not been introduced by a type-lambda.

    Let's go through an example.

    (let [id1 (lambda (x) x)]
      (let [id2 (generalize-type
                  (lambda (x y)
                    (pair (id1 x) y))]
        (pair (id1 42)
          (pair ((type-apply id2 (Integer)) 43 44)
            ((type-apply id2 (String)) 45 "foo")))))

    The type-checker would generate the following constraints:

    id1 : ?1
    ?1 ~ ?2 -> ?2                 -- (lambda (x) x)
    id2 : ?3
    -- start generalization with (lambda (x y) (pair (id1 x) y)) : ?4
    ?4 ~ ?5 -> ?6 -> ?7           -- (lambda (x y) ...)
    ?7 ~ Pair ?8 ?9               -- (pair ... ...)
    ?8 ~ ?2                       -- (pair (id1 ...) ...)
    ?5 ~ ?2                       -- (id1 x)
    ?9 ~ ?6                       -- (pair ... y)
    -- end generalization with (lambda (x y) (pair (id1 x) y)) : ?2 -> ?6 -> Pair ?2 ?6
    -- ?2 was created outside the generalization block and is thus rigid. ?6 can be generalized.
    ?3 : ∀α. ?2 -> α -> Pair ?2 α
    ?2 ~ Integer                  -- (id1 42)
    (type-apply id2 (Integer)) : ?2 -> Integer -> Pair ?2 Integer
    ?2 ~ Integer                  -- ((type-apply ...) 43 ...)
    Integer ~ Integer             -- ((type-apply ...) ... 44)
    (type-apply id2 (String)) : ?2 -> String -> Pair ?2 String
    ?2 ~ Integer                  -- ((type-apply ...) 45 ...)
    String ~ String               -- ((type-apply ...) ... "foo")
  5. In the prelude, top-level binders like define mimic the Hindley-Milner behaviour; at the definition site, they call generalize-type, and at the call site, the forall is automatically specialized by calling type-apply with a fresh type variable. When the identifier is passed as an argument to a higher-order function, type-apply is not automatically inserted, and the user must decide whether they want to give that higher-order function a value whose type begins with a forall, of if they want to first specialize that value using type-apply. This differs from Hindley-Milner, which would also specialize the forall when the identifier is given as an argument to a higher-order function, because with Hindley-Milner that function is guaranteed not to expect a forall type.

    Let's look at an example:

    (prelude.define id (lambda (x) x))
    
    (example (id 42))
    (example (id id))
    (example (id (type-apply id (String))))

    expands to:

    (kernel.define raw-id
      (generalize-type
        (lambda (x) x)))
    (define-macros
      ([id
        (lambda (stx)
          (case (open-syntax stx)
            [(list-contents (cons _ args))
             (pure `(with-unknown-type [A]
                      ((type-apply raw-id A) ,@args)) -}]
            [(identifier-contents ...)
             (pure 'kernel.id)]))]))
    
    (example
      (with-unknown-type [A]  -- A ~ Integer
        ((type-apply raw- A) 42)))
    (example
      (with-unknown-type [A]  -- A ~ ∀(α : *). (α → α)
        ((type-apply raw-id A) raw-id)))
    (example
      (with-unknown-type [A]  -- A ~ (String → String)
        ((type-apply id A) (type-apply raw-id (String)))))
  6. In the prelude, let-generalization is supported, using the same technique.