Gabriella439 / Haskell-Morte-Library

A bare-bones calculus-of-constructions
BSD 3-Clause "New" or "Revised" License
373 stars 25 forks source link

How exactly can λ-encoded datatypes be compiled to primitive operations? #56

Open VictorTaelin opened 7 years ago

VictorTaelin commented 7 years ago

Suppose you have the following compile target, optimized for performance:

  data Term
    = Var Int
    | Lam Term
    | App Term Term
    | Nat Integer
    | Add Term Term
    | Mul Term Term

  nf :: Term -> Term

I'm wondering how exactly, if possible, one could compile a Morte term to that language, reduce it, then read back to get a Morte term on the normal form. I think what is needed is something like that:

  reduce :: Expr -> Expr -> Expr
  reduce val typ = fromTerm typ . nf . toTerm $ val

So, toTerm :: Expr -> Term needs to erase the types, convert λ-encoded nats and nat operations to the respective primitives. fromTerm :: Expr -> Term -> Expr must re-add the types and convert native operations back to λ-encodings.

  1. Is that correct?
  2. Should primitive<->lambda conversions be based on syntactical equality? E.g., a term typed ∀ a . (a -> a) -> a -> a becomes a NAT, the function add : Nat -> Nat -> Nat becomes Add?
  3. Once it is done, how do I re-annotate the types?
Gabriella439 commented 7 years ago

First, morte has no way to efficiently store an Integer literal in the syntax tree, which is the fundamental issue here. You need some way to encode the Integer literals in lambda calculus like a binary numeral encoding.

For the simplicity of this post, I'll use a peano-numeral encoding for just natural numbers.

So, assuming that you have natural numbers encoded as:

$ cat Nat
    ∀(Nat : *)
→   ∀(Succ : Nat → Nat)
→   ∀(Zero : Nat)
→   Nat
$ cat Succ
    λ(n : ./Nat )
→   λ(Nat : *)
→   λ(Succ : Nat → Nat)
→   λ(Zero : Nat)
→   Succ (n Nat Succ Zero)
$ cat Zero
    λ(Nat : *)
→   λ(Succ : Nat → Nat)
→   λ(Zero : Nat)
→   Zero

... then the encoding of the Term syntax tree would be:

$ cat Term
    ∀(Term : *)
→   ∀(Lit : ./Nat → Term)
→   ∀(Add : Term → Term → Term)
→   ∀(Mul : Term → Term → Term)
→   Term
$ cat Lit
    λ(n : ./Nat )
→   λ(Term : *)
→   λ(Lit : ./Nat → Term)
→   λ(Add : Term → Term → Term)
→   λ(Mul : Term → Term → Term)
→   Lit n
$ cat Add
    λ(x : ./Term )
→   λ(y : ./Term )
→   λ(Term : *)
→   λ(Lit : ./Nat → Term)
→   λ(Add : Term → Term → Term)
→   λ(Mul : Term → Term → Term)
→   Add (x Term Lit Add Mul) (y Term Lit Add Mul)
$ cat Mul
    λ(x : ./Term )
→   λ(y : ./Term )
→   λ(Term : *)
→   λ(Lit : ./Nat → Term)
→   λ(Add : Term → Term → Term)
→   λ(Mul : Term → Term → Term)
→   Mul (x Term Lit Add Mul) (y Term Lit Add Mul)

... and something like 0 + (1 * 2) would be represented as:

$ morte > ./example
./Add
    (./Lit ./Zero )
    (./Mul    
        (./Lit (./Succ ./Zero ))
        (./Lit (./Succ (./Succ ./Zero ))) )
<Ctrl-D>
∀(Term : *) → ∀(Lit : (∀(Nat : *) → ∀(Succ : Nat → Nat) → ∀(Zero : Nat) → Nat) → Term) → ∀(Add : Term → Term → Term) → ∀(Mul : Term → Term → Term) → Term

$ cat ./example
λ(Term : *) → λ(Lit : (∀(Nat : *) → ∀(Succ : Nat → Nat) → ∀(Zero : Nat) → Nat) → Term) → λ(Add : Term → Term → Term) → λ(Mul : Term → Term → Term) → Add (Lit (λ(Nat : *) → λ(Succ : Nat → Nat) → λ(Zero : Nat) → Zero)) (Mul (Lit (λ(Nat : *) → λ(Succ : Nat → Nat) → Succ)) (Lit (λ(Nat : *) → λ(Succ : Nat → Nat) → λ(Zero : Nat) → Succ (Succ Zero))))

Then you don't need a toTerm function since the resulting expression is already a valid morte Expr. A Term is just a morte Expr of type ./Term.

You can write your Term directly using the ./Add, ./Mul and ./Lit constructors. The normalized expression will be an unevaluated syntax tree that you can pass to a high-efficiency evaluator with built-in support for machine arithmetic (assuming that you replace the peano numeral encoding of numbers with a more efficient encoding like binary).

Once your high-efficiency evaluator is done you read out the result of the evaluator as a morte Expr of type ./Term using the same constructors and then you can begin converting that back to lambda calculus.

fromTerm is implemented within morte itself. First we just need to implement the lamba calculus equivalents of addition and multiplication:

$ cat plus
    λ(x : ./Nat )
→   λ(y : ./Nat )
→   λ(Nat : *)
→   λ(Succ : Nat → Nat)
→   λ(Zero : Nat)
→   x Nat Succ (y Nat Succ Zero)
$ cat times
    λ(x : ./Nat )
→   λ(y : ./Nat )
→   λ(Nat : *)
→   λ(Succ : Nat → Nat)
→   x Nat (y Nat Succ)

... and then fromTerm is an ordinary morte function:

$ cat fromTerm
    λ(t : ./Term )
→   t ./Nat (λ(x : ./Nat ) → x) ./plus ./times

Here's an example of fromTerm in action:

$ morte
./fromTerm ./example
<Ctrl-D>
∀(Nat : *) → ∀(Succ : Nat → Nat) → ∀(Zero : Nat) → Nat

λ(Nat : *) → λ(Succ : Nat → Nat) → λ(Zero : Nat) → Succ (Succ Zero)