pvdrz / pijama_legacy

A functional programming language
MIT License
47 stars 4 forks source link

Proposal: Introduce Llama (LIR replacement) #77

Open pvdrz opened 4 years ago

pvdrz commented 4 years ago

Everything stated here is open to discussion and revision for anyone who wants to give their opinion, except for the name, that's non-negotiable.

Introduction

Following the spirit of https://github.com/christianpoveda/pijama/issues/68. I've been reading papers (and wikipedia articles) about implementing a better evaluation model for Pijama. I think an interesting direction is implementing an abstract machine (like the SECD machine or the STG machine).

However, our current low-level intermediate representation is ill suited for that model of evaluation. I propose we move to an IR that is easier to debug and execute.

The main differences with our current LIR would be:

Named representation

The LIR uses de Bruijn indices to tag variables. This has nice theoretical properties but is hard to read and debug.

Llama would keep the names provided from the AST. This simplifies evaluation a lot because it is no longer necessary to shift indices when evaluating a function.

This means we would have to have a notion of scoping during evaluation, but we already have that in the AST analysis module and in the type checking module.

N-ary abstractions

The MIR and LIR represent functions as abstractions with a single parameter. That's fine for the MIR because type checking is easier to do in that model (in particular curried functions are really easy to type).

However, in the LIR this produces performance issues because replacing the parameters of a function by its actual arguments requires a full evaluation step for each parameter.

With n-ary abstractions we could do that in a single step which would make the language considerably fast.

Let bindings

In the LIR we represented let bindings as derived forms using abstractions.

Llama would have let bindings and each program would be represented as a sequence of let bindings where there is an special binding for the "main" of the program.

Shallower conditionals

After #74 we introduced elif conditionals but we represent those in the LIR as nested if x then y else z conditionals.

Llama would represent such conditionals as in the AST. As a vector of branches. Meaning that evaluating a chain of elif conditionals could be done faster.

Native recursion

The LIR handles recursion using a fixed-point operator, which clones the whole body of the function on each iteration.

In Llama, recursion would be tagged in the let binding of a function and would be executed in a more sequential, imperative way. Avoiding the extra cloning we have now.

Call-by-need

We could finally start thinking in doing call-by-need evaluation. Each term of Llama would have a tag to state if its in its normal form (completely evaluated) or not. With this tag, we would be able to evaluate terms when they are needed but avoid evaluating the same term twice.

Separation between program state and code

The LIR represents both the state and code of the program. Llama would be used just to represent the code we need to execute. The state of the program would be stored inside Machine and each Llama term would trigger a particular instruction of the machine.

Cooler name

LIR sounds fancy and technical but let's be honest, llamas are cooler... image and also llama rhymes with pijama.

Formal structure

TODO

Evaluation

TODO

DarkDrek commented 4 years ago

I support this name/idea evidence