shinh / elvm

EsoLangVM Compiler Infrastructure
MIT License
1.13k stars 143 forks source link

Grass backend #123

Closed woodrush closed 1 year ago

woodrush commented 1 year ago

As suggested by @shinh in https://github.com/shinh/elvm/pull/118#issuecomment-1273060925, a Grass backend would be exciting.

I'm currently developing a Grass backend at my repo GrassVM, based on LambdaVM. The VM itself is already working, and I've managed to make rot13.w.

The remaining task is to generate the assembly listing and the memory initialization list in ELVM using C. The current GrassVM code uses plant included in @susisu's Grassy toolkit to generate rot13.w, which transpiles an OCaml-like language to Grass.

I currently believe that generating the assembly list faces a tradeoff of either using a lot of the stack or requiring a tremendous amount of code size having lots of W and ws. To optimize the code size, we would use lots of vs to refresh the De Bruijn index once in a while, but that would use a lot of the main environment stack. To optimize the stack size, we would squeeze a lot of code in one v definition clause, but that would increase the De Bruijn index in the sub-environment, probably making the code size increase quadratically with the instruction length. I'll first try if the code size optimized version works. I'll be happy to discuss ideas and collaborate on building this backend.

woodrush commented 1 year ago

The basic design of GrassVM is based on LambdaVM which was originally written for a language with a lazy evaluation strategy, such as Lazy K. The assembly listing is expressed as a list of lambdas that encodes the instructions. This design was used since languages such as Lazy K did not have I/O functions with side effects, and continuation-passing style was required to handle I/O. This requirement is largely alleviated in Grass since Grass is an eagerly evaluated language having I/O functions with side effects.

On the other hand, I'm aware that the Unlambda backend by @irori generates functions that are directly applied to the VM's global state, as described here. Since Unlambda also uses eager evaluation which is the same as Grass, I suspect that adopting this design could lead to further optimizations, although the code generation should be significantly more difficult, and the VM probably must be rewritten from scratch again.

I'm also not sure if the code size / stack size tradeoff is something that can be avoided or is inherent in the language specifications. I'll therefore focus first on making something that works, at least for small programs (it may work on large programs as well but we'll need to see).

woodrush commented 1 year ago

lisp.c is working on my Grass backend branch. Although large programs run slowly, they do work. I'll be doing some optimizations to make it run faster.