Maru is a programming language. It's a self-hosting, yet tiny lisp dialect: a symbolic expression evaluator that can compile its own implementation to machine code, in about 2000 LoC altogether.
Maru is in particular trying to be malleable at the very lowest levels, so any special interest that cannot be accommodated easily within the common platform would be a strong indicator of a deficiency within the platform that should be addressed rather than disinherited. (Ian Piumarta)
This repo is also a place for exploration in the land of bootstrapping and computing system development. My primary drive with Maru is to clearly and formally express that which is mostly treated as black magic: the bootstrapping of a language on top of other languages (which includes the previous developmental stage of the same language).
This document aims to present an overview of Maru. There are various documents in the doc/ directory that discuss some topics in more detail.
Maru's architecture is described in doc/how.md.
To test a bootstrap cycle using one or all of the backends:
make test-bootstrap-x86 # defaults to the libc platform
make PLATFORM=[libc,linux] test-bootstrap[-llvm,-x86]
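As a rough illustration of what a bootstrap cycle test generally checks, here is a toy model in Python. It is a generic sketch, not a description of what the Makefile targets above actually do: the `Binary` class, `compile_with`, and the `SOURCE` string are all hypothetical. The idea is that a binary built by the seed may differ from one built by the self-hosted compiler, but once the compiler builds itself, building it again should change nothing.

```python
# Toy model of a bootstrap fixpoint check. All names are hypothetical and
# only illustrate the general idea; this is not Maru's build logic.
from dataclasses import dataclass

@dataclass(frozen=True)
class Binary:
    codegen: str   # which code generator this binary implements
    image: str     # the emitted "machine code", modelled as plain text

def compile_with(compiler: Binary, source: str) -> Binary:
    """Compile `source` using `compiler`.

    The resulting binary implements the code generator that the source
    describes, but its image is emitted by the code generator of the
    compiler that is doing the work right now.
    """
    described = source.split(":", 1)[0]
    return Binary(codegen=described, image=f"[{compiler.codegen}] {source}")

SOURCE = "self-hosted: (define compile ...)"
seed = Binary(codegen="seed", image="<built from throwaway C code>")

stage1 = compile_with(seed, SOURCE)     # built by the seed
stage2 = compile_with(stage1, SOURCE)   # built by the seed-built compiler
stage3 = compile_with(stage2, SOURCE)   # built by the self-built compiler

print("stage1 == stage2:", stage1 == stage2)
print("stage2 == stage3:", stage2 == stage3)
```

In this model the first comparison is false (the seed emits a different image) and the second is true, which is why such checks usually compare the last two stages rather than the first two.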
Originally written by Ian Piumarta, around 2011. The full commit history is available in the piumarta branch.
The current gardener is attila@lendvai.name.
Bugs and patches: maru github page.
Discussion: maru-dev google group.
Programming badly needs better foundations, and Maru is part of this exploration. The foundations should get smaller, simpler, more self-contained, and more approachable by people who set out to learn programming.
I'm fascinated by bootstrapping issues. We lose a lot of value by not capturing the history of the growth of a language, including the formal encoding of its build instructions. They are useful both for educational purposes, and also for practical reasons: to have a minimal seed that is very simple to port to a new architecture, and then have a self-contained, formal bootstrap process that can automatically "grow" an entire computing system on top of that freshly laid, tiny foundation.
Ian seems to have abandoned Maru, and his published archive couldn't be run as-is. But it's an interesting piece of code that deserves a repo and a maintainer to keep bitrot at bay.
This work is full of puzzles that are a whole lot of fun to solve!
You are very welcome to contribute, but beware that until further notice this repo will receive forced pushes (i.e. git push -f, rewriting git history), except on the piumarta branch. This will stop eventually, once I settle on a build setup that nicely facilitates bootstrapping multiple, parallel paths of language development. Please make sure that you open a branch for your work, and/or that you are ready for some git fetch and git rebase.
Backporting and bootstrapping the latest semantics from the piumarta branch is done: the eval.l in the latest branch of this repo should be semantically equivalent to the eval.l that resides in the piumarta branch, although we have arrived at this state by two different paths:
Ian, while evolving Maru, kept his eval.c and eval.l semantically in sync.
In contrast, I have bootstrapped the new features: I started out from a minimal version of the eval.l + eval.c pair (the original version published on Ian's website). Then I bootstrapped the features of the later stages of eval.l using an earlier stage of itself. I only use the 2300 LoC of throwaway C code as the initial stepping stone in the bootstrap process; once the first step is made, the C code is left behind.
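The staged approach described above can be sketched as a ladder in which every stage is built by the previous one. The following Python model is only an illustration under stated assumptions: the stage names and the feature names (`let`, `macros`, and so on) are made up and do not correspond to actual Maru branches or language features. The constraint it demonstrates is that the source of a stage may only use what its builder already provides.

```python
# Hypothetical stage ladder: stage names and feature names are invented
# for illustration and do not correspond to actual Maru branches.
STAGES = [
    {"name": "seed (C)", "uses": set(),
     "provides": {"define", "lambda"}},
    {"name": "stage-1", "uses": {"define", "lambda"},
     "provides": {"define", "lambda", "let"}},
    {"name": "stage-2", "uses": {"lambda", "let"},
     "provides": {"define", "lambda", "let", "macros"}},
]

def bootstrap(stages):
    """Build each stage with the previous one, checking that a stage's
    source only uses features that its builder already provides."""
    builder = stages[0]
    history = [builder["name"]]
    for stage in stages[1:]:
        missing = stage["uses"] - builder["provides"]
        if missing:
            raise RuntimeError(f"{stage['name']} needs {sorted(missing)}, "
                               f"which {builder['name']} lacks")
        history.append(stage["name"])
        builder = stage   # the freshly built stage builds the next one
    return history

history = bootstrap(STAGES)
print(" -> ".join(history))   # seed (C) -> stage-1 -> stage-2
```

Skipping a rung fails in this model: building `stage-2` directly from the seed raises an error because the seed does not provide `let`. This is the sense in which the C code is needed only for the first step.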
There is one major bug left that I failed to fix while I was actively hacking on Maru. It's discussed in https://github.com/attila-lendvai/maru/issues/8.
There are several Maru stages/branches now, introducing non-trivial new features. Some that are worth mentioning:
Introduction of platforms, notably the linux platform, which compiles to a statically linked executable that only uses Linux kernel syscalls. From a practical perspective this is almost equivalent to running directly on the bare metal (i.e. all dynamically allocated memory needs to be managed by our own GC, all IO goes through our own abstractions, etc.).
The host and the slave are isolated while bootstrapping which makes it possible to do things like reordering types (changing their type id in the target), or changing their object layout.
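To make the benefit of this isolation more concrete, here is a small Python sketch of the general technique. It assumes nothing about Maru's actual implementation: the type tables, `host_type_name`, and `encode` are hypothetical. The point is that if the compiler classifies values by type name and looks up the numeric tag and field order in a table that belongs to the target, then the target is free to renumber its types or rearrange its objects without affecting the host.

```python
# Hypothetical type tables; the names and numbering are illustrative only.
HOST_TYPE_IDS = {"undefined": 0, "long": 1, "string": 2, "pair": 3}
# The target reorders the same types, e.g. pair gets id 1.
TARGET_TYPE_IDS = {"undefined": 0, "pair": 1, "long": 2, "string": 3}

def host_type_name(value):
    """Classify a host value by type *name*, never by host type id."""
    if value is None:
        return "undefined"
    if isinstance(value, int) and not isinstance(value, bool):
        return "long"
    if isinstance(value, str):
        return "string"
    if isinstance(value, tuple) and len(value) == 2:
        return "pair"
    raise TypeError(f"unsupported value: {value!r}")

def encode(value, type_ids, pair_layout=("head", "tail")):
    """Encode a host value as [tag, ...fields] using the given type ids
    and, for pairs, the given field order."""
    name = host_type_name(value)
    tag = type_ids[name]
    if name == "pair":
        fields = {"head": value[0], "tail": value[1]}
        return [tag] + [encode(fields[f], type_ids, pair_layout)
                        for f in pair_layout]
    return [tag, value]

pair = (1, None)
as_host = encode(pair, HOST_TYPE_IDS)
as_target = encode(pair, TARGET_TYPE_IDS, pair_layout=("tail", "head"))
print(as_host)     # [3, [1, 1], [0, None]]
print(as_target)   # [1, [0, None], [2, 1]]
```

The same host value yields two different encodings, and neither table needs to know about the other.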
Relying on this isolation, the code in eval.l now looks pretty much the same as something that is meant to be loaded into the evaluator (i.e. the function implementing car in eval.l is now called car). This paves the way for metacircularity: to be able to "bring alive" the evaluator by loading it verbatim into another instance of itself (as opposed to compiling it to machine code and giving it to a CPU to bring it alive).
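The naming point can be illustrated with a toy evaluator in Python. This is a minimal sketch and not Maru's evaluator: the `Evaluator` class, the list-based expression format, and the `oop-at` primitive name are assumptions made for the example. Each instance owns its global environment, so a definition named `car` can be loaded into one instance without disturbing the `car` of another.

```python
# Toy s-expression evaluator; expressions are Python lists, symbols are
# Python strings. All names here are hypothetical, for illustration only.
def builtin_car(pair):
    return pair[0]

def builtin_cdr(pair):
    return pair[1]

class Evaluator:
    def __init__(self):
        # Each instance owns its global environment, isolated from others.
        self.env = {
            "car": builtin_car,
            "cdr": builtin_cdr,
            "cons": lambda head, tail: (head, tail),
            "oop-at": lambda obj, index: obj[index],
        }

    def eval(self, expr, env=None):
        env = self.env if env is None else env
        if isinstance(expr, str):        # symbol lookup
            return env[expr]
        if not isinstance(expr, list):   # self-evaluating literal
            return expr
        op = expr[0]
        if op == "define":
            _, name, value = expr
            self.env[name] = self.eval(value, env)
            return self.env[name]
        if op == "lambda":
            _, params, body = expr
            return lambda *args: self.eval(
                body, {**env, **dict(zip(params, args))})
        fn = self.eval(op, env)
        return fn(*[self.eval(arg, env) for arg in expr[1:]])

# Definitions written the way they would be loaded into an evaluator:
# the function implementing car is simply called car.
SOURCE = [
    ["define", "car", ["lambda", ["p"], ["oop-at", "p", 0]]],
    ["define", "cdr", ["lambda", ["p"], ["oop-at", "p", 1]]],
]

host, target = Evaluator(), Evaluator()
for form in SOURCE:
    target.eval(form)

result = target.eval(["car", ["cons", 1, 2]])
print(result)
```

After loading, `target` runs its own interpreted `car`, while `host` still uses its built-in one.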
The addition of an LLVM backend.
Maru was developed as part of Alan Kay's Fundamentals of New Computing project, by the Viewpoints Research Institute (VPRI). The goal of the project was to implement an entirely new, self-hosting computing system, with GUI, in 20,000 lines of code.
At some point VPRI went quiet, and it closed down in 2018. Much of their online content disappeared, and the team (probably) also dissolved.
Their annual reports: 2007, 2008, 2009, 2010, 2011, 2012.
The piumarta branch of this git repo is a conversion of Ian Piumarta's Mercurial repo that was once available at http://piumarta.com/hg/maru/. To the best of my knowledge this is the latest publicly available state of Ian's work. This repo was full of assorted code, probably driving the VPRI demos.
The piumarta branch will be left stale (modulo small fixes and cleanups). My plan is to eventually revive most of the goodies from this branch, but in a more organized and approachable manner, and also paying attention to the bootstrapping issues.
Ian published another Mercurial repo somewhere halfway in the commit history with only a couple of commits from around 2011. I assume that it was meant to hold the minimal/historical version of Maru that can already self-host. I started out my work from this minimal repo (hence the divergence between the piumarta and the maru.x branches in this repo).
There are some other copies/versions of Maru. Here are the ones that I know about and that contain interesting code:
below-the-top is some kind of generic sexp tokenizer and evaluator written in Common Lisp that can be configured so that it can bootstrap Maru. I haven't tried it myself.
A list of projects that are relevant in this context:
sectorlisp (github): LISP with GC in 436 bytes. It doesn't have a compiler, i.e. it cannot self-host. It only has a C implementation, an x86 assembly implementation (in the form of a boot sector), and John McCarthy's Lisp in Lisp evaluator. It would be an interesting project to add a compiler to it and see how the end result compares to Maru. Or to start growing a language as demonstrated in this repo, but starting out from sectorlisp. Note that sectorlisp is not equivalent to the first stage of Maru (the maru.1 git branches), because that can already self-host, i.e. it can bootstrap itself off of the C implementation.
Seedling: a ladder of languages, with a minimalistic core language at the bottom called Seed (it's Forth-like). Seed can self-host in less than 1k LoC. The higher level languages above Seed are (going to be) extensions of it, and are implemented on top of Seed. Porting to a new architecture will be trivial. And an interesting tidbit: the initial bootstrap was done not by using another programming language/compiler, but by pen and paper!
bootstrappable.org: a community around bootstrapping, and making/keeping projects bootstrappable. It brings together many interesting projects: stage0 (~500 byte self-hosting hex assembler), live-bootstrap, GNU Mes (Scheme + C, mutually self-hosting each other), m2-planet (a tiny C compiler).
Kalyn: a subset of Haskell semantics (mostly; not lazy), but with Lisp syntax. Entirely (!) self-hosting over x86-64 in 4-5 kLoC. The project feels of a high standard, including its documentation.
nanohs: a tiny self-hosting subset of Haskell.
PEG-based tree rewriter: runnable code to accompany Ian Piumarta's paper called PEG-based tree rewriter provides front-, middle- and back-end stages in a simple compiler. Ian wrote this before Maru, and there are several similarities between the two. See the mailing list thread.
blynn's Haskell compiler: bootstrap a Haskell compiler incrementally from C, with extensive documentation.
RefPerSys: a mostly symbolic artificial intelligence long-term project, with ambitious Artificial General Intelligence goals. It contains interesting and relevant ideas, e.g. in refpersys-design.pdf.
Project Oberon: a project which encompasses CPU, language, operating system and user interface, which can be run on a relatively inexpensive FPGA board, and which is simple enough for one person to understand it all.
tort: Inspired by Ian Piumarta's idst, maru and other small runtimes. Core is approx. 5000 lines of C.
kernel: "Kernel is a conservative, Scheme-like dialect of Lisp in which everything is a first-class object." (including special forms) You may want to also see this blog.
Compiling a Lisp: Overture: Educational article series about constructing a simple Lisp compiler, implemented in C.