Last week I had a thought: “What’s the simplest Lisp interpreter I could write, which supports macros?"
A weekend of Clojure hacking and some confusion later, a REPL was born. In this essay, we’ll go through the same journey and make our own Lisp interpreter, which supports macros…in Clojure! Let’s call it, LLisp: a lisp in a lisp.
By the end, you’ll write your own macros in your own darn programming language! Let’s get into it.
Sketch
Okay, let’s sketch out the basic idea behind what we need to do:
A programmer writes some code. So far, it’s just text, and there’s not much we can do with that. We read the programmer’s text, and convert it into data-structures. We can do something with data-structures: we can evaluate data-structures, and return whatever the programmer intended.
If we can convert "(+ 1 1)" into 2, we have ourselves the roots of a programming language.
Reader
Let’s handle “2. Read”. If a programmer writes text like "(+ 1 1)", we want to convert it to data-structures. Something like:
(read-string "(+ 1 1)")
; =>
(+ 1 1)
We could write read-string ourselves. For the simplest case it’s pretty easy [^1]. But, we’re in Clojure after all, and Clojure already understands how to read Lisp code. Let’s just cheat and use Clojure’s edn:
Okay, now to “3. Evaluate”. Remember, our goal is to take data-structures, evaluate **them and return whatever the programmer intended. We can begin with a simple function like this:
(defn eval [form]
;; TODO
)
Evaluating Literals
Let’s get the easy things out of the way. Some things just evaluate to themselves: if a programmer wrote 12 for example, all we’d have to do is to return the number 12. This is the same for strings ("foo"), characters (\b), and booleans (true). They’re literals.
Here’s how we can evaluate literals. Let’s detect them first:
If you haven’t seen some-fn before, this is one of clojure’s many handy and powerful utilities. Now we can handle literals in eval:
(defn eval [form]
(cond
(literal? form) form))
And boom, our programming language starts to do something:
(eval (read-string "12"))
; => 12
Introducing Quote
Let’s get to quote. Quote is a crucial form in Lisp. It lets you write code that does not evaluate. For example:
(quote (+ 1 1))
; =>
(+ 1 1)
Because (+ 1 1) was inside quote, we didn’t evaluate it to return 2. Instead we returned the expression itself. This seems weird if you come from languages that don’t support quotes, but it’s used all the time in Lisp. In fact, there’s a shortcut for it:
'(+ 1 1)
When this is read, it converts to
(quote (+ 1 1))
Thankfully, edn’s read-string does this for us. 🙂
Evaluating Quote
So, how do we evaluate quote? Let’s first create a function that can detect them:
Next, we come to symbols. Symbols are a special Lisp data type. They’re like variables, in the sense that when we evaluate a symbol, we **look it up.
Look it up where? Our interpreter will need some kind of environment, that keeps track of all the variables we defined. If we had an environment, where the symbol fav-num pointed to 41 for example, here’s what evaluating fav-num would look like:
(eval env (read-string "fav-num"))
; =>
41
Evaluating Symbols
Let’s first create this environment. We can use java’s Hashmap to keep a mapping of symbols to their values.
(defn eval-def [env [_ k v]]
(assert (symbol? k) (format "expected k = %s to be a symbol" k))
(.put (:globe env) k (eval env v)))
Here, we know the first argument to def is the symbol, k. The second is the valuev we want to save. So, we update our environment’s globe Hashmap, and point the symbol k, to whatever (eval env v) is. In our case, k would be second-fav, v would be (+ 1 fav-num), and (eval env v) would become 42.
Prettyy cool, this works!
(let [e (env)]
(eval e (read-string "(def second-fav (+ 1 fav-num))"))
(eval e (read-string "second-fav")))
; =>
42
Introducing if
Okay, let’s implement one more special form. We need something to handle conditional logic:
(if (= fav-num 41) 'yess 'noo))
When we get if, we’ll can evaluate the test form: (= fav-num 41). If that’s true, we’ll then evaluate the true case 'yess. Otherwise, we’ll evaluate the false case: 'noo. Sounds like a plan.
Let’s take things up a notch. We have Clojure’s + and so on, but what if our programmer wanted to write their own functions? Here’s an example:
((clo (x) (+ x fav-num) 2))
Here, the programmer wrote a function that takes x and adds fav-num to it.
Say they wrote this. How can we run their function? Well, their definition has an arguments and a body. In this case, it’s (x) and (+ x fav-num) respectively. In this example, the function is called with 2.
Here’s the insight: If we could just evaluate (+ x fav-num), in an environment where x points to 2, voila, it’ll be like we ran the function! Here’s a diagram to explain the idea:
How can we point x to 2? We can’t just update our our globe Hashmap, because then x would be permanently set to 2, even when the body was done evaluating.
So we want a new idea. A scope. We can think of scope as an environment, where the changes we make only last while body is being evaluated. If we can introduce a concept like this, we’ll have functions!
Evaluating Functions
Okay, let’s get this to work. First, let’s make sure that if someone writes (clo (x) (+ x 1)), we don’t actually try to eval-application. Instead, we should treat this as a new closure literal.
Now, these functions are fine and all, but we can do more. Many languages support lexical scoping. For example, we can do this in javascript:
const makeAdder = (n) => (a) => n + a
const addTwo = makeAdder(2);
addTwo(4)
// =>
6
Here, makeAdder returned a function. This function remembered the value of nwhere it was defined. How can we make this work in our own language?
Okay, first things first, let’s say that we let the programmer see the current scope. Here’s an example of what I mean:
((clo (x) scope) 2)
; =>
{x 2}
This special scope symbol, when evaluated returned {x 2}, the current scope!
Nice now, what if we updated our function literal? It could look something like this:
(clo {n 2} (a) (+ n a))
Here, we have a special spot for the scope! When we define the function, if we could just “plop” the scope of wherever it was defined in there, badabing badaboom, we’d support lexical scoping!
When we run (make-adder 2) we’d evaluate (list 'clo scope '(y) '(+ y n)). That would return (clo {n 2} (y) (+ y n))! All of a sudden, we support lexical scoping.
You may think it’s weird that we wrote (list 'clo scope '(y) '(+ y n))) — quote verbose. But, hey, we’ll support macros soon, and that help us write this in a spiffy way. Just you wait 🙂.
Implementing Lexical Scope
Okay. that’s a lot of explaining, but the changes will be simple. First we need to make sure that if we get a scope symbol, we return the current scope:
(defn lookup-symbol [{:keys [globe scope]} sym]
(let [v (some (fn [m] (when (contains? m sym) [(get m sym)]))
[{'scope scope} globe scope])]
(assert v (format "expected value for sym = %s" sym))
(first v)))
And now all we need to do, is to update eval-closure:
Now, we have all we need to implement macros! Let’s think about what macros really are.
Imagine writing the function unless:
(def unless (clo nil (test v)
(if (not test) v)))
Will it work? If we ran (unless (= fav-num 41) 'yes), it would return the symbol yes.
But what if we ran this?
(unless (= fav-num 41) (throw-error))
Since unless is a function, we would evaluate each argument first right? In that case, we’d evaluate (= fav-num 41), which would be true. But, we’d also evaluate (throw-error). That would break our program. This defeats the whole purpose of unless , as (throw-error) was supposed to run only when test was false.
If we ran (unless (= fav-num 41) (throw-error)), here’s what would happen:
The value of test would not be true, it would actually be the list (= fav-num 41). Similarly, we wouldn’t evaluate (throw-error) . v would just be the actual list (throw-error).
The arguments to a macro are not evaluated. When the macro runs, it returns code. The result of
(list 'if (list 'not test) v))
would become
(if (not (= fav-num 41)) (throw-error))
And when we evaluated that, things would work as expected! (throw-error) would never evaluate.
This is the key difference between functions and macros. Functions take evaluated values as arguments and return evaluated values. Macros receive unevaluated arguments, return code, and that code is evaluated.
An important piece to note is that we eval twice. We give the macro the unevaluated arguments, which returns new code, and we once again evaluate that.
So, how can we support this?
Evaluating Macros
So much explanation, for such little code. Are you ready? Let’s add macros.
First, let’s use this new structure for a macro: a macro is a list that begins with mac, and has closure inside:
Oh my god. 3 lines!! We do exactly as we said in our diagram:
We take the “closure” out of our macro, and run it with unevaluated args. We can do that just by wrapping each arg in a quote: (map (partial list 'quote) args).
Once we have the resulting code, we evaluate that again, and boom, we have macros.
One more cool macro. Right now, there’s no real way to define temporary variables. Something like:
(let ((x 1) (y 2))
(+ x y))
; =>
3
How could we support it? Well, what if we rewrote(let …) to:
((fn (x y) (+ x y)) 1 2)
We could just use the argument list of a function to create these variables! Perfecto. Here’s the macro:
(defmacro let (pairs body)
(concat (list (list 'fn (map first pairs) body))
(map second pairs)))
Fin
What a journey. 80 lines, 4000 words, and hopefully a fun time 🙂. You now have a Lisp with macros, and you’ve written some cool ones yourself. Here’s the full source if you’d like to follow along from there.
Credits
The idea to represent macros and closures like this, came from PG’s Bel.
Thanks to Mark Shlick, Daniel Woelfel for reviewing drafts of this essay.
Last week I had a thought: “What’s the simplest Lisp interpreter I could write, which supports macros?"
A weekend of Clojure hacking and some confusion later, a REPL was born. In this essay, we’ll go through the same journey and make our own Lisp interpreter, which supports macros…in Clojure! Let’s call it, LLisp: a lisp in a lisp.
By the end, you’ll write your own macros in your own darn programming language! Let’s get into it.
Sketch
Okay, let’s sketch out the basic idea behind what we need to do:
A programmer writes some code. So far, it’s just text, and there’s not much we can do with that. We read the programmer’s text, and convert it into data-structures. We can do something with data-structures: we can evaluate data-structures, and return whatever the programmer intended.
If we can convert
"(+ 1 1)"
into2
, we have ourselves the roots of a programming language.Reader
Let’s handle “2. Read”. If a programmer writes text like
"(+ 1 1)"
, we want to convert it to data-structures. Something like:We could write
read-string
ourselves. For the simplest case it’s pretty easy [^1]. But, we’re in Clojure after all, and Clojure already understands how to read Lisp code. Let’s just cheat and use Clojure’sedn
:And voila,
read-string
does what we want:Evaluation
Okay, now to “3. Evaluate”. Remember, our goal is to take data-structures, evaluate **them and return whatever the programmer intended. We can begin with a simple function like this:
Evaluating Literals
Let’s get the easy things out of the way. Some things just evaluate to themselves: if a programmer wrote
12
for example, all we’d have to do is to return the number12
. This is the same for strings ("foo"
), characters (\b
), and booleans (true
). They’re literals.Here’s how we can evaluate literals. Let’s detect them first:
If you haven’t seen
some-fn
before, this is one of clojure’s many handy and powerful utilities. Now we can handle literals ineval
:And boom, our programming language starts to do something:
Introducing Quote
Let’s get to
quote
. Quote is a crucial form in Lisp. It lets you write code that does not evaluate. For example:Because
(+ 1 1)
was insidequote
, we didn’t evaluate it to return2
. Instead we returned the expression itself. This seems weird if you come from languages that don’t support quotes, but it’s used all the time in Lisp. In fact, there’s a shortcut for it:When this is read, it converts to
Thankfully, edn’s
read-string
does this for us. 🙂Evaluating Quote
So, how do we evaluate quote? Let’s first create a function that can detect them:
We see if our form is a list, and the first element is the symbol
quote
Works like a charm. Now let’s update
eval
:If our form is a
quote
, we return the “second” part of it. And voila,Introducing Symbols
Next, we come to symbols. Symbols are a special Lisp data type. They’re like variables, in the sense that when we evaluate a symbol, we **look it up.
Look it up where? Our interpreter will need some kind of
environment
, that keeps track of all the variables we defined. If we had an environment, where the symbolfav-num
pointed to41
for example, here’s what evaluatingfav-num
would look like:Evaluating Symbols
Let’s first create this
environment
. We can use java’sHashmap
to keep a mapping of symbols to their values.Let’s import java’s Hashmap:
And make a quick function to create an
env
map:Now, we can accept an
env
map ineval
, and start to handle symbols:Clojure already has a handy
symbol?
function, which we can use. When that’s true, we’lllookup-symbol
. Here’s howlookup-symbol
could look:And with that, symbols work!
Evaluation Primitives
Next up, let’s take an important step to making
(+ 1 1)
work. We’ll want to have+
mean something. What should the symbol+
point too?How about Clojure functions?
If we did that, now whenever we evaluate the symbol
+
, it would return a clojure function:And if we “received” a clojure function in eval, we can just treat it as a literal:
This is one small step in our essay, but a giant step for our Lisp. Every element in
(+ 1 1)
can now be evaluated.Primitive Applications
So how do we run these functions? Well, let’s update
eval
, to handle lists like(+ 1 1)
:Easy. If we can’t figure out what the
form
is, it must be list, and it must mean we want to run it! Here’s howeval-application
could work:In
eval-application
, we evaluate the first part of our list. In our example, this would be the symbol+
, which would return a clojure function.If
f-evaled
is a clojure function (which it is), we would run:eval-many
is just a helper to evaluate a list of forms:args
would be(1 1)
, soeval-many
would return(1 1)
, and that means we’d get the equivalent of:That would return
2
, and babam, we have(+ 1 1)
working!Remember, we evaluate recursively, so we can do some intricate stuff already:
Introducing def
Oky doke, we have an environment, we got symbols that can look stuff up in that environment, and we can even evaluate
(+ 1 (+ 1 fav-num))
.But, how are we going to define variables? For example, let’s say we wanted to have the symbol
second-fav
point to42
.Well, we can introduce a special form. Something like this:
If we receive this
def
form, we’d just update our environment, to point the symbolsecond-fav
to whatever the evaluation of(+ 1 fav-num)
would be.Evaluating def
That sounds like a plan. Let’s detect this form first:
And update
eval
to handle it:And here’s all
eval-def
would need to do:Here, we know the first argument to
def
is the symbol,k
. The second is the valuev
we want to save. So, we update our environment’sglobe
Hashmap, and point the symbolk
, to whatever(eval env v)
is. In our case,k
would besecond-fav
,v
would be(+ 1 fav-num)
, and(eval env v)
would become42
.Prettyy cool, this works!
Introducing if
Okay, let’s implement one more special form. We need something to handle conditional logic:
When we get
if
, we’ll can evaluate the test form:(= fav-num 41)
. If that’s true, we’ll then evaluate the true case'yess
. Otherwise, we’ll evaluate the false case:'noo
. Sounds like a plan.Evaluating if
To implement if, as usual let’s detect it first:
Then use it in
eval
:And here’s what
eval-if
could look like:Just a few lines, following our description to a tee. We see if
evaled-test
is true. Then, we either evaluatewhen-true
, orwhen-false
.While we’re at it, let’s add a bunch of nice primitive functions in our
env
:And bam. Our little interpreter can do quite a lot now!
Introducing Functions
Let’s take things up a notch. We have Clojure’s
+
and so on, but what if our programmer wanted to write their own functions? Here’s an example:Here, the programmer wrote a function that takes
x
and addsfav-num
to it.Say they wrote this. How can we run their function? Well, their definition has an
arguments
and abody
. In this case, it’s(x)
and(+ x fav-num)
respectively. In this example, the function is called with2
.Here’s the insight: If we could just evaluate
(+ x fav-num)
, in an environment wherex
points to2
, voila, it’ll be like we ran the function! Here’s a diagram to explain the idea:How can we point
x
to2
? We can’t just update our ourglobe
Hashmap, because thenx
would be permanently set to2
, even when thebody
was done evaluating.So we want a new idea. A
scope
. We can think ofscope
as an environment, where the changes we make only last whilebody
is being evaluated. If we can introduce a concept like this, we’ll have functions!Evaluating Functions
Okay, let’s get this to work. First, let’s make sure that if someone writes
(clo (x) (+ x 1))
, we don’t actually try toeval-application
. Instead, we should treat this as a newclosure
literal.We can detect it:
And update our
literal?
:Now, if a user just writes a function, it’ll return itself:
Next, let’s update our
environment
, to have a newscope
map:And let’s update
lookup-symbol
, to also look up variables in scope:Closer and closer. Now let’s make it run. If we wrote this:
This list would go to
eval-application
. Let’s update it:f-evaled
would be(clo (x) (+ x fav-num))
. This would mean(closure? f-evaled)
would become true.We’d evaluate every argument with
(eval-many env args)
, which would be(1)
, and give this off toeval-closure
. Here’s howeval-closure
could look:We’d do exactly as we did in the diagram. We’d get
syms
, which would be(x)
, andargs
, which would be(1)
. We’d then create a new scope:That’s
{x 2}
. We’d merge it into our environment, and then evaluatebody
.All of a sudden,
(+ x fav-num)
will become43
!Introducing Lexical Scope
Now, these functions are fine and all, but we can do more. Many languages support lexical scoping. For example, we can do this in javascript:
Here,
makeAdder
returned a function. This function remembered the value ofn
where it was defined. How can we make this work in our own language?Okay, first things first, let’s say that we let the programmer see the current scope. Here’s an example of what I mean:
This special
scope
symbol, when evaluated returned{x 2}
, the current scope!Nice now, what if we updated our function literal? It could look something like this:
Here, we have a special spot for the
scope
! When we define the function, if we could just “plop” the scope of wherever it was defined in there, badabing badaboom, we’d support lexical scoping!Here’s how our
make-adder
could work:When we run
(make-adder 2)
we’d evaluate(list 'clo scope '(y) '(+ y n))
. That would return(clo {n 2} (y) (+ y n))
! All of a sudden, we support lexical scoping.You may think it’s weird that we wrote
(list 'clo scope '(y) '(+ y n)))
— quote verbose. But, hey, we’ll support macros soon, and that help us write this in a spiffy way. Just you wait 🙂.Implementing Lexical Scope
Okay. that’s a lot of explaining, but the changes will be simple. First we need to make sure that if we get a
scope
symbol, we return the current scope:And now all we need to do, is to update
eval-closure
:Here we take the
scope
from the closure itself! With that, lexical scoping works!Introducing Macros
Now, we have all we need to implement macros! Let’s think about what macros really are.
Imagine writing the function
unless
:Will it work? If we ran
(unless (= fav-num 41) 'yes)
, it would return the symbolyes
.But what if we ran this?
Since
unless
is a function, we would evaluate each argument first right? In that case, we’d evaluate(= fav-num 41)
, which would betrue
. But, we’d also evaluate(throw-error)
. That would break our program. This defeats the whole purpose ofunless
, as(throw-error)
was supposed to run only whentest
was false.Now, imagine we wrote the macro unless:
If we ran
(unless (= fav-num 41) (throw-error))
, here’s what would happen:The value of
test
would not betrue
, it would actually be the list(= fav-num 41)
. Similarly, we wouldn’t evaluate(throw-error)
.v
would just be the actual list(throw-error)
.The arguments to a macro are not evaluated. When the macro runs, it returns code. The result of
would become
And when we evaluated that, things would work as expected!
(throw-error)
would never evaluate.This is the key difference between functions and macros. Functions take evaluated values as arguments and return evaluated values. Macros receive unevaluated arguments, return code, and that code is evaluated.
An important piece to note is that we eval twice. We give the macro the unevaluated arguments, which returns new code, and we once again evaluate that.
So, how can we support this?
Evaluating Macros
So much explanation, for such little code. Are you ready? Let’s add macros.
First, let’s use this new structure for a macro: a macro is a list that begins with
mac
, and has closure inside:We can detect macro, and mark it as a literal:
Next, we’ll want to update
eval-application
:Before we always ran
(eval-many env args)
. But this time, if it’s a macro, we pass inargs
directly! That’s the ”code” itself 🙂.And now for
eval-macro
:Oh my god. 3 lines!! We do exactly as we said in our diagram:
We take the “closure” out of our macro, and run it with unevaluated args. We can do that just by wrapping each
arg
in a quote:(map (partial list 'quote) args)
.Once we have the resulting code, we evaluate that again, and boom, we have macros.
Your own Macros
Okay, we have a Lisp that supports macros. Let’s go ahead and write some of our own!
defmacro
Let’s get meta off the bat. Notice what we do when we define a macro:
This whole
(mac (clo nil …))
business is a bit unnecessary. Let’s just write macro that does this for us!This generates the code
(def n (mac (clo nil …)))
. Now we could write:Cool!
fn
Okay, remember how we wrote our function for lexical scoping?
Let’s have a macro write this for us:
Here’s what happens if we wrote:
When the macro expansion for
fn
runs, theargs
would would be(y)
andbody
(+ y n)
. Sowould expand to
and that’s exactly what we wrote by hand! Bam, now we can use
fn
instead of this whole'clo scope
business.defn
Now if we wanted to define functions, we could write:
But we can make it tighter. Let’s add a
defn
macro, so we could write:Easy peasy:
let
One more cool macro. Right now, there’s no real way to define temporary variables. Something like:
How could we support it? Well, what if we rewrote
(let …)
to:We could just use the argument list of a function to create these variables! Perfecto. Here’s the macro:
Fin
What a journey. 80 lines, 4000 words, and hopefully a fun time 🙂. You now have a Lisp with macros, and you’ve written some cool ones yourself. Here’s the full source if you’d like to follow along from there.
Credits
The idea to represent macros and closures like this, came from PG’s Bel.
Thanks to Mark Shlick, Daniel Woelfel for reviewing drafts of this essay.
[^1]: Here’s how Norvig did it for his Lis.py