Robert-van-Engelen / tinylisp

Lisp in 99 lines of C and how to write one yourself. Includes 20 Lisp primitives, garbage collection and REPL. Includes tail-call optimized versions for speed and reduced memory use.
BSD 3-Clause "New" or "Revised" License
836 stars 34 forks source link

Tail call optimization does not seem to work? #15

Open tomaskala opened 8 months ago

tomaskala commented 8 months ago

Hello, I've been experimenting with the tail call-optimized version of tinylisp (tinylisp-opt.c), but it doesn't seem to optimize tail calls. Specifically, I've been testing it with a tail call-optimized factorial computation:

(define factorial1
  (lambda (n acc)
    (if (< n 2) acc
      (factorial1 (- n 1) (* n acc)))))

(define factorial
  (lambda (n)
    (factorial1 n 1)))

The following shows a session of me compiling tinylisp-opt.c and running the above factorial function with the value of 1000:

$ gcc -o tinylisp tinylisp-opt.c
$ ./tinylisp
tinylisp
930>(define factorial1
(lambda (n acc)
(if (< n 2) acc
(factorial1 (- n 1) (* n acc)))))
factorial1
871>(define factorial
(lambda (n)
(factorial1 n 1)))
factorial
842>(factorial 1000)
Aborted (core dumped)

I'd have expected the computation to finish successfully and return inf, since the result exceeds the valid double range.

When I compare it with your full LISP implementation which is also tail call-optimized, I'm getting the correct results:

$ gcc -o lisp lisp.c
$ ./lisp
lisp
8020+2000>(define factorial1
?(lambda (n acc)
?(if (< n 2) acc
?(factorial1 (- n 1) (* n acc)))))
factorial1
7976+1996>(define factorial
?(lambda (n)
?(factorial1 n 1)))
factorial
7960+1994>(factorial 1000)
inf

I've even tried to increase the available memory of tinylisp by setting N to larger values, but there is always a value that makes the code run out of memory. Compared with the full LISP implementation which happily calculates (factorial 100000) (though again producing inf, of course), this makes me think that there is something missing from tinylisp-opt.c to make it fully tail call-optimized.

Robert-van-Engelen commented 8 months ago

I think you're right and it is because I never bothered to update tinylisp after writing the other lisp (lisp in 1K lines of C). Those small lisp interpreters do perform full TC optimization:

6262+1924>(factorial 1000000)
inf
6262+1924>

Tinylisp does a bit of TC, but not full TC. What is missing is a TC classification of lisp primitives to TC-optimize them (let, cond, if, begin etc). So the if will not properly tail-call in your factorial1. Note that the other lisp I wrote include TAILCALL designations for several primitives, defined in the function primitives prim[] table. Those return an unevaluated lisp expression that the eval loop then continues to operate on (to avoid a recursive eval call).

We can (should) do the same for tinylisp.

tomaskala commented 8 months ago

The TC classification seems to be in place, though: There is a short t flag for each primitive with 1 denoting tail calls in the prim array, and a conditional to keep evaluating such primitives in the evaluation loop.

Robert-van-Engelen commented 8 months ago

Yes, but it doesn't work the same way as the eval loop in the "big brother" lisp. Perhaps it's not the if that is the culprit here, but the bigger problem is the evaluation of the function arguments that require a binding list. This binding list is not released from memory (tinylisp has a GC when it returns to the prompt). That is the major difference and results in abort when the interpreter runs out of memory. A suggestion would be to use a simple ABC GC, but an ABC GC "as is" is incorrect as you can quite easily see, because it removes the evaluated argument list when this evaluated list may actually need to be (partly) passed back out of a function when it is consumed as a list after a dot in the argument list ("dotted arguments"). For example, lambdas like this can't work with this GC strategy:

(define curry
          (lambda (f x)
              (lambda args
                  (f x . args))))

because args is immediately ABC GCed. Either we can do something simple as ABC or we can't allow dotted arguments. Dotted arguments are quite powerful though.

However, there may be a way to get around this and use a corrected ABC, I'm thinking.

Robert-van-Engelen commented 8 months ago

I've added a clarification of TC optimization limitations to the PDF page 35 related to argument evaluations that will accumulate in memory.

I'm thinking that there should be a way fix "ABC" garbage collection to make it work. But it is not just sufficient for data structures to be acyclic as the SectorLisp author presents it. Acyclic forms like (lambda (x . args) args) (essentially cdr) do not work with "ABC", because the evaluated args is removed from the stack when the closure returns (or tail-calls).

tomaskala commented 8 months ago

Thank you for the clarification! I was actually following along your PDF with my own implementation, and was scratching my head why tail calls do not get optimized. I had made a couple of customizations, but hadn't yet gotten to implementing a full garbage collection, so I wasn't sure if I made a mistake in the evaluator implementation or not. Good to know the underlying reason, thanks a lot!