Robert-van-Engelen / tinylisp

Lisp in 99 lines of C and how to write one yourself. Includes 20 Lisp primitives, garbage collection and REPL. Includes tail-call optimized versions for speed and reduced memory use.
BSD 3-Clause "New" or "Revised" License
842 stars 37 forks source link

No macro? #12

Closed mw66 closed 12 months ago

mw66 commented 1 year ago

Although it's mentioned in the readme, this 99 line version of tinylisp does not implemented macro? As the other version does.

Robert-van-Engelen commented 1 year ago

Macros weren't literally part of McCarthy's 1960 paper on LISP, but there are early references to lists with different evaluation modes that resemble macros. Well, LISP can write LISP code anyway, so that's not really a surprise.

IMHO for a basic understanding of Lisp, the implementation of macro support is unnecessary to add to tinylisp.

Sure, they are special and domain-specific languages need them. But otherwise, do we really need them for educational small-scale Lisp things like tinylisp?

The Lisp in 1k Lines of C project does support macros. That could be used as a basis to implement it in tinylisp too and I encourage anyone to give that a try.

mw66 commented 1 year ago

Thanks for the info. Yes I saw macro is in the mini 1k Lisp. Just want to get the readme doc right for this tinylisp.

Another question I want ask is: the NaN boxing trick is OK, but why it's needed here (for education purpose Lisp implementation, this trick I think actually distracted learners from learning Lisp)? Why cannot we just use bit fields or a small struct as tagged union? Also by using NaN boxing, are we actually wasting 13 bits for each double?

Robert-van-Engelen commented 1 year ago

Another question I want ask is: the NaN boxing trick is OK, but why it's needed here (for education purpose Lisp implementation, this trick I think actually distracted learners from learning Lisp)? Why cannot we just use bit fields or a small struct as tagged union? Also by using NaN boxing, are we actually wasting 13 bits for each double?

Lots of modern PL implementations use NaN boxing instead of tagged structs/unions. Nothing gets wasted, because a cell has to be a fixed size anyway, which is the larger size of all possible values it can hold. Adding a tag therefore makes the cell structure larger by a byte, at least, to hold a tag. That also makes addressing less efficient, since cells are no longer guaranteed to be 32 bit aligned (or the tag has to be 32 bits, which wastes more bits).

mw66 commented 1 year ago

I think I understand it now, the benefits of using NaN boxing is that: in a 64-bit double

1) all the doubles are still fully represented 2) use 48 bit as pointer can still address 262,144 GB memory, that is good enough for any computer today 3) only integer's range gets shrunk: from 64-bit to 48-bit (49 with sign bit depends on if we choose to use it), that's good enough for most purpose, and if one really need > 48-bit integer type, then use the fat BigInt (class / struct) then (which will be treated as object pointer type). 4) and if we want we can even directly intern up to 6 chars length string there as scalar value.

(1) and (3) means all the scalar types are stored as scalar, not pointers to some other representations.

That's very compact design indeed.

mw66 commented 1 year ago

Another nitpicking:

currently all the tags literal are defined as 16 bit:

https://github.com/Robert-van-Engelen/tinylisp/blob/b772bed91ad9009bf6806c57cce41c3e2b9ff9cf/src/tinylisp-commented.c#L40

and shifted at runtime when get / set the tag:

https://github.com/Robert-van-Engelen/tinylisp/blob/b772bed91ad9009bf6806c57cce41c3e2b9ff9cf/src/tinylisp-commented.c#L26

https://github.com/Robert-van-Engelen/tinylisp/blob/b772bed91ad9009bf6806c57cce41c3e2b9ff9cf/src/tinylisp-commented.c#L53-L57

If we shift these const tags at defining time as 64-bit (all shifted << 48 at compile time of their current value), this will avoid runtime shifting.

what do you think? or you have a good reason choosing to do it in the current way?

Robert-van-Engelen commented 1 year ago

To extract double NaN tags at runtime you'll need at least to do one of these things:

These operations are equally costly at face value.

To set tags with box(), we can expect the compiler to use pre-shifted tag value when box() is inlined (it will typically be since it is tiny) and when the tag t passed to box() is a constant. Because all box() calls in tinylisp use a constant t, there is no additional cost to use a shift and bitwise OR in box().

Optimizations are best left to a compiler, so we want to leave the code as readable as possiible. That also often helps the compiler to optimize.