Start thinking about memory models

hawkw commented 9 years ago

The SECD machine itself does not describe any kind of memory model or methods of memory management; it is essentially a CPU architecture. However, some SECD implementations (i.e. EarlGray/SECD) provide an external memory model.

There are a number of reasons why I should consider this as well. Obviously, handling all variables and storage on the SECD machine itself (essentially, creating an electronically impossible machine with registers only) would have some negative performance implications, and while using the $e stack for let bindings works fine, it's hard to handle stuff like globals (set!). Additionally, the R⁶RS describes vector types that (obviously) would be hard to implement without some kind of non-list way of interacting with memory. We can also then replace with the (obviously rather un-performant) linked-list based SECD String type with something saner and more R⁶RS-conformant.

Furthermore, adding external memory makes IO easier, since it allows for memory-mapped IO, which is probably the easiest to implement (although a port type is more standards compliant and would be good to have eventually). Finally, a memory model with some kind of pointer type (which yes, I know standard Scheme doesn't have) would make C FFI (#19) much less painful; if we have a way of dealing with pointers to our 'own' memory, we can use the same primitives to handle pointers to 'real' memory such as those that might be returned by a C function.

There are some challenges here, of course. Memory management is a Big and Hard Problem In Computer Science, and honestly not one which I have a great deal of knowledge and experience with. Scheme is a GC language, so in order to be fully standards compliant, I'd have to implement some form of GC. However, I'm a big fan of alternative memory management methods, such as region-based memory management and Rust's ownership/lifetime system, which can enforce memory safety with less overhead than GC. It would be wickedly cool to write a Lisp implementation that uses some of these concepts, and this is definitely something I'd be willing to break away from the standards for. I honestly don't know if I have enough experience to pull off something like that, though.

hawkw commented 9 years ago

Obviously, none of this stuff will be in the current release cycle; I'm currently still focused on shaking the bugs out of the VM and on compiling as much of Scheme as one can compile without a memory management method.

However, it's deeply important to begin to think about this sort of thing as early as possible, because I want to ensure that the decisions I make now don't hurt later on. This is especially a problem when implementing libraries and systems rather than applications; if people actually end up using Seax, I don't want to make changes that break other people's stuff except when it's absolutely necessary. It's also good to start thinking about this so that I can start doing research and hopefully have some more fully-formed ideas when it comes time to actually tackle these issues.

hawkw commented 9 years ago

We should also note briefly that if by any chance @EarlGray is reading this, his SECD/Scheme implementation is significantly cooler than mine and that I hope I can eventually bash this thing into that level of completeness. :-)

EarlGray commented 9 years ago

Thanks for the kind words!

The term "memory model" is a bit vague for me. My implementation does not expose much details of memory management into Scheme, although it has some introspection capabilities. It does not have usual set! at all (only a wrapper over "mutable" vectors, box-set!), though secd-bind! mutates the global environment frame. My "instruction set" is also inconsistent: INT, SYMBOL and CONS are built-in in the machine, but STR, VECT, ... are handled through native functions. Reference counting proved itself fragile and produces leaks I still has not eliminated (just do (secd 'dump) and examine the memory dump, it will show a few non-free cells with nref=0 ). Vectors were scary for some time, because it was not clear how to track references to their cells, but then I invented the scheme with copying on each vector-ref or vector-set! that keeps cells owned only by the vector itself; it turned out much easier (from memory safety point) than I thought.

I gave some thought to Scheme-level memory management just before the end of active development of SECD, but did not have much insight. I'd like to implement something like PreScheme, but that requires good understanding of type inference I don't have at the moment. Another interesting project related to Scheme typing is https://github.com/samrushing/irken-compiler/ (but, with all my respect to the author and how far his effort has gone, I don't get shoehorning Scheme into C, it better be a proper dynamic language).

The most straightforward way to go is to treat memory as a huge bytevector and expose some form of type casting in a limited scope (module, closure?) that manages memory. Basically, it's C programming in Lisp syntax, pointers do not play well with traditional Scheme. Something like https://github.com/rswier/c4 is easy to implement and JIT-compile.

As for the alternative memory management methods, this may be interesting for you: http://home.pipeline.com/~hbaker1/LinearLisp.html (it's rather crazy to create a copy of value each time it's referenced, I know).

Kudos for using Rust, it's a great language.

hawkw commented 9 years ago

Wow, thanks for the advice! It's great of you to summarize what you've done; I was planning on reading through your source and blog posts and trying to parse out some of the basics of your implementation as part of my research process, but I hadn't really had the time yet.

I have a ways to go before I start really working on memory management all that much, this issue is just kind of hanging here to remind me to think about it. I'll definitely push all the pointers you sent to my reading list; Linear Lisp looks particularly nifty and I will have to pour over that in great detail when I get the chance.

Thanks again!

EarlGray commented 9 years ago

If you need explanations reading my code, I'll be glad to help, just ping me in this Gitter chat : https://gitter.im/EarlGray/SECD

hawkw commented 9 years ago

Thanks! I'll be sure to ask if I happen across anything I don't understand.

hawkw / seax

Start thinking about memory models #76