basilTeam / basil

Fast and flexible language exploring partial evaluation, context-sensitive parsing, and metaprogramming. Compiles JIT or AOT to native code.
BSD 3-Clause "New" or "Revised" License
124 stars 11 forks source link

Tracing garbage collection. #31

Open elucent opened 3 years ago

elucent commented 3 years ago

The langjam build of Basil ruthlessly leaked memory whenever it needed a string or list. This really shouldn't be the case for a proper release...

This issue proposes a simple tracing garbage collector for Basil. The priorities of this design are fairly straightforward:

To this end, I propose we adopt a two-space copying collector, using stack and heap value maps to locate references. Using stack maps allows us to safely ignore non-GC'd values, and using a copying collector allows us to minimize the amount of heap-traversal needed per collection, as well as allowing us to keep allocations fast.

Concurrent and generational GC are probably reasonable to consider for the future, but considering the relatively small number of roots in a typical Basil program currently (should be essentially totally constrained to the stack, and thus proportional to call depth), I'm not expecting a tremendous cost of GC right off the bat. Something simple and workable is all we need for the 1.0 release.

dumblob commented 2 years ago

Reading your recent Twitter posts about copying garbage collection etc. I think you might be interested in https://github.com/mattreecebentley/plf_hive as it preserves reference validity after deletion/reallocation.

elucent commented 2 years ago

The current issue is less of an implementation issue, and more of a compatibility one - the GC itself is mostly done, but precise GC hinges on being able to efficiently find reference maps for heap objects. The plan is to stick this info in the object header like many other GC'd languages do - specifically a four-byte type ID that indexes into a table containing refmap info. The issue this has uncovered is that it's very tricky to reliably deduplicate these IDs (and thus be able to assert equality and such between them) so long as Basil also supports AOT compilation to object file. It'd have to be up to the system linker to fix up the type ID values, which probably isn't even possible and certainly isn't portable.

I've been scratching my head about this for a few days now, the best options so far have been "abandon AOT compilation" and "write an independent linker from scratch". Neither is something I want to take on right now so Basil will probably have a slight hole in its functionality for at least the immediate future...

dumblob commented 2 years ago

Neither is something I want to take on right now so Basil will probably have a slight hole in its functionality for at least the immediate future...

This is a sane decision. It'd be a big shame to "abandon AOT compilation". OTOH "writing an independent linker from scratch" sounds like the way to go (even if it happens rather later than sooner).

Admittedly I was silently expecting since the introduction of compilation to object files that current linkers will not be flexible enough for Basil. I just wasn't able to pinpoint why :smile:. And I guess there are still some undiscovered dark corners lurking around...

elucent commented 2 years ago

Oh, I think there'll end up being a reasonable solution. I just haven't figured it out yet. Perhaps more likely than either of those two options would be some kind of runtime barrier between Basil and C code (disappointing since they should theoretically have the same ABI, but necessary to ensure proper management of the Basil runtime like initialization).

dumblob commented 2 years ago

some kind of runtime barrier between Basil and C code

A runtime linking finalization wrapper?

dumblob commented 2 years ago

Another question :wink:.

Could one somehow assert to the compiler that "here I do not want any allocation nor free to happen"?

If the compiler finds out this requirement is not satisfiable, it'd disallow compilation of the app.