tsnl / snail-scheme

My Scheme interpreter and compiler.
1 stars 0 forks source link

Garbage collection #4

Open tsnl opened 2 years ago

tsnl commented 2 years ago

All datatypes must offer some way to declare or mark their contents so that a single user-invoked function performs mark and sweep GC.

The user can run this GC function periodically. However, the sole onus of running the GC rests on the user: if an allocation fails, then the runtime crashes. This is not as scary as it seems: the user can garbage-collect extremely conservatively and then relax these constraints if performance is a concern. This encourages awareness of a program's allocation patterns.

This is also necessary because the user may interleave finalization code with a GC invocation, since C++ objects and Scheme objects may transparently intermingle.

tsnl commented 2 years ago
The first step of implementing a GC, redirecting all allocations to an explicit heap, would be a great optimization strategy. We must support multiple heaps: one for loading code objects that persists until the VM is destroyed, a second for general-purpose allocations (including the stack), and possibly more.
tsnl commented 2 years ago

I observed a weird GC bug today, will need to implement tests to investigate more thoroughly. Increasing the size of VCode by adding an std::vector<bool> field resulted in an allocation panic for a particular size-class (20 iirc). Cf bf0e4eaaf6988a52c06784ff8df23138350f2023

tsnl commented 2 years ago

Must implement virtual 'mark' methods for BaseBoxedObject subclasses. Sweep logic is already written. Main decision leading to procrastination is when to trigger collection: if on allocation, then very risky since Scheme allocates frequently. Would like to implement multiple threads first so that (1) TCMalloc rip-off can move pages of memory between threads, and (2) can collect threads one-at-a-time eagerly, ensuring (n-1)/n workers are always servicing jobs (hindered by main-thread restrictions).

Profiling results from demo3.scm, computing (fibonacci 30): operator new invoked by cons when constructing the argument rib r accounts for 46.07% of all execution time.
These should be redirected to a faster, pre-allocated heap that the GC can own.