sin-ack / zigself

An implementation of the Self programming language in Zig
GNU General Public License v3.0
161 stars 5 forks source link

Handle area never stops growing #1

Closed sin-ack closed 2 years ago

sin-ack commented 2 years ago

Problem

Currently, we use a "handle area" to hold handles to tracked values. This handle area is an arena allocator that dispenses handles. This works perfectly fine for tracking, as the handles never move in memory and allocating them is pretty fast (much faster than individual allocations); however, it comes with one significant downside: because we use an arena allocator, we can never reclaim memory where handles no longer exist. Therefore, the handle area keeps indefinitely growing as Self programs run. This is a problem that will make longer-running programs infeasible.

Potential solutions

  1. Implement a custom arena allocator which allows us to check each memory chunk for containing live handles, and removing memory chunks which don't contain any. This would be an O(nk) operation where n = memory chunk count and k = handle count.
  2. Implement a "handle mark". This is a concept from the Hotspot VM. It would allow us to set up "checkpoints" where we restore the previous state of the handle area after exiting a scope (via defer), allowing us to remove chunks which we are sure no longer contain any live handles. This approach has its own drawbacks; if we ever store a tracked value somewhere while holding a handle mark, we will end up pointing to invalid memory. Therefore it cannot be used wholesale in the interpreter.
topolarity commented 2 years ago

Just to add a couple more thoughts from the discussion:

sin-ack commented 2 years ago

It might be worth compacting the arena manually by copying the full set of live handles to a brand new arena.

That makes sense, but the problem is that we also hold references to the handles from the Heap.Tracked objects, so live handles must never move. We will always have at least a couple handles somewhere so this isn't really feasible.

You could invert the control in scheme (1), so that the custom arena allocator keeps track of the number of bytes allocated in each memory chunk. If you use self-aligned, power-of-two size chunks, pointer arithmetic can shave off the offset bits within the chunk to quickly get to the allocated byte counter

I don't quite understand what you mean by this. Can you elaborate?

topolarity commented 2 years ago

I don't quite understand what you mean by this. Can you elaborate?

Absolutely:

If chunks are self-aligned, power-of-two sizes then they are always allocated at addresses that look like 0xXXXXX000, with some fixed number of zero bits at the end. Given the address of an allocation, free() can zero those "offset" bits to get the address of the chunk containing the allocation.

Now that you know the address of the chunk, you can update metadata associated with it. In this case, I was imagining that you'd have a "bytes_allocated" counter per-chunk. free() would prune the chunk when this counter hits zero. You'd still be vulnerable to fragmentation, but I don't think that can be avoided unless things can be compacted/moved in memory.

sin-ack commented 2 years ago

If chunks are self-aligned, power-of-two sizes then they are always allocated at addresses that look like 0xXXXXX000, with some fixed number of zero bits at the end. Given the address of an allocation, free() can zero those "offset" bits to get the address of the chunk containing the allocation.

That sounds pretty cool! Thanks for proposing this. It will probably be what I end up implementing. By self aligned, you mean chunk size == chunk alignment, right?

topolarity commented 2 years ago

Yep, exactly :+1:

I've only experimented lightly with highly-aligned data structures like that before, so I'm eager to hear if you hit any surprises. Good luck! :-)