mun-lang / mun

Source code for the Mun language and runtime.
https://mun-lang.org
Other
1.83k stars 73 forks source link

Implement slab allocator to store `ObjectInfo<T>` #106

Open baszalmstra opened 4 years ago

baszalmstra commented 4 years ago

The MarkSweep struct implements GcRuntime to facilitate garbage collection for Mun. Currently, the implementation is quite inefficient. One of the things that could be improved is to implement a slab allocator (or object pool) instead of randomly allocating memory through a Pin<Box<ObjectType<T>>>. The general idea of the allocator is to allocate large chunks of memory to store instances of ObjectType<T> in and reusing them when they are deallocated. We can use a slab allocator here because, unlike the memory the ObjectType<T> is pointing to, the size of an ObjectType<T> is always the same.

Currently, Mun is single-threaded, however, this will not be the case forever. Therefore it is important to take thread-safety into account when designing the allocator. But probably anything is better than the current implementation.

Good first issue process:

If this is your first PR, welcome 🎉 😄

foeb commented 4 years ago

I'll work on this one :slightly_smiling_face:

Wodann commented 3 years ago

Hi @foeb. There hasn't been any progress on this issue for a while. Do you think you'll be able to finish your work on it?

FreeMasen commented 3 years ago

I am wondering if there would be a way to provide a little more guidance in how to solve this issue. It appears that the changes would be required here, is that correct?

Would the expectation be that a slab allocator be implemented from scratch within the project or would using something like the slab crate be acceptable?

Currently it appears that the project is using the GcPtr struct as an Index into the MarkSweep::objects property, with a slab is the expectation that the same would be true? If so, is there some context on how the current GcPtrs are generated and/or how to avoid stale GcPtrs pointing to the wrong object? If not, what is the expectation for how the MarkSweep would get from a GcPtr to an ObjectInfo<T>?

important to take thread-safety into account

In regards to this statement, it appears that the MarkSweep::objects property is already utilizing an RwLock, does this mean that simply wrapping that in an Arc would be enough?

baszalmstra commented 3 years ago

In Mun, memory on the heap uses a two-step approach: the code holds a pointer to an ObjectInfo, and the ObjectInfo contains a pointer to the allocated data. This way we can easily hot load data because we can change the pointer inside the ObjectInfo. A GcPtr is just a pointer to this ObjectInfo, which should be stable when hot reloading.

We can make use of a slab allocator to allocate the ObjectInfos. This hopefully reduces the fragmentation of memory.

You can use any crate really. We try to keep the number of dependencies low but we're happy to use already existing implementations here.

The objects field is only really necessary because we need to be able to iterate over all allocated ObjectInfos. Using a slab allocator we might be able to get rid of that completely and use the slab allocator for that.

With regards to thread-safety, it would be really nice if we could allocate and deallocate from the slab allocator at the same time. That will make a future multithreaded garbage collector much easier.