hawkw / sharded-slab

a lock-free concurrent slab (experimental)
MIT License

Add pooling for heap-allocated objects #2

Closed hawkw closed 4 years ago

hawkw commented 4 years ago

some notes on object pooling

requirements

the main use-case for adding pooling is to improve tracing-subscriber's Registry performance by reusing the hashmaps that are used to store user-defined extensions. since sharded_slab is already used for storing per-span data, the ideal solution would be to store those hashmaps in the slab slot for each span, rather than introducing an additional crate for object pooling.

goals

non-goals

hawkw commented 4 years ago

implementation notes

a back-of-the-napkin description of how i expect this will work.

first, we add a trait for data that can be pooled. i proposed this playground in https://github.com/hawkw/sharded-slab/issues/2#issuecomment-592143267, and i imagine something close to that should work — feel free to tweak it as necessary.
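
roughly, the shape i have in mind is something like this (just a sketch, assuming we keep the trait from that playground more or less as-is):

pub trait Clear {
    /// Clear the value in place, keeping any heap allocations around so
    /// they can be reused the next time the slot is used.
    fn clear(&mut self);
}

// e.g. a Vec can be "cleared" without giving up its capacity:
impl<T> Clear for Vec<T> {
    fn clear(&mut self) {
        Vec::clear(self)
    }
}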

each slot in a slab page is represented by the Slot struct:

https://github.com/hawkw/sharded-slab/blob/12306dc6c56354e6cb3eea449ab6832d5ed53f8b/src/page/slot.rs#L9-L16

we will want to add an additional type parameter to Slot for a pooled object. the pooled data should live in a CausalCell so loom can enforce that it is not concurrently mutated; we could stick it in the item causal cell, or add a new one. i imagine something like this:

pub(crate) struct Slot<T, C, P = ()> {
    lifecycle: AtomicUsize,
    /// The offset of the next item on the free list.
    next: CausalCell<usize>,
    /// The data stored in the slot.
    item: CausalCell<(Option<T>, P)>,
    _cfg: PhantomData<fn(C)>,
}

when a slot is created, we'll create a new pooled object as well. in theory we could allow customizing the function that constructs the new pooled object, but that's another thing to thread through the whole API, so i think it's okay to just require the pooled objects to implement Default.
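
as a rough sketch (the constructor here is made up; the point is just that the pooled object is built once, up front, via Default):

impl<T, C: cfg::Config, P: Default> Slot<T, C, P> {
    pub(crate) fn new(next: usize) -> Self {
        Self {
            lifecycle: AtomicUsize::new(0),
            next: CausalCell::new(next),
            // the pooled object is constructed via `Default` when the slot
            // is created, and lives for as long as the slot does
            item: CausalCell::new((None, P::default())),
            _cfg: PhantomData,
        }
    }
}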

when we "remove" a slot, we would want to call the pooled object's Clear method, as well as emptying the Option; this would happen around here: https://github.com/hawkw/sharded-slab/blob/12306dc6c56354e6cb3eea449ab6832d5ed53f8b/src/page/slot.rs#L319

then, we need to thread access to the pooled data through Page, Shard, up to the public API. this is probably not hard, just annoying. then, we write docs for the APIs, and add tests, and we're done!

hawkw commented 4 years ago

in re: my previous comment, it just occurs to me that there is a much simpler solution to providing both the Slab API as it exists currently and adding a pooling API. rather than having a concept of "per slot pooled data" and "non-pooled slot data", we could just implement the Clear trait for Option<T>, like this:

impl<T> Clear for Option<T> {
    fn clear(&mut self) {
        let _ = self.take();
    }
}

then, we can more or less leave the existing internal implementations the same, if we add T: Clear bounds on Shard, Page, and Slot, and we can implement separate Slab<T> and Pool<T> API types like:

pub struct Slab<T, C: cfg::Config = DefaultConfig> {
    shards: Box<[Shard<Option<T>, C>]>,
    _cfg: PhantomData<C>,
}

// ...

pub struct Pool<T: Clear + Default, C: cfg::Config = DefaultConfig> {
    shards: Box<[Shard<T, C>]>,
    _cfg: PhantomData<C>,
}

for the most part, they could expose similar APIs, but I think the Slab::take method https://github.com/hawkw/sharded-slab/blob/12306dc6c56354e6cb3eea449ab6832d5ed53f8b/src/lib.rs#L410-L420 only makes sense for a Slab; the Pool can't really implement this if we want to be able to guarantee that we'll retain allocations in place. The internal Shard/Page/Slot types can expose their take methods in separate impl blocks where the T type parameter is Option<T>.
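
e.g., roughly (a sketch only — this assumes the slot stores its value in a CausalCell, and the real take also has to deal with the lifecycle state):

// only the `Option<T>` instantiation gets a way to move the value out; the
// pooled case never does this, so its allocations stay in place.
impl<T, C: cfg::Config> Slot<Option<T>, C> {
    pub(crate) fn take_value(&self) -> Option<T> {
        // (the real version would also update the slot's lifecycle state)
        self.item.with_mut(|item| unsafe { (*item).take() })
    }
}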

hawkw commented 4 years ago

@bIgBV since you were interested in working on this, i've left some comments explaining what i have in mind — hopefully, that's enough to get you started! please let me know if you have any more questions!

bIgBV commented 4 years ago

Okay, from reading your notes, I gather that the object pool is separate from the slab and works in conjunction with the slab. It's primarily for heap allocated objects and to help ensure that allocations for the heap allocated objects are reused. This means that when you want to instantiate a heap allocated object, you need to do so through the slab, so that it can keep track of the allocations, correct?

The question now becomes how this works with the existing APIs. Slab::insert currently takes ownership of T and Slab::take returns ownership back, so the caller is responsible for managing the heap-allocated object.

From what I can see, object pool APIs have you check out an object, mutate and use it, and then check it back in. But this use case is quite different from how the slab works, where we insert a value to be shared across multiple threads. I think we could provide an Entry-like API where the caller first checks out an entry and modifies it in place. The entry's memory is then managed by the pool, and it's available across threads since it's backed by the Slab.
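
Something like this is the shape I'm picturing (entirely hypothetical names, just to illustrate the check-out/check-in flow):

// hypothetical Entry-style API: check out an entry, mutate it in place,
// then let other threads see it through the pool.
let pool: Pool<Vec<u8>> = Pool::new();

let key = {
    let mut entry = pool.checkout(); // hypothetical checkout method
    entry.extend_from_slice(b"per-span data");
    entry.key() // checking back in yields a key usable from other threads
};

// any thread can now read the entry through the pool by its key
assert_eq!(&*pool.get(key).unwrap(), b"per-span data");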

EDIT: I wrote this up before your additional comments outlining a design and implementation. It looks like the design answers the questions I had.

hawkw commented 4 years ago

Okay, from reading your notes, I gather that the object pool is separate from the slab and works in conjunction with the slab. It's primarily for heap allocated objects and to help ensure that allocations for the heap allocated objects are reused. This means that when you want to instantiate a heap allocated object, you need to do so through the slab, so that it can keep track of the allocations, correct?

yup, that's right. i was initially wondering about providing both a storage type that's moved in and out and a pooled allocation in the same Slab type, but i think that's too much effort — users can always implement it with something like this:

struct MyPooledThing {
    heap_data: Vec<u8>, // just using this as an example of a pooled heap allocation...
    small_struct: Option<SomeSmallStruct>,
}

impl Clear for MyPooledThing {
    fn clear(&mut self) {
        self.heap_data.clear();
        self.small_struct = None;
    }
}

The question now becomes how this works with the existing APIs. Slab::insert currently takes ownership of T and Slab::take returns ownership back, so the caller is responsible for managing the heap-allocated object.

yeah, that's right — as i mentioned in https://github.com/hawkw/sharded-slab/issues/2#issuecomment-592150097, i don't think take makes sense for a pool. it could be nice to have an insert API where you get mutable access to the slot before it's made available for other threads to reference, but if we want to punt on that, i think it would also be fine to have a create method or something that just returns an index, and doesn't move anything in; you would then lock the pooled data to modify it.

a way to access the slot mutably would be nicer, though; the issue is just that we would need to make sure any mutation is done before it's made visible to other threads through Pool::get...we could probably take a FnOnce(&mut T) and run it before giving back the slot's index, or have another RAII guard type...
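
e.g., either of those two shapes could look roughly like this (all the signatures here are made up, just to illustrate the options):

impl<T: Clear + Default, C: cfg::Config> Pool<T, C> {
    /// option 1: just reserve a cleared slot and return its index; the
    /// caller then locks the pooled data afterwards to mutate it.
    pub fn create(&self) -> Option<usize> {
        todo!("sketch only")
    }

    /// option 2: run a closure against the pooled object *before* returning
    /// the index, so all mutation happens before another thread can `get` it.
    pub fn create_with(&self, f: impl FnOnce(&mut T)) -> Option<usize> {
        let _ = f;
        todo!("sketch only")
    }
}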

bIgBV commented 4 years ago

a way to access the slot mutably would be nicer, though; the issue is just that we would need to make sure any mutation is done before it's made visible to other threads through Pool::get...we could probably take a FnOnce(&mut T) and run it before giving back the slot's index, or have another RAII guard type...

I think taking a FnOnce(&mut T) and running it in the block where a slot's contents are being re-written should theoretically work. But that would be additional work on top of the basic pooling APIs.

bIgBV commented 4 years ago

I've been thinking about a path forward for this. One idea is to add a new Slot::get_used_block method which returns a previously allocated and used slot without writing to it: it just sets up the state required for a slot that has references to it, and returns the slot.

We could possibly extract the following snippet into its own private method.

https://github.com/hawkw/sharded-slab/blob/12306dc6c56354e6cb3eea449ab6832d5ed53f8b/src/page/slot.rs#L162-L164

This method would only be exposed when T: Clear + Default, so only the Pool type will have access to it.

We can then expose a new Shard::get_allocated method which calls Slot::get_used_block if the current page is allocated; if it isn't, we return None, and the calling function will insert a new T::default(), following the Pool::create code path.

One concern is that we might have to change Shard::allocate to allow allocating with an empty T in the slot, instead of just an empty slot.
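
Roughly, the flow I have in mind for Pool::create looks like this (a sketch; get_allocated is the hypothetical method described above, and Shard::insert is an assumed internal method for the existing allocate-and-write path):

// hypothetical sketch of the two paths inside Pool::create:
fn create<T, C>(shard: &Shard<T, C>) -> Option<usize>
where
    T: Clear + Default,
    C: cfg::Config,
{
    // 1. reuse a previously allocated slot if one exists: its T is already
    //    in place and has already been cleared, so we only set up the
    //    slot's state and hand back the index.
    if let Some(idx) = shard.get_allocated() {
        return Some(idx);
    }
    // 2. otherwise allocate a fresh slot holding T::default(), following
    //    the existing insert/allocate code path.
    shard.insert(T::default())
}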