Open cramertj opened 7 years ago
The main issue here is not so much going to be the size of memory allocated by the slab, but more so that "live" entries in the slab could be scattered across the memory, keeping the pages alive.
Generally, inactive memory pages will be reclaimed by the OS as needed. Is that a fair assumption?
@carllerche That seems fair to me, although I'd add that the OS reclamation would be more heavyweight and less precise than a slab or allocator free.
If we could find a way to keep the active elements isolated to a particular page of memory, that would help a lot (both for memory usage and cache performance).
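The fragmentation concern above can be made concrete with a quick sketch (purely illustrative, not tokio code, with a made-up page capacity): if live entries are spread evenly across the slab, even a small number of them can keep every underlying page resident.

```rust
// Illustrative sketch: model the slab's backing memory as fixed-size "pages"
// and show how a few scattered live entries pin every page.
// ENTRIES_PER_PAGE is a hypothetical number, not a real tokio constant.
fn main() {
    const ENTRIES_PER_PAGE: usize = 512;
    const TOTAL: usize = 100_000;

    // Simulate inserting TOTAL entries, then freeing all but one per page
    // (the worst-case scattering for memory reuse).
    let live: Vec<usize> = (0..TOTAL).step_by(ENTRIES_PER_PAGE).collect();

    // Count distinct pages that still hold at least one live entry.
    let mut pages: Vec<usize> = live.iter().map(|i| i / ENTRIES_PER_PAGE).collect();
    pages.dedup();

    let total_pages = (TOTAL + ENTRIES_PER_PAGE - 1) / ENTRIES_PER_PAGE;
    println!(
        "{} live entries pin {} of {} pages",
        live.len(),
        pages.len(),
        total_pages
    );
}
```

Under these assumptions, 196 live entries out of 100,000 are enough to keep all 196 pages alive, which is why keeping active elements clustered matters.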
Thanks for the report @cramertj!
It was my hope that we wouldn't have to worry about this for quite some time... It would be relatively straightforward today to add some `shrink_to_fit` calls, but as @carllerche mentions, the slab is proportionally sized by the largest active token, which is itself on the order of the maximum number of concurrent events. So even if we were to shrink the slab itself, there'd be no guarantee that we could actually reclaim memory.
This will also get significantly more difficult with tokio-rs/tokio-rfcs#3 and the proposed implementation, which has some atomic management of the internal slab and would make reclaiming the memory much more difficult.
I've been thinking a bunch about this, I think that you could divvy up the memory into regions (where each region is a slab) and free a region when it is unused.
This would mean that there would be a free list per region, and regions would be prioritized so that lower-priority regions eventually become unused.
This would work in the atomic case mentioned above and could even provide improved concurrent behavior as regions could be assigned to threads to lower contention.
@carllerche Yes, I'd been thinking about a similar strategy of generation-based allocation, sort of a slab-of-slabs that would allow you to collect an entire slab once it was emptied.
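A minimal sketch of this slab-of-slabs idea (all names hypothetical, not the tokio implementation): entries live in fixed-size regions, allocation prefers the lowest-index region so higher ones drain over time, and a region whose entries have all been removed is dropped so its memory can actually be returned.

```rust
// REGION_SIZE is tiny here purely for demonstration.
const REGION_SIZE: usize = 4;

struct Region<T> {
    slots: Vec<Option<T>>,
    free: Vec<usize>, // per-region free list
    live: usize,
}

impl<T> Region<T> {
    fn new() -> Self {
        Region {
            slots: (0..REGION_SIZE).map(|_| None).collect(),
            free: (0..REGION_SIZE).rev().collect(),
            live: 0,
        }
    }
}

struct RegionSlab<T> {
    regions: Vec<Option<Region<T>>>, // None = region currently freed
}

impl<T> RegionSlab<T> {
    fn new() -> Self {
        RegionSlab { regions: Vec::new() }
    }

    /// Insert, preferring the lowest-index region with a free slot so that
    /// high-index regions drain over time and can eventually be freed.
    fn insert(&mut self, value: T) -> usize {
        for (r, region) in self.regions.iter_mut().enumerate() {
            if let Some(region) = region {
                if let Some(slot) = region.free.pop() {
                    region.slots[slot] = Some(value);
                    region.live += 1;
                    return r * REGION_SIZE + slot;
                }
            }
        }
        // No free slot anywhere: reuse a vacated region index, or grow.
        let r = match self.regions.iter().position(|r| r.is_none()) {
            Some(r) => r,
            None => {
                self.regions.push(None);
                self.regions.len() - 1
            }
        };
        let mut region = Region::new();
        let slot = region.free.pop().unwrap();
        region.slots[slot] = Some(value);
        region.live = 1;
        self.regions[r] = Some(region);
        r * REGION_SIZE + slot
    }

    /// Remove an entry; if its region becomes empty, free the whole region.
    fn remove(&mut self, key: usize) -> Option<T> {
        let (r, slot) = (key / REGION_SIZE, key % REGION_SIZE);
        let region = self.regions.get_mut(r)?.as_mut()?;
        let value = region.slots[slot].take()?;
        region.free.push(slot);
        region.live -= 1;
        if region.live == 0 {
            self.regions[r] = None; // drop the region, releasing its memory
        }
        Some(value)
    }

    fn allocated_regions(&self) -> usize {
        self.regions.iter().filter(|r| r.is_some()).count()
    }
}

fn main() {
    let mut slab = RegionSlab::new();
    let keys: Vec<usize> = (0..8).map(|i| slab.insert(i)).collect();
    assert_eq!(slab.allocated_regions(), 2);
    // Drain the second region; it is reclaimed the moment it empties.
    for &k in &keys[4..] {
        slab.remove(k);
    }
    assert_eq!(slab.allocated_regions(), 1);
    println!("regions after drain: {}", slab.allocated_regions());
}
```

The atomic variant discussed above would need lock-free free lists per region, but the reclamation condition (region live count reaching zero) is the same.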
Currently, tokio-core uses the `slab` crate to store timer heaps and dispatch maps for IO events and tasks. However, the slab crate currently offers no way to reclaim memory. This means that the size of a `reactor::Inner` is determined by the maximum number of IO events, tasks, or timers that have existed at any one time. Once that maximum has been raised, the reactor can never return to its original size.

This is a problem when using tokio-core to build long-running, infrequently-called system services which must scale to high loads and then drop back down to low resource usage when inactive. I don't personally have memory usage benchmarks available, but for a quick back-of-the-napkin estimate:

If a user has 100,000 IO events + tasks open at one time, then it's holding memory for 100,000 `ScheduledIo`/`ScheduledTask`s. A `ScheduledIo` is a pointer and two `Option<Task>`s, and `size_of::<Option<Task>>` is 80 bytes. 100,000 × ((2 × 80 bytes) + 64 bits) = 16.8 megabytes.

I'm not sure how relevant this is to web servers and other cloud applications (which are often running on the JVM, which hogs far more than 17 megabytes). However, I'm concerned about the impact of this memory usage on local high-demand services. Has there been any discussion of ways to reclaim memory while using `slab`, or of another data structure that could be used? The obvious solution is a `HashMap`, but I'm sure that would result in lost performance.
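The estimate above can be checked mechanically (sizes are taken from the text, not measured; `ScheduledIo` is modeled as one 64-bit pointer plus two 80-byte `Option<Task>` values):

```rust
// Back-of-the-napkin check of the memory estimate from the issue text.
fn main() {
    let entries: u64 = 100_000;
    let option_task = 80; // size_of::<Option<Task>> per the text, in bytes
    let pointer = 8;      // 64 bits
    let per_entry = 2 * option_task + pointer; // 168 bytes per ScheduledIo
    let total = entries * per_entry;
    println!("{:.1} megabytes", total as f64 / 1_000_000.0); // 16.8 megabytes
}
```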