moka-rs / moka

A high performance concurrent caching library for Rust
Apache License 2.0
1.62k stars 73 forks source link

Try `seize` as a replacement of `crossbeam-epoch`? #385

Open tatsuya6502 opened 9 months ago

tatsuya6502 commented 9 months ago

Try seize crate as a replacement of crossbeam-epoch crate.

moka cache uses crossbeam-epoch in its lock-free concurrent hash table called cht. crossbeam-epoch is an epoch-based garbage collector and helps to implement high-performance concurrent data structures like cht. However, we have been struggling for years with the spec of crossbeam-epoch, which does not guarantee to run the destructor of the pointed objects.

https://docs.rs/crossbeam-epoch/0.9.18/crossbeam_epoch/struct.Guard.html#method.defer_destroy

There is no guarantee when exactly the destructor will be executed. The only guarantee is that it won’t be executed until all currently pinned threads get unpinned. In theory, the destructor might never run, but the epoch-based garbage collection will make an effort to execute it reasonably soon.

Many of moka cache users expect the cached entries to be dropped as soon as they are evicted from the cache, of course. However, this crossbeam-epoch spec causes these evicted entries to be dropped with some delays or never dropped.

We have added mitigations for it (notes) and they mostly work except performance degradation. However, there are still some corner cases which seems difficult to solve/mitigate. It might be the time to try alternatives like seize. seize is based on Hyaline reclamation scheme and used in a concurrent hash table called flurry.

tatsuya6502 commented 9 months ago

Unlike crossbeam-epoch, seize provides the followings:

  1. Configurable batch_size: Collector::batch_size
    • This controls the tradeoff between throughput and memory efficiency.
    • Maybe moka cache should allow user to change this value when creating an instance of cache.
  2. Protection against stalled threads: Collector::epoch_frequency
    • This ensures evicted cache entries to be eventually dropped.
  3. No global collector:
    • Each moka cache instance will have a dedicated seize collector.
    • This ensures all cached entries to be dropped when the cache instance is dropped.
tatsuya6502 commented 9 months ago

I am doing some experiments in a separate repository.

tatsuya6502 commented 9 months ago

Memo:

Ideally, I want a concurrent memory reclamation library to provide the followings:

  1. Give us a control whether this differed destruction request should be buffered in a thread local storage or not.
    • We want some of them not to be buffered, while others to be buffered.
  2. Provide an API to trigger a GC.
  3. When a GC runs, pull differed destruction requests from the local buffers of all registered threads,
    • instead of letting each threads to push them to the global queue.
ibraheemdev commented 4 months ago

Hi! I happened to come across this issue and can share some details about seize.

One of the biggest benefits of seize over crossbeam-epoch is predictable latency. Epoch based reclamation has the issue that you have to check whether it is safe to free objects every once in a while when pinning/unpinning a threads, and this check is very expensive as you have to access shared memory from other threads. This means you run into a fundamental tradeoff between latency and memory efficiency. Seize on the other hand only performs this check once when retiring a batch (by pushing the objects to other threads), and uses reference counting to free memory.

This is an important consideration for a cache as you want an optimal balance between performance and memory usage.

Give us a control whether this differed destruction request should be buffered in a thread local storage or not.

This is something I've been thinking about. The problem is that seize requires at least as many objects as the number of active threads to be able to retire a batch, as each object acts as a node in a local thread linked list. Skipping the thread-local buffer would require allocating a large chunk of memory to perform the retirement process, which may be confusing.

What you can do is try to reclaim by calling Guard::flush after large objects are retired.

ibraheemdev commented 3 days ago

I also recently released papaya, a hash table built on seize. I'd be interested in porting moka to use papaya and see the results, I'd expect significant performance improvements compared to the internal cht, especially for readers. Is that something you'd be open to?

tatsuya6502 commented 3 days ago

@ibraheemdev

Is that something you'd be open to?

Thanks. Yes, I am open to it. I was not aware of papaya, but at first glance, it sounds like a good idea to use it in moka!