moka-rs / moka

A high performance concurrent caching library for Rust

Can the eviction strategy be configured? #247

Open Millione opened 1 year ago

Millione commented 1 year ago

Right now the eviction strategy is LRU, but I want to use the SampledLFU strategy to suit my usage scenario. How could I get there?

tatsuya6502 commented 1 year ago

Hi. Moka does not support other eviction strategies like SampledLFU. I do not have a plan to support it/them as the current strategy seems to give good overall performance. But nothing stops someone from implementing other strategies.

To implement it, you would start by looking at the admit associated function of sync_base::base_cache::Inner.

https://github.com/moka-rs/moka/blob/83ce7451c99d57295cd4ba1392890bb7c6765907/src/sync_base/base_cache.rs#L1448-L1525

Currently, it gets the LRU entry at lines 1480 and 1488. You would need to change it to sample some entries and then pick the LFU among them.

To sample entries and find the LFU, you would do the following (a rough code sketch follows the list):

  1. Get the number of segments of the internal concurrent hash table (cht) by calling the num_cht_segments method.
  2. Get the keys of the entries in a cht segment by calling the keys method.
  3. Randomly sample keys and pick a victim entry with the lowest estimated popularity:
    1. Randomly pick a key from the keys.
    2. Get the entry for the key by calling the get_key_value_and_then method of sync_base::base_cache::Inner.
    3. Check the entry's is_admitted().
    4. If it is true:
      1. Increment the number of sampled entries by 1.
      2. Get the estimated popularity by calling the freq.frequency(hash_of_the_key) method.
      3. If the popularity is lower than the current lowest popularity, update the lowest popularity and the victim entry.
    5. If not enough entries have been sampled yet, repeat from step 3.
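
For illustration, here is a minimal, self-contained Rust sketch of that sampling loop. The CacheInner trait is only a stand-in for the relevant parts of sync_base::base_cache::Inner; the real method names, signatures, and key/hash types will differ, so treat this as a rough outline rather than working Moka code (it also uses the rand crate for shuffling).

    use rand::seq::SliceRandom;

    /// How many admitted entries to sample before choosing a victim
    /// (an assumed value for this sketch).
    const SAMPLE_SIZE: usize = 5;

    /// Illustrative stand-in for the internal cache state used in the steps above.
    trait CacheInner {
        /// Keys currently stored in one cht segment.
        fn segment_keys(&self, segment: usize) -> Vec<u64>;
        /// Whether the entry for this key has been admitted to the cache.
        fn is_admitted(&self, key: u64) -> bool;
        /// Estimated popularity of the key (via its hash) from the frequency sketch.
        fn frequency(&self, key: u64) -> u32;
    }

    /// Randomly sample admitted entries in one segment and return the key with
    /// the lowest estimated popularity (the eviction victim), if any.
    fn pick_sampled_lfu_victim<C: CacheInner>(cache: &C, segment: usize) -> Option<u64> {
        let mut keys = cache.segment_keys(segment);
        keys.shuffle(&mut rand::thread_rng());

        let mut sampled = 0;
        let mut victim: Option<(u64, u32)> = None; // (key, estimated popularity)

        for key in keys {
            // Only admitted entries are eviction candidates.
            if !cache.is_admitted(key) {
                continue;
            }
            sampled += 1;

            // Keep the sampled entry with the lowest estimated popularity.
            let freq = cache.frequency(key);
            if victim.map_or(true, |(_, best)| freq < best) {
                victim = Some((key, freq));
            }

            if sampled >= SAMPLE_SIZE {
                break;
            }
        }

        victim.map(|(key, _)| key)
    }

Instead of re-sampling keys until enough admitted entries are found (step 5 above), this sketch just walks a shuffled copy of the segment's keys, which avoids an unbounded loop when few entries are admitted.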

I hope this helps.

Millione commented 1 year ago

Really appreciate it; it definitely helps. I will try it.

Also, for background: I have a Go service that uses Ristretto for caching. When I rewrote it in Rust using Moka with the same configuration, the cache hit rate dropped from about 90% to 70%.

tatsuya6502 commented 1 year ago

I think there is another difference between Ristretto and Moka/Caffeine: Ristretto has a larger frequency sketch than Moka.

https://github.com/dgraph-io/ristretto#features

Admission: TinyLFU - extra performance with little memory overhead (12 bits per counter).

Moka's frequency sketch (aka CountMin sketch) is a direct port from Caffeine, and it has 4 bits per counter.

https://github.com/moka-rs/moka/blob/83ce7451c99d57295cd4ba1392890bb7c6765907/src/common/frequency_sketch.rs#L14-L18

A 12-bit counter can count up to 4095, while a 4-bit counter can count up to 15. You might want to change Moka to use 12-bit counters as well to get a similar cache hit performance to Ristretto.
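
To make the counter arithmetic concrete, here is a small standalone sketch (not Moka's actual code) of how 4-bit counters can be packed sixteen to a u64 with a saturating increment, which is why a 4-bit counter caps at 15:

    /// Read the 4-bit counter at `index` from a table of packed u64 words.
    fn read_4bit(table: &[u64], index: usize) -> u64 {
        (table[index / 16] >> ((index % 16) * 4)) & 0x0f
    }

    /// Increment the 4-bit counter at `index`, saturating at 15 so the counter
    /// never wraps around (wrapping would corrupt the frequency estimate).
    fn increment_4bit(table: &mut [u64], index: usize) {
        let word = index / 16; // sixteen 4-bit counters fit in one u64
        let shift = (index % 16) * 4; // bit offset of this counter
        if ((table[word] >> shift) & 0x0f) < 15 {
            table[word] += 1 << shift;
        }
    }

    fn main() {
        let mut table = vec![0u64; 4];
        for _ in 0..100 {
            increment_4bit(&mut table, 5);
        }
        // After 100 increments, the counter is still capped at its 4-bit maximum.
        assert_eq!(read_4bit(&table, 5), 15);
    }

Widening the counters means changing the mask, the shift math, and the saturation point together; for example, 12-bit counters would use a 0x0fff mask and saturate at 4095.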

Millione commented 1 year ago

A 12-bit counter can count up to 4095, while a 4-bit counter can count up to 15. You might want to change Moka to use 12-bit counters as well to get a similar cache hit performance to Ristretto.

Thanks, I'd like to try it. Is there any cache hit benchmark for Moka that I can reuse to compare different counter bit widths? I have implemented 16-bit counters now.

tatsuya6502 commented 1 year ago

Is there any cache hit benchmark for Moka ...

Yes, there are two.

Mokabench will be easier to run. It is written entirely in Rust. Follow the README and download the trace datasets called S3.lis and DS1.lis into its datasets directory. And then, run it with the following commands:

## Run with the S3 dataset.
$ cargo run --release -- --num-clients 1

## Run with the DS1 dataset.
$ cargo run --release -- --num-clients 1 --trace-file ds1

(Do not use the OLTP.lis dataset. It is very small, and the Moka result would be skewed because Moka has a huge write buffer and most of the dataset would fit into it.)

Another benchmarking tool is the Moka Cache Driver for the Caffeine Simulator. It is written in Rust and Java, so it will take more work to run. But a nice thing about it is that you can compare results with other caching strategies. I used it to create the benchmark results on our Wiki.

Millione commented 1 year ago

A 12-bit counter can count up to 4095, while a 4-bit counter can count up to 15. You might want to change Moka to use 12-bit counters as well to get a similar cache hit performance to Ristretto.

I tried it, but the result is still the same as before: the cache hit rate first goes up to 90%, stays there for a while, and then goes down.

commit: feat: change frequency counter bits from 4 to 16

tatsuya6502 commented 1 year ago

A 12-bit counter can count up to 4095, while a 4-bit counter can count up to 15.

@Millione

Nice catch! https://github.com/dgraph-io/ristretto/issues/335

It seems Ristretto uses 4-bit counters too: the max value is 15, and 0x0f (0b00001111) is a 4-bit mask.

https://github.com/dgraph-io/ristretto/blob/3177d9c9520c37d36b18113be01fea6393f63860/sketch.go#L114-L119

    // Counter value.
    v := (r[i] >> s) & 0x0f
    // Only increment if not max value (overflow wrap is bad for LFU).
    if v < 15 {
        r[i] += 1 << s
    }

ben-manes commented 1 year ago

@Millione you might also consider recording a trace (e.g. log a 64-bit hash of the key). Then you can run it against the different simulators that the projects provide and assert behavior, e.g. whether any are not honoring the maximum size or are misreporting the hit rate, and see what other policies report (like optimal). A trace of a few million events is typically pretty good; you want something many multiples of the cache size so the run captures the workload's long-term characteristics.
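
For example, a minimal way to record such a trace from a Rust service might look like the sketch below (the file name and the use of the standard library's DefaultHasher are just placeholder choices; any stable 64-bit hash will do):

    use std::collections::hash_map::DefaultHasher;
    use std::fs::OpenOptions;
    use std::hash::{Hash, Hasher};
    use std::io::{BufWriter, Write};

    /// Compute a 64-bit hash of a key so the trace does not expose raw keys.
    fn hash64<K: Hash + ?Sized>(key: &K) -> u64 {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        hasher.finish()
    }

    fn main() -> std::io::Result<()> {
        // Append one hashed key per line; the file can later be replayed by a
        // cache simulator to compare policies and verify reported hit rates.
        let file = OpenOptions::new()
            .create(true)
            .append(true)
            .open("access_trace.log")?;
        let mut out = BufWriter::new(file);

        // In a real service, write a line at every cache lookup.
        for key in ["user:42", "user:7", "user:42"] {
            writeln!(out, "{}", hash64(key))?;
        }
        out.flush()
    }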

tatsuya6502 commented 8 months ago

Hi @Millione,

Also, for background: I have a Go service that uses Ristretto for caching. When I rewrote it in Rust using Moka with the same configuration, the cache hit rate dropped from about 90% to 70%.

FYI, you might be affected by the issue fixed by #348 for v0.12.2. The Sentry team reported to us that #348 fixed a long-standing issue they had been observing where the cache hit rate would slowly degrade over time.

Ten0 commented 4 months ago

Is this issue solved? The docs currently mention TinyLFU as a possible eviction strategy in addition to LRU.

Right now the eviction strategy is LRU, but I want to use SampledLFU

Hi. Moka does not support other eviction strategies like SampledLFU. I do not have a plan to support it/them


tatsuya6502 commented 4 months ago

Hi @Ten0

As for the SampledLFU eviction strategy, no, it is not implemented.

Moka's default cache policy remains Caffeine-style TinyLFU, which uses an access-order (LRU) queue for selecting a cache victim. (A cache victim is a candidate to be removed from the cache.) This issue is about adding Ristretto-style TinyLFU, which uses random sampling for selecting a cache victim (thus SampledLFU).

I opened #389. I will evaluate the cache hit performance of Ristretto-style TinyLFU to see if it is worth implementing.

As for the original issue that the user who created this GH Issue was having with moka@0.10.1 (see the following quote), we do not know if it has been resolved, but it is possible that it has been.

https://github.com/moka-rs/moka/issues/247#issuecomment-1489588222

for background: I have a Go service that uses Ristretto for caching. When I rewrote it in Rust using Moka with the same configuration, the cache hit rate dropped from about 90% to 70%.

We fixed a bug in moka@v0.12.2, which solved a long-standing hit rate issue for another user (see below). It is possible that the user who created this GH Issue was affected by the same bug.

https://github.com/moka-rs/moka/pull/348#issuecomment-1905985696

FYI, I believe this fixed a long standing issue we were observing where the cache hit rate would slowly degrade over time after ~1 week of operation.

Updating to 0.12.2, the hit rate has been stable without any degradation for 2+ weeks: