moka-rs / moka

A high performance concurrent caching library for Rust
Apache License 2.0
1.56k stars 69 forks source link

Inefficient size aware eviction when inserting new entries with larger size #408

Closed sticnarf closed 5 months ago

sticnarf commented 6 months ago

Example code (LRU policy, moka 0.12.5):

use moka::{future::CacheBuilder, policy::EvictionPolicy};
use rand::prelude::*;

#[tokio::main]
async fn main() {
    let cache = CacheBuilder::<u64, Vec<u8>, _>::new(1 << 24)
        .eviction_policy(EvictionPolicy::lru())
        .weigher(|_, v| v.len() as u32)
        .build();

    // Fill the cache with 512 bytes values
    for i in 0..65536 {
        let mut value = vec![0; 512];
        thread_rng().fill_bytes(&mut value);
        cache.insert(i, value).await;
    }
    println!("{}", cache.weighted_size());

    // Insert much larger values
    for i in 0..4096 {
        let mut value = vec![0; 16384];
        thread_rng().fill_bytes(&mut value);
        cache.insert(65536 + i, value).await;
    }
    println!("{}", cache.weighted_size());

    while cache.weighted_size() > 1 << 24 {
        println!("{} {}", cache.entry_count(), cache.weighted_size());
        cache.run_pending_tasks().await;
    }
}

Output:

16777216
66709504
5300 66709504
4864 67502080
4364 67246080
3864 63307776
3364 55115776
2864 46923776
2364 38731776
1864 30539776
1364 22347776

After inserting 4096 16k-size values, the total weighted size increases and exceeds the 16M capacity.

sticnarf commented 6 months ago

https://github.com/moka-rs/moka/blob/ae33f8db63e3cbb697a307d0068299e7616c7d72/src/future/base_cache.rs#L1515

Using a fixed batch size to evict entries causes the problem.

Is it reasonable to always run the maintenance task when the total weighted size exceeds the capacity (or 10% beyond the capacity)?

tatsuya6502 commented 5 months ago

Hi. Thank you for reporting the issue, and sorry for the late reply. I got COVID-19 for the first time in the middle of March. I was treated for pneumonia in the hospital but had a high fever for 10+ days. I am still recovering, and am slowly getting better.

Anyway. You are right. The eviction batch size can be improved.

Is it reasonable to always run the maintenance task when the total weighted size exceeds the capacity (or 10% beyond the capacity)?

Yeah, that will help. Let me think of some other options too. I will probably disable the batch size limit when cache has no eviction listener given. We use the batch size to avoid to spend too long time on the maintenance task when the user provided a very slow eviction listener closure. Spending too long time on it may block cache writes (insert etc.) because something called "write operation log channel" can get full. So, if no eviction listener is provided, we would not need the limit.

I will start to implement them when I feel better on my health.

tatsuya6502 commented 5 months ago

FYI, I started to work on the fix: