wvwwvwwv / scalable-concurrent-containers

High performance containers and utilities for concurrent and asynchronous programming
Apache License 2.0
285 stars 14 forks source link

HashCache panics on accessing entries after cleanup #121

Closed feymartynov closed 6 months ago

feymartynov commented 6 months ago

Hello! I try to use scc::HashCache to build a rate limiter for a proxy server. There's a number of workers which get and put user rate data into the structure and a background job which periodically cleans up obsolete data alongside with LRU.

However I got panics and made an investigation which lead me to the following minimal example which reproduces the panic:

use scc::HashCache;

const CAPACITY: usize = 64;

fn main() {
    let hc: HashCache<usize, usize> = HashCache::with_capacity(CAPACITY, CAPACITY);

    for i in 0..CAPACITY {
        let _ = hc.get(&i).unwrap_or_else(|| hc.entry(i).or_put(i).1);
    }

    hc.retain(|_, _| false);

    for i in 0..CAPACITY {
        let _ = hc.get(&i).unwrap_or_else(|| hc.entry(i).or_put(i).1);
    }
}

Debug run fails on an assertion:

thread 'main' panicked at /Users/fey/.cargo/registry/src/index.crates.io-6f17d22bba15001f/scc-2.0.9/src/hash_table/bucket.rs:965:9:
assertion `left == right` failed
  left: 0
 right: 2
stack backtrace:
   0: rust_begin_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
   2: core::panicking::assert_failed_inner
   3: core::panicking::assert_failed
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:279:5
   4: scc::hash_table::bucket::Locker<K,scc::hash_table::bucket::Evictable<V>,_>::update_lru_tail
             at /Users/fey/.cargo/registry/src/index.crates.io-6f17d22bba15001f/scc-2.0.9/src/hash_table/bucket.rs:965:9
   5: scc::hash_cache::VacantEntry<K,V,H>::put_entry
             at /Users/fey/.cargo/registry/src/index.crates.io-6f17d22bba15001f/scc-2.0.9/src/hash_cache.rs:1693:9
   6: scc::hash_cache::Entry<K,V,H>::or_put_with_key
             at /Users/fey/.cargo/registry/src/index.crates.io-6f17d22bba15001f/scc-2.0.9/src/hash_cache.rs:1313:17
   7: scc::hash_cache::Entry<K,V,H>::or_put_with
             at /Users/fey/.cargo/registry/src/index.crates.io-6f17d22bba15001f/scc-2.0.9/src/hash_cache.rs:1286:9
   8: scc::hash_cache::Entry<K,V,H>::or_put
             at /Users/fey/.cargo/registry/src/index.crates.io-6f17d22bba15001f/scc-2.0.9/src/hash_cache.rs:1266:9
   9: hash_cache_overload::main::{{closure}}
             at ./examples/hash_cache_overload.rs:15:46

Release build don't always fail but often gives this:

thread 'main' panicked at /Users/fey/.cargo/registry/src/index.crates.io-6f17d22bba15001f/scc-2.0.9/src/hash_table/bucket.rs:964:42:
index out of bounds: the len is 32 but the index is 18446744073709551615
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::panicking::panic_bounds_check
   3: scc::hash_table::bucket::Locker<K,scc::hash_table::bucket::Evictable<V>,_>::update_lru_tail
   4: scc::hash_cache::Entry<K,V,H>::or_put_with_key
   5: hash_cache_overload::main

Rust version: 1.75.0.

wvwwvwwv commented 6 months ago

Thanks a lot! I must have made a rudimentary mistake in the linked list management code. I'll fix it as soon as possible, hopefully by the end of this week.

wvwwvwwv commented 6 months ago

Fortunately, not a complicated problem, so I can upload 2.0.10 within a couple of hours.

wvwwvwwv commented 6 months ago

This issue is fixed in SCC 2.0.10.

Thanks again for finding out this bug!

feymartynov commented 6 months ago

Thanks for the quick fix. The original example now passes but I found a more complex one which still panics with the same backtrace:

use std::sync::Arc;
use std::time::{Duration, Instant};

use scc::HashCache;

const CAPACITY: usize = 4_194_304;

fn main() {
    let start = Instant::now();
    let hc = Arc::new(HashCache::<usize, u64>::with_capacity(CAPACITY, CAPACITY));
    let hc_clone = hc.clone();

    std::thread::spawn(move || loop {
        for key in 0..CAPACITY {
            let _ = hc.get(&key).unwrap_or_else(|| {
                hc.entry(key)
                    .or_put_with(|| {
                        Instant::now()
                            .checked_duration_since(start)
                            .unwrap()
                            .as_secs()
                    })
                    .1
            });
        }
    });

    std::thread::spawn(move || loop {
        let deadline = Instant::now()
            .checked_duration_since(start)
            .unwrap()
            .as_secs()
            .saturating_sub(Duration::from_secs(1).as_secs());

        hc_clone.retain(|_, v| deadline <= *v);
    })
    .join()
    .unwrap();
}
wvwwvwwv commented 6 months ago

Thanks again, I'll look into the remaining issue tomorrow!

-> Now the issue seems to be with put; if my assumption is right, it will take more time to fix than the first issue. -> In addition to this, more test cases are definitely needed.

wvwwvwwv commented 6 months ago

Identified the issue, and fixed it.

SCC 2.0.11 will include the fix.

-> I figured that the HashCache code is not quite optimal, so created #122.