jonhoo / bustle

A benchmarking harness for concurrent key-value collections
Apache License 2.0
116 stars 8 forks source link

Get assertion failed #15

Open ibraheemdev opened 4 months ago

ibraheemdev commented 4 months ago

I happened to run into this error running papaya under bustle.

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `false`,
 right: `true`: get(17225765586053087836) 19337850 145067 12727980', /home/ibraheem/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bustle-0.5.1/src/lib.rs:464:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I cannot reproduce it and I have no idea how it occur because it seems to indicate that get returned a value even though remove succeeded on the same thread. I'm not completely familiar with how bustle chooses operations, is it possible that I ran into a rare bug where bustle allowed another thread to insert the same key again?

The workload did not include any updates or upserts, if that simplifies things.

jonhoo commented 4 months ago

Oh weird — that assertion seems legitimate to me, and hasn't changed in the original benchmark. The only change there is https://github.com/efficient/libcuckoo/pull/161, which appears unrelated.

It could of course be a rare bug in the underlying algorithm, hard to say without being able to reproduce.

agourlay commented 1 month ago

Hey :wave:

I am running into a similar error while running bustle against my KV store implementation.

Using the Mix::read_heavy() strategy on a single thread.

The same assertion fails without having any deletes or updates on that particular key.

The comment on the assertion says in the original implementation:

If find_seq is between erase_seq and insert_seq, then it should be in the table.

In my case, find_seq is larger than erase_seq AND larger than insert_seq, however get(finder_seq) does find a value.

This is the output after my improving the assertion output

assertion `left == right` failed: get(3636232156) find_seq:32795260 erase_seq:7826 insert_seq:15653
  left: false
 right: true

If my understanding is correct, this could be explained by the presence of duplicated keys in the vector of generate keys.

That is, the key at index find_seq has been already inserted by a previous operation containing the same value.

The documentation for CollectionHandle::Key says

/// The u64 seeds used to construct Key (through From<u64>) are distinct. /// The returned keys must be as well.

However the code for generating values does not seem to prevent any kind of duplicate.

With additional logging I can see that the keys does contain duplicated values when the assertion fails.

I have also been able to remove the error by cleaning the keys with various tricks like:

      generators.push(std::thread::spawn(move || {
                let mut rng: rand::rngs::SmallRng = rand::SeedableRng::from_seed(thread_seed);
                let mut keys: Vec<<T::Handle as CollectionHandle>::Key> =
                    Vec::with_capacity(insert_keys_per_thread as usize);
                 keys.extend((0..insert_keys_per_thread).map(|_| rng.next_u32().into()));
+               keys.sort();
+               keys.dedup();
+               keys.shuffle(&mut rng);
                keys
            }));

WDYT?

jonhoo commented 1 month ago

Oh, interesting — good catch! Yes indeed, I think you're right that especially for large numbers of keys, we need to actually enforce no duplicates. Would love a PR.