facebook / rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.
http://rocksdb.org
GNU General Public License v2.0

Snapshot not working in RocksDB Java #12984

Open vishaltk opened 2 months ago

vishaltk commented 2 months ago

We are using rocksdbjni version 9.0.0. We have a use case where we need to scan for some keys in one column family, say CF-1. Let's say a prefix scan of CF-1 returns 10 keys. In the same method, we then go to another column family, CF-2, to fetch the value associated with each of the keys we fetched earlier.

The problem is that while we are reading from CF-2 with that batch of keys, some other thread invalidates those keys and removes their entries from CF-2, so the reads come back incomplete. I was hoping to take a snapshot of the DB, perform the read operations against it, and then release the snapshot. Apparently this is not helping. Pseudocode is below:

    public List<StoredOrder> getOrderDataForUser(String userId) {
        Transaction transaction = db.beginTransaction(new WriteOptions());
        ThreadLocal<Snapshot> threadLocal = new ThreadLocal<>();
        threadLocal.set(transaction.getSnapshot());
        ReadOptions ro = new ReadOptions();
        ro.setSnapshot(threadLocal.get());

        var iterator = transaction.getIterator(ro, columnFamilyHandle);
        // some function that does prefix scanning and returns the matching keys
        List<String> keys = getKeysForPrefix(userId, iterator);

        // next - go to CF-2 and fetch order data for the keys
        List<StoredOrder> orderDataList = transaction.multiGetAsList(defaultReadOptions, cfs, keys)
            .stream()
            .map(this::safeDeserialize)
            .filter(Objects::nonNull)
            .collect(Collectors.toList());
        transaction.clearSnapshot();

        return orderDataList;
    }

Expected behavior

The snapshot pins a version of the DB at that point in time, so that deletions by other threads do not disturb the read operations.

Actual behavior

The snapshot doesn't work. When we go to CF-2 to fetch the values associated with some keys, we notice that other threads have modified the data.

Steps to reproduce the behavior

1. Create a column family.
2. Add some key-value pairs.
3. Begin a transaction and take a snapshot.
4. Read the value associated with each key in a loop.
5. Have another thread delete some of the keys.

We can see that the data read through our snapshot is also modified.

adamretter commented 2 months ago

@vishaltk It looks to me like you are using different read options objects when accessing the database. If you want to work with the snapshot, you need to consistently use the ReadOptions that has that snapshot.
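
A minimal sketch of what that suggestion could look like. In the pseudocode above, the iterator is created with `ro` (which carries the snapshot) while `multiGetAsList` is called with `defaultReadOptions`, so the second read is not pinned to the snapshot. The sketch below passes one `ReadOptions` to both reads. It is an illustration under assumptions, not the reporter's code: the class name `SnapshotRead`, the method `valuesForPrefix`, and the `startsWith` helper are hypothetical, and it uses the plain `RocksDB.getSnapshot()` / `releaseSnapshot()` calls rather than the `Transaction` API from the question. If you stay with transactions, it may also be worth checking that `transaction.getSnapshot()` is non-null; as far as I know it only returns a snapshot after `transaction.setSnapshot()` has been called.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.ReadOptions;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.RocksIterator;
import org.rocksdb.Snapshot;

// Hypothetical helper: reads CF-1 and CF-2 through the same snapshot.
public final class SnapshotRead {

    private SnapshotRead() {}

    // Convenience wrapper: takes a snapshot, reads, and always releases it.
    public static List<byte[]> valuesForPrefix(RocksDB db, ColumnFamilyHandle indexCf,
            ColumnFamilyHandle dataCf, byte[] prefix) throws RocksDBException {
        final Snapshot snapshot = db.getSnapshot();
        try {
            return valuesForPrefix(db, snapshot, indexCf, dataCf, prefix);
        } finally {
            db.releaseSnapshot(snapshot);
        }
    }

    // Core read: every call below uses the SAME ReadOptions, so both column
    // families are observed as of the given snapshot.
    public static List<byte[]> valuesForPrefix(RocksDB db, Snapshot snapshot,
            ColumnFamilyHandle indexCf, ColumnFamilyHandle dataCf, byte[] prefix)
            throws RocksDBException {
        try (ReadOptions ro = new ReadOptions().setSnapshot(snapshot)) {
            // Step 1: prefix scan over the first column family (CF-1).
            List<byte[]> keys = new ArrayList<>();
            try (RocksIterator it = db.newIterator(indexCf, ro)) {
                for (it.seek(prefix); it.isValid() && startsWith(it.key(), prefix); it.next()) {
                    keys.add(it.key());
                }
                it.status(); // throws if the scan stopped because of an error
            }
            if (keys.isEmpty()) {
                return new ArrayList<>();
            }
            // Step 2: fetch the values from the second column family (CF-2).
            // multiGetAsList expects one column family handle per key.
            List<ColumnFamilyHandle> cfs = Collections.nCopies(keys.size(), dataCf);
            return db.multiGetAsList(ro, cfs, keys); // entries are null for missing keys
        }
    }

    private static boolean startsWith(byte[] key, byte[] prefix) {
        if (key.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (key[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this shape, a key deleted from CF-2 by another thread after the snapshot was taken should still be returned by the snapshot read, while a read started afterwards would get `null` for it.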