facebook / rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.
http://rocksdb.org
GNU General Public License v2.0

Poor read performance of RocksDB compared to Jena’s B+ Trees #11767

Open RushikeshHalle opened 1 year ago

RushikeshHalle commented 1 year ago

I am working on a benchmark that compares RocksDB with Apache Jena's B+ Tree for storing and retrieving RDF triples. Insert performance in RocksDB-Java is good when the writes are batched. Read performance, however, is worse than with Jena's B+ Tree: for a workload of 100M key-value pairs, the read performance of Jena's B+ Tree is 5 times better than that of RocksDB-Java. This is the best result I was able to achieve, and it required merging the multiple .SST files into a single file and disabling compression. Reads with RocksDB-C++ perform much better than with RocksDB-Java, but are still slower than with the Jena B+ Tree. I tried a few other tuning parameters, but they did not help much (see the benchmark scores). We have also started a similar thread in the TDB3 repository. Any suggestions on how to improve the read performance further?

asad-awadia commented 1 year ago

How are you using jena's btree in your code?

RushikeshHalle commented 1 year ago

@asad-awadia here is the code: https://github.com/TW-Genesis/rocksdb-bench/blob/main/src/main/java/org/example/JenaBPTKVStore.java

I forgot to mention earlier that TDB3 uses RocksDB as its underlying storage engine.

asad-awadia commented 1 year ago

this.bPlusTree.nonTransactional(); Wouldn't this have a major effect on the performance? Non-transactional access would definitely be a lot faster.

RushikeshHalle commented 1 year ago

@asad-awadia thanks for the suggestion! I tried the transactional B+ Tree too. It does not make much difference to the performance results. I have added the results to the benchmark scores sheet for reference.