speedb-io / speedb

A RocksDB compliant high performance scalable embedded key-value store
https://www.speedb.io/
Apache License 2.0

Deprecation of SetupForCompaction causes slow compactions #787

Closed Yuval-Ariel closed 7 months ago

Yuval-Ariel commented 8 months ago

Describe the bug

During the rebase on RocksDB 8.6.7 (https://github.com/speedb-io/speedb/issues/736) we encountered very long stalls, as seen in the image below.

[image: stall graph; purple is the rebased branch, light blue is the current main]

Further investigation showed that the stalls happened because compaction ran much slower. Running iostat during that time showed that, in the rebased version, reads from disk were consistently 4 KB in size (rareq-sz) and no read requests were merged (rrqm/s). We tried tuning compaction_readahead_size as described in https://smalldatum.blogspot.com/2023/11/debugging-perf-changes-in-rocksdb-86-on.html but every value we tried showed some degradation.

Release 2.7 of Speedb is based on RocksDB 8.1.1. There, compaction reads use the filesystem defaults for prefetching ('compaction_readahead_size' is 0 by default), which results in much better performance than any compaction_readahead_size value gives with the rebased version (rebased on RocksDB 8.6.7). Part of this behavior change was introduced in https://github.com/facebook/rocksdb/pull/11631.

With compaction_readahead_size = 0 (relying on the OS page cache), compaction reads are slower in the rebased version because of https://github.com/facebook/rocksdb/pull/11658, which removed the POSIX_FADV_NORMAL hint to the FS for files picked for compaction (NORMAL being the default value of access_hint_on_compaction_start). Removing this hint causes the degradation because these files were already hinted with POSIX_FADV_RANDOM in TableCache::GetTableReader when they were opened (controlled by the flag advise_random_on_open, true by default), so the RANDOM hint, which turns off kernel readahead on Linux, stays in effect during compaction.

RocksDB plans to deprecate the access_hint_on_compaction_start flag in release 9.0.

To fix the above issue, it has been decided that files undergoing compaction will be hinted with POSIX_FADV_NORMAL.

Yuval-Ariel commented 8 months ago

Performance is restored after reverting the SetupForCompaction change from https://github.com/facebook/rocksdb/pull/11658, as shown in the image below.

[image: stall graph; red is the rebased branch with the revert patch, green is the current main]