speedb-io / speedb

A RocksDB compliant high performance scalable embedded key-value store
https://www.speedb.io/
Apache License 2.0

Intra-L0 compaction Improvement #607

Open Yuval-Ariel opened 1 year ago

Yuval-Ariel commented 1 year ago

Problem description

In leveled compaction, L0->L1 compactions can suffer from several issues. Two of them are:

  1. Unstable performance - Under a heavy write workload, huge L0->L1 compactions can cause long stalls. Intra-L0 compaction can contribute to this because it can create huge L0 files (4-6X the memtable size), inflating the size of L0. Heavy intra-L0 compactions are harmful for a second reason: they prevent other L0->L1 compactions from running at the same time.
  2. High write amplification - When L0 contains small files, compacting them with L1 causes high write amplification, e.g. compacting 5 MB from L0 with 256 MB in L1 (the default L1 size). Many things can lead to small files in L0, such as checkpoints, WAL size, WBM (WriteBufferManager), atomic flushes and more.
    • Note: Large files in L0 also mean large index and filter blocks, which cause problems for the block cache.
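To put a rough number on item 2, here is a back-of-the-envelope sketch. It assumes the whole of L1 overlaps the L0 input and that the compaction output is about the sum of its inputs (no overwrites or deletions dropped). The function name is hypothetical and only for illustration.

```python
def l0_to_l1_write_amp(l0_input_mb: float, l1_overlap_mb: float) -> float:
    """Bytes written per byte of new L0 data in a single L0->L1 compaction.

    Assumption: output size ~= L0 input + overlapping L1 data.
    """
    if l0_input_mb <= 0:
        raise ValueError("l0_input_mb must be positive")
    return (l0_input_mb + l1_overlap_mb) / l0_input_mb


# The example from the issue: 5 MB of L0 data against a 256 MB L1.
small = l0_to_l1_write_amp(5, 256)
# For comparison, a 64 MB L0 input against the same L1.
large = l0_to_l1_write_amp(64, 256)
print(f"5 MB input:  ~{small:.1f}x")
print(f"64 MB input: ~{large:.1f}x")
```

Under these assumptions the 5 MB case rewrites roughly 52 bytes for every new byte, while the 64 MB case rewrites 5.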

Abbreviations: ILC - intra-L0 compaction

Solution

Use ILC to alleviate both issues above:

  1. Don't do ILC when its input size is too big, thus eliminating huge L0 files.
  2. Perform ILC instead of an L0->L1 compaction when the input size is too small.
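The two rules above could be sketched as a picker like the one below. This is only an illustration, not the Speedb or RocksDB implementation: the thresholds `min_l0_to_l1_bytes` and `max_ilc_bytes`, the `l1_busy` flag (a simplified stand-in for "an L0->L1 compaction cannot be scheduled right now") and all names are hypothetical.

```python
from enum import Enum
from typing import Sequence


class Pick(Enum):
    INTRA_L0 = "intra-L0"
    L0_TO_L1 = "L0->L1"
    NONE = "none"


def pick_l0_compaction(
    l0_file_sizes: Sequence[int],
    min_l0_to_l1_bytes: int,  # hypothetical: below this, L0->L1 is too costly
    max_ilc_bytes: int,       # hypothetical: above this, ILC output is too big
    l1_busy: bool,            # simplified: L0->L1 cannot run right now
) -> Pick:
    total = sum(l0_file_sizes)
    can_merge = len(l0_file_sizes) >= 2  # ILC needs at least two inputs

    # Rule 2: the input is too small, so merge within L0 first.
    if can_merge and total < min_l0_to_l1_bytes:
        return Pick.INTRA_L0
    if not l1_busy:
        return Pick.L0_TO_L1
    # Rule 1: fall back to ILC only while its input stays bounded.
    if can_merge and total <= max_ilc_bytes:
        return Pick.INTRA_L0
    return Pick.NONE


MB = 1 << 20
print(pick_l0_compaction([2 * MB] * 4, 64 * MB, 256 * MB, l1_busy=False))
print(pick_l0_compaction([64 * MB] * 4, 64 * MB, 256 * MB, l1_busy=False))
print(pick_l0_compaction([64 * MB] * 6, 64 * MB, 256 * MB, l1_busy=True))
```

With these made-up thresholds, four 2 MB files are merged inside L0, four 64 MB files go to L1, and six 64 MB files are left alone while L1 is busy rather than being merged into one huge L0 file.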

Expected Goals

Stable performance under heavy write workloads. Lower write amplification when L0 contains small files.

Yuval-Ariel commented 1 year ago

following https://github.com/facebook/rocksdb/pull/10865

erez-speedb commented 1 year ago

Test with small L0 files (write buffer of 2MB) https://admin.speedb.io/performance?items=jPOgG6nkLNeQw3XtnSHb&items=YbxbzNQTsYQKbMHdUn3m&colors=%23F06292&colors=%2300796b

Yuval-Ariel commented 1 year ago

> Test with small L0 files (write buffer of 2MB) https://admin.speedb.io/performance?items=jPOgG6nkLNeQw3XtnSHb&items=YbxbzNQTsYQKbMHdUn3m&colors=%23F06292&colors=%2300796b

Does it reproduce the problem?

aierui commented 3 months ago

ping. See: https://github.com/facebook/rocksdb/pull/12214

Yuval-Ariel commented 3 months ago

thanks for the update! @aierui