facebook / rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.
http://rocksdb.org
GNU General Public License v2.0
28.31k stars 6.28k forks source link

Make Universal Compaction More Incremental #8181

Open siying opened 3 years ago

siying commented 3 years ago

In universal, whole levels are compacted together to satisfy two conditions:

  1. total size / bottommost level size > a threshold, or
  2. total number of sorted runs (non-0 levels + L0 files) is within a threshold

For 1, a simple optimization would be to compact so that just enough files are merged into the bottommost level (Lmax) to satisfy condition 1. It would work if we only need to pick some files from Lmax-1, or if it is cheaper over time, we can pick some files from other levels too.

If we finish condition 1, there might be holes in some ranges in older levels. These holes might make it possible that only by compacting some sub ranges, we can fix the LSM-tree for condition 2. RocksDB can take single files into consideration and apply more sophisticated heuristic.

This new approach makes universal compaction closer to leveled compaction. The operation for 1 is closer to how Leveled compaction triggeres Lmax-1 to Lmax compaction. And 2 can potentially be implemented as something similar to level picking in Leveled Compaction. In fact, all those file picking can co-existing in one single compaction style and there isn’t fundamental conflicts to that.

shubhamtomar95 commented 3 years ago

I would like to start with 1. Seems interesting and will be helpful for me to gain more understanding of the code-base.(I'm new to rocksdb project). I've gone through documentation for Universal Compaction Style and skimmed through the db/compaction section of the source code.

Could you please suggest how can I proceed from here.

wqshr12345 commented 1 year ago

Hello, I have learned from here that you are looking for some improvements to Universal Compaction. I have knowledge of and have used Universal Compaction before, and I have also reviewed the source code related to it. Do you have a general direction for the improvements? I would be happy to assist you.