In Hummock, compaction tasks fall into two basic types:
trivial task: applied on the meta node alone
normal task: carried out by the meta node and a compactor together
A compaction task selects its inputs with the picker algorithm and produces its outputs through meta or compactor operations, depending on the task type. However, the picker only looks at the input level and the level directly below it (Ln and Ln+1) when selecting SSTs, based on size, count, and amplification.
trivial task: places no limit on the write amplification of the task, since it contains only a single SST.
normal task: reduces write amplification by selecting the SSTs with minimal overlap.
Although the picker tries to build a task with low write amplification for the current input levels (Ln, Ln+1), it ignores what happens once the output SST lands in the output level, that is, the future write amplification.
For example:
trivial task: moving L1 SST [5, 6] to L2 may lead to a later L2 -> L3 task with larger write amplification, because once in L2 the SST overlaps both L3 SSTs
L1 [5, 6]
L2 [7, 8]
L3 [1, 5], [6, 7]
normal task: compacting L1 [4, 17] into L2 may lead to a later L2 -> L3 task with larger write amplification
The task produces a single L2 SST [4, 6, 17, 19]. If the L3 SST [8, 9] holds a large amount of data, the next compaction task will have larger write amplification, because the output's key range covers [8, 9].
If the output is instead organized as two L2 SSTs, [4, 6] and [17, 19], the next compaction task does not need to include [8, 9], which alleviates write amplification.
L1 [4, 17]
L2 [6, 19]
L3 [1, 5], [8, 9], [17, 19]
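The two examples above can be reproduced with a small sketch. It is an illustration only, not Hummock's implementation: SSTs are modelled as inclusive integer key ranges, and the names `overlapping` and `split_by_grandparent` are hypothetical helpers introduced here. The split rule (cut the output wherever a whole Ln+2 SST lies strictly between two neighbouring keys) is one assumed way to obtain the [4, 6] / [17, 19] layout.

```python
from typing import List, Tuple

KeyRange = Tuple[int, int]  # inclusive (smallest key, largest key)


def overlapping(lo: int, hi: int, level: List[KeyRange]) -> List[KeyRange]:
    """Return the SSTs in `level` whose key range intersects [lo, hi]."""
    return [(a, b) for a, b in level if a <= hi and lo <= b]


def split_by_grandparent(keys: List[int], grandparent: List[KeyRange]) -> List[List[int]]:
    """Cut sorted output keys wherever a whole Ln+2 SST lies strictly
    between two neighbouring keys (hypothetical split rule)."""
    outputs: List[List[int]] = []
    current: List[int] = []
    for key in keys:
        if current and any(current[-1] < lo and hi < key for lo, hi in grandparent):
            outputs.append(current)
            current = []
        current.append(key)
    if current:
        outputs.append(current)
    return outputs


# Trivial-task example: L1 [5, 6] does not overlap L2, so it can be moved as-is,
# but after the move it overlaps both L3 SSTs.
print(overlapping(5, 6, [(7, 8)]))          # []
print(overlapping(5, 6, [(1, 5), (6, 7)]))  # [(1, 5), (6, 7)]

# Normal-task example: a single merged output drags [8, 9] into the next task.
l3 = [(1, 5), (8, 9), (17, 19)]
merged = [4, 6, 17, 19]
print(overlapping(merged[0], merged[-1], l3))  # [(1, 5), (8, 9), (17, 19)]

# Splitting at the Ln+2 gap keeps [8, 9] out of the next task.
parts = split_by_grandparent(merged, l3)
print(parts)  # [[4, 6], [17, 19]]
for part in parts:
    print(part, overlapping(part[0], part[-1], l3))
```

With the split, the two output SSTs together touch only [1, 5] and [17, 19] in L3, so the data in [8, 9] is never rewritten by the follow-up task.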
Fortunately, because of how leveled compaction works, a non-L0 SST only ever moves to Ln+1. We can therefore estimate an SST's impact on the write amplification of future compaction tasks from the data distribution of Ln+2, and use that information to organize SSTs so that future write amplification is reduced.
trivial task: disable the trivial move when it would lead to a task with larger write amplification
normal task: on top of the current rules, further organize the output SSTs according to the Ln+2 information, thereby reducing future write amplification
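As a rough sketch of the trivial-task rule, the check below refuses a trivial move when the SST would overlap too much data in Ln+2. Everything here is an assumption made for illustration: the `Sst` type, the function names, and in particular the `max_overlap_ratio` threshold are hypothetical and do not come from Hummock.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sst:
    smallest: int  # smallest key, inclusive
    largest: int   # largest key, inclusive
    size: int      # size in bytes


def grandparent_overlap_bytes(sst: Sst, grandparent: List[Sst]) -> int:
    """Total size of the Ln+2 SSTs whose key range intersects `sst`."""
    return sum(
        g.size
        for g in grandparent
        if g.smallest <= sst.largest and sst.smallest <= g.largest
    )


def allow_trivial_move(sst: Sst, grandparent: List[Sst], max_overlap_ratio: float = 2.0) -> bool:
    """Allow the move only if the overlapping Ln+2 data is at most
    `max_overlap_ratio` times the SST's own size (hypothetical threshold)."""
    return grandparent_overlap_bytes(sst, grandparent) <= max_overlap_ratio * sst.size


# The L1 SST [5, 6] from the example, with made-up sizes.
candidate = Sst(5, 6, size=10)
l3 = [Sst(1, 5, size=100), Sst(6, 7, size=100)]
print(grandparent_overlap_bytes(candidate, l3))  # 200
print(allow_trivial_move(candidate, l3))         # False

# With no overlapping data in Ln+2, the trivial move stays enabled.
print(allow_trivial_move(candidate, [Sst(10, 20, size=100)]))  # True
```

A size-based ratio is only one possible criterion; a count of overlapping SSTs or an absolute byte limit would fit the same shape.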