Open hzxa21 opened 12 months ago
cc @Li0k @Little-Wallace
We suspect that this problem comes from trivial-move. When data is trivially moved from L0 to the base level without going through compaction, a large number of small files can end up in the high levels (the level is not large enough to trigger level compaction).
Hence, two PRs have been proposed
I think this problem has been partially solved by https://github.com/risingwavelabs/risingwave/pull/12534, because we no longer move small SST files to the base level in a split group.
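To illustrate the kind of guard described above, here is a minimal sketch in Python. It is not the code from the PR: the function name, the `in_split_group` flag, and the `min_trivial_move_bytes` threshold are all hypothetical, and the real condition may look at more state than this.

```python
def allow_trivial_move(sst_size_bytes: int,
                       in_split_group: bool,
                       min_trivial_move_bytes: int) -> bool:
    """Decide whether an SST may be moved to the base level without compaction.

    Hypothetical rule: inside a split group, SSTs smaller than the threshold
    are not trivially moved, so they go through a real compaction and get
    merged with their neighbours instead of piling up as small files.
    """
    if in_split_group and sst_size_bytes < min_trivial_move_bytes:
        return False
    return True
```

With a 4 MiB threshold (an arbitrary value for illustration), a 1 MiB SST in a split group would be held back for compaction, while a 32 MiB SST would still be moved directly.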
I have been concerned about the problem of small files in sub-levels, and small files usually come from two issues:
For issue 2, we are exploring two options to solve it
However, trivially moved SSTs are a harder problem in cg2/cg3. The cost of the pick_sub_level algorithm grows as SSTs stack up in the last sub-level. I therefore suggest an improved strategy: when the last sub-level contains too many partitions (that is, a large number of SSTs have accumulated), we should allow multiple SSTs to be selected as the base range in each loop iteration. In short: when there are too many SSTs in the last sub-level, stop using a single SST as the base range; instead, consider multiple consecutive SSTs and use their total size as the limit.
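A rough sketch of this suggestion, written in Python for brevity. It is not the actual pick_sub_level code: `pick_base_ranges`, `many_ssts_threshold`, and `max_range_bytes` are hypothetical names, and the SSTs are assumed to be given as a list of sizes already sorted by key range.

```python
from typing import List, Tuple


def pick_base_ranges(sst_sizes: List[int],
                     many_ssts_threshold: int,
                     max_range_bytes: int) -> List[Tuple[int, int]]:
    """Return half-open index ranges [start, end) to use as base ranges.

    With few SSTs, each SST is its own base range (the current behaviour
    described above). With many SSTs, consecutive SSTs are grouped greedily
    until adding the next one would exceed max_range_bytes.
    """
    n = len(sst_sizes)
    if n <= many_ssts_threshold:
        return [(i, i + 1) for i in range(n)]

    ranges: List[Tuple[int, int]] = []
    start = 0
    acc = 0  # total size of the SSTs in the range being built
    for i, size in enumerate(sst_sizes):
        # Close the current range if it is non-empty and would overflow.
        if i > start and acc + size > max_range_bytes:
            ranges.append((start, i))
            start, acc = i, 0
        acc += size
    if start < n:
        ranges.append((start, n))
    return ranges
```

An SST that is larger than the limit on its own still forms a single-SST range, so the picker loop runs at most once per SST and far fewer times when the sub-level is full of small files.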
@hzxa21 @Little-Wallace @zwang28
Under a certain kind of workload, even after compaction, there can be many small SSTs written to the bottom level, each of which only contains one vnode. This can potentially cause several issues:
In practice we have seen more than 10000 small SSTs in level 5 and here are the sst stats: l5..sst.stat.txt l4.sst.stat.txt
Some ideas:
More ideas are welcome.
Note that we should also reduce the time complexity of the picker algorithm (#10109) but it is independent of this issue.