Open hzxa21 opened 5 months ago
cc @Li0k @Little-Wallace
I tried to separate each state table in one group into independent LSM trees in this PR: https://github.com/risingwavelabs/risingwave/pull/16919, with the goal of making the performance similar to that of separate groups. But I think I need to refactor the hummock version structure to reduce the complexity of the selector.
Recently we have had efficiency issues related to the compaction strategy. Here are some examples off the top of my head:
16965
There are two mechanisms that can potentially resolve the above issues, but each of them has its own drawback under the current implementation:
table_id = 2
back into a compaction group with table_ids = [1, 3]
without triggering a heavy full compaction.
In short, 1 is more general but can cause permanent overhead, while 2 is more flexible but limited to the scope of a single LSM tree.
Therefore, we are currently exploring an idea to achieve the best of both: if the hummock version representation in meta is more flexible instead of being grouped by compaction group, we can adopt a finer-grained compaction strategy without permanent overhead. The high-level idea is to generalize mechanism 2 so that the compaction strategy (level selector and picker) can dynamically work on different portions of the hummock version within the same compaction group independently. Mechanism 1 will then be treated as the last resort and will be triggered less frequently.
The idea is premature and requires further discussion on how it would work.
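To make the discussion a bit more concrete, here is a rough sketch of what "picking per portion within one group" could look like. This is not the actual hummock code: `Sst`, `Task`, `partition_by_table`, `pick_tasks` and the L0 size threshold are all hypothetical names invented for illustration, and it assumes each SST belongs to exactly one table, which is a simplification.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified SST descriptor (not the real hummock type).
/// Assumption: one SST holds data for exactly one table.
#[derive(Debug, Clone, PartialEq)]
struct Sst {
    id: u64,
    table_id: u32,
    level: usize,
    size: u64,
}

/// Hypothetical compaction task scoped to one portion (here: one table)
/// of a compaction group.
#[derive(Debug, PartialEq)]
struct Task {
    table_id: u32,
    input_sst_ids: Vec<u64>,
}

/// Split the SSTs of a single compaction group into per-table portions.
/// The group itself is not split; this is only a view used by the picker.
fn partition_by_table(ssts: &[Sst]) -> BTreeMap<u32, Vec<&Sst>> {
    let mut portions: BTreeMap<u32, Vec<&Sst>> = BTreeMap::new();
    for sst in ssts {
        portions.entry(sst.table_id).or_default().push(sst);
    }
    portions
}

/// Toy picker: each portion is evaluated independently, and a task is
/// emitted only for portions whose total L0 size exceeds the threshold.
fn pick_tasks(ssts: &[Sst], l0_threshold: u64) -> Vec<Task> {
    partition_by_table(ssts)
        .into_iter()
        .filter_map(|(table_id, portion)| {
            let l0: Vec<&Sst> = portion.into_iter().filter(|s| s.level == 0).collect();
            let l0_bytes: u64 = l0.iter().map(|s| s.size).sum();
            if l0_bytes > l0_threshold {
                Some(Task {
                    table_id,
                    input_sst_ids: l0.iter().map(|s| s.id).collect(),
                })
            } else {
                None
            }
        })
        .collect()
}
```

With something like this, a write-heavy table in a shared group could get its own compaction task without moving it to a separate group, and nothing about the grouping would need to be undone later. How the real selector and picker would express this is exactly the part that needs discussion.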