risingwavelabs / risingwave


Inefficient compaction strategy when there are more than 2T data in a compaction group with default config #16965

Open hzxa21 opened 3 months ago

hzxa21 commented 3 months ago

By default, the LSM tree of a compaction group has at most 7 levels (L0 - L6), with the base level (i.e. the first non-L0 level) size set to 512MB and the level multiplier set to 5x.

There is an implicit assumption under this configuration:

the maximum size for the LSM = 512MB + 512MB * 5 + ... + 512MB * (5^5) = ~2TB
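
As a quick sanity check of that sum, here is a minimal sketch. The function and parameter names are hypothetical and are not RisingWave config keys. It assumes L0 is excluded and that the six non-L0 levels (L1 - L6) each reach their full target size:

```python
MB = 1024 ** 2
TB = 1024 ** 4

def max_lsm_size_bytes(base_level_size=512 * MB, multiplier=5, non_l0_levels=6):
    """Sum the target sizes of the non-L0 levels as a geometric series.

    Level i (0-based, counting from the base level) is assumed to hold
    base_level_size * multiplier**i bytes. All names are illustrative.
    """
    return sum(base_level_size * multiplier ** i for i in range(non_l0_levels))

total = max_lsm_size_bytes()
print(f"{total / TB:.2f} TiB")  # prints "1.91 TiB", i.e. roughly 2TB
```

The last level alone accounts for 512MB * 5^5, about 1.5 TiB of that total, so any data beyond roughly 2TB has no larger level to land in under the default settings.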

Because the compaction strategy is built around this 2TB maximum, once a compaction group holds more than 2TB of data the strategy becomes inefficient and picks the wrong level when generating compaction tasks.

hzxa21 commented 3 months ago

Example collected in a cluster with >2TB compaction group 2: