risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.

Enhance(compaction): Improving the efficiency of the trivial-move task #19530

Open Li0k opened 14 hours ago

Li0k commented 14 hours ago

https://grafana.test.risingwave-cloud.xyz/d/EpkBw5W4k/risingwave-dev-dashboard?from=1732025046079&orgId=1&to=1732031886092&var-component=All&var-datasource=cdtasocg64074c&var-instance=risingwave&var-namespace=rwc-g1id2bbsgif4fou057af8nafim-li0k-test2&var-pod=All&var-table=All

Recently, in a test with a very large checkpoint size, I found that an append-only table triggers a huge number of trivial-move tasks.

However, Hummock currently supports only single-SST trivial-move tasks. Even though we already optimize commits for trivial-move tasks (merging up to 256 trivial-move task commits into one), a large number of SSTs can still cause a bottleneck in the meta.
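To make the scaling concrete, here is a minimal sketch of the batching arithmetic. The function names and the task count of 100,000 are illustrative assumptions, not Hummock's actual code or numbers from the test; only the limit of 256 comes from the description above.

```rust
/// Number of meta commits needed when up to `batch_limit` single-SST
/// trivial-move tasks are merged into one commit (hypothetical helper).
pub fn commits_needed(task_count: usize, batch_limit: usize) -> usize {
    assert!(batch_limit > 0, "batch limit must be positive");
    // Ceiling division: a partially filled batch still costs one commit.
    (task_count + batch_limit - 1) / batch_limit
}

/// Groups task ids into commit batches of at most `batch_limit` tasks
/// (hypothetical helper, ids stand in for real task objects).
pub fn batch_tasks(task_ids: &[u64], batch_limit: usize) -> Vec<Vec<u64>> {
    assert!(batch_limit > 0, "batch limit must be positive");
    task_ids
        .chunks(batch_limit)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

With a limit of 256, `commits_needed(100_000, 256)` is 391, so the number of commits still grows linearly with the number of SSTs being moved.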

Therefore, we can support multi-SST trivial-move tasks to address this performance issue.

There are two Pickers in Hummock that can generate trivial-move tasks.

  1. Consider extending the single-file algorithm to multi-file, but this may be more computationally intensive.
  2. Allow batch commit parameters to be changed at runtime to improve commit efficiency.
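As a rough illustration of the first option, the sketch below picks a run of consecutive SSTs from a source level that overlap nothing in the target level, so they could be moved together in one task. `SstRange`, `overlaps_level`, `pick_multi_trivial_move`, and the `max_ssts` / `max_bytes` limits are hypothetical names for illustration and are not Hummock's picker API. It assumes both levels are sorted by key and internally non-overlapping, and it uses integer keys for brevity. A binary search per SST keeps the extra cost at O(log m) for a target level with m SSTs.

```rust
/// Hypothetical, simplified view of an SST: an inclusive key range and a size.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SstRange {
    pub id: u64,
    pub smallest: u64,
    pub largest: u64,
    pub size: u64,
}

/// Returns true if `sst` overlaps any SST in `target`.
/// Assumes `target` is sorted by key and internally non-overlapping.
pub fn overlaps_level(sst: &SstRange, target: &[SstRange]) -> bool {
    // Index of the first target SST that ends at or after `sst.smallest`.
    let idx = target.partition_point(|t| t.largest < sst.smallest);
    idx < target.len() && target[idx].smallest <= sst.largest
}

/// Picks the first run of consecutive source SSTs that overlap nothing in
/// `target`, bounded by `max_ssts` and `max_bytes`. Returns the picked ids;
/// the result is empty if no SST qualifies within the limits.
pub fn pick_multi_trivial_move(
    source: &[SstRange],
    target: &[SstRange],
    max_ssts: usize,
    max_bytes: u64,
) -> Vec<u64> {
    let mut picked = Vec::new();
    let mut bytes = 0u64;
    for sst in source {
        if overlaps_level(sst, target) {
            if picked.is_empty() {
                continue; // still looking for the start of a run
            }
            break; // the run ends at the first overlapping SST
        }
        if picked.len() >= max_ssts || bytes + sst.size > max_bytes {
            break; // one of the task limits has been reached
        }
        bytes += sst.size;
        picked.push(sst.id);
    }
    picked
}
```

Capping the run by count and bytes keeps a single task bounded. Whether the run should be contiguous, and what limits make sense, would depend on the actual picker.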