redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

Adjust compaction backlog calculation to account for windowed compaction #15756

Open andrwng opened 8 months ago

andrwng commented 8 months ago

The current compaction controller backlog calculation (https://github.com/redpanda-data/redpanda/blob/90598fa8d23847af2f37b7f6ddfdb03f002ba8cb/src/v/storage/disk_log_impl.cc#L2218-L2355) estimates the amount of work compactions will do in order to adjust the priority of the scheduling group and io_priority_class.

This doesn't yet account for windowed compaction, which does more work than self+merge compaction. As a result, compaction is likely being scheduled at too high a priority (the backlog underestimates the amount of work done per compaction), which may in turn affect e2e latencies, etc.
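
To make the gap concrete, here is a minimal sketch that puts two backlog estimates side by side. It is not the code in `disk_log_impl.cc`: `segment_info`, `self_compaction_backlog`, and `windowed_compaction_backlog` are hypothetical names, and the windowed cost model (one read of every closed segment to build a key index, then one rewrite of every closed segment) is an assumption made purely for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-segment summary; not a Redpanda type.
struct segment_info {
    std::uint64_t size_bytes;
    bool self_compacted; // true if the segment was already deduplicated on its own
};

// Estimate that only counts segments still awaiting self compaction.
inline std::uint64_t
self_compaction_backlog(const std::vector<segment_info>& segs) {
    std::uint64_t bytes = 0;
    for (const auto& s : segs) {
        if (!s.self_compacted) {
            bytes += s.size_bytes;
        }
    }
    return bytes;
}

// Assumed windowed model: every closed segment is read once for indexing
// and rewritten once, whether or not it was self compacted before.
inline std::uint64_t
windowed_compaction_backlog(const std::vector<segment_info>& segs) {
    std::uint64_t bytes = 0;
    for (const auto& s : segs) {
        bytes += 2 * s.size_bytes;
    }
    return bytes;
}
```

For three segments of 100, 200, and 300 bytes where only the first has been self compacted, the first estimate is 500 bytes while the assumed windowed estimate is 1200 bytes, so a controller fed the first number would see well under half of the modeled work.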

JIRA Link: CORE-1680

andrwng commented 8 months ago

Removing from the rc5 milestone after some discussion with Michal. I was originally nervous this had contributed to higher latencies in a test cluster, but the underestimate used today means that compactions are actually run at a lower priority than an accurate backlog would give them.
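
A toy illustration of that reasoning, assuming the controller hands out more shares as the backlog grows. `backlog_to_shares`, the gain of one share per MiB, and the share bounds are all hypothetical and are not Redpanda's controller or its settings.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical proportional mapping from backlog bytes to scheduling shares.
inline std::uint32_t backlog_to_shares(std::uint64_t backlog_bytes) {
    constexpr std::uint64_t bytes_per_share = 1ULL << 20; // assumed gain: 1 share per MiB
    constexpr std::uint64_t min_shares = 10;              // assumed floor
    constexpr std::uint64_t max_shares = 1000;            // assumed ceiling
    const std::uint64_t raw = min_shares + backlog_bytes / bytes_per_share;
    return static_cast<std::uint32_t>(std::min(raw, max_shares));
}
```

Under these assumptions a 100 MiB backlog maps to 110 shares and a 200 MiB backlog to 210, so reporting a smaller backlog than the real one can only lower the shares compaction receives.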

github-actions[bot] commented 5 months ago

This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.

piyushredpanda commented 5 months ago

Still needs to be worked on.