redpanda-data / redpanda


allow windowed compaction to proceed even if the offset map can't store a full segment #16952

Open andrwng opened 8 months ago

andrwng commented 8 months ago

Version & Environment

Redpanda version: (use rpk version): dev, v23.3

Windowed compaction builds a map of the latest offset per key, and this map is used to deduplicate segments from the earliest to the newest data. The map can only be used to deduplicate records with an offset equal to or lower than the highest offset in the map. This means that a map corresponding to a partial segment can't be used to deduplicate the latter, unmapped part of that segment.
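To illustrate the constraint, here is a minimal Python sketch of a key-to-latest-offset map. It is a simplified model, not Redpanda's implementation: `OffsetMap`, `index`, `keeps`, and the sample records are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class OffsetMap:
    """Hypothetical stand-in for the key -> latest-offset map."""
    latest: dict = field(default_factory=dict)
    max_offset: int = -1

    def index(self, key, offset):
        # Remember the highest offset seen for this key and overall.
        self.latest[key] = max(offset, self.latest.get(key, -1))
        self.max_offset = max(self.max_offset, offset)

    def keeps(self, key, offset):
        # Records above the highest indexed offset were never examined,
        # so the map cannot justify dropping them.
        if offset > self.max_offset:
            return True
        newest = self.latest.get(key)
        # Keep the record unless a newer record for the same key was indexed.
        return newest is None or offset >= newest


# One segment, as (key, offset) pairs in offset order.
segment = [("k1", 0), ("k2", 1), ("k1", 2), ("k2", 3), ("k1", 4)]

full = OffsetMap()
for key, offset in segment:
    full.index(key, offset)

# A map that ran out of room after the first three records.
partial = OffsetMap()
for key, offset in segment[:3]:
    partial.index(key, offset)

deduped_full = [r for r in segment if full.keeps(*r)]
deduped_partial = [r for r in segment if partial.keeps(*r)]
```

With the full map only the newest record per key survives. With the partial map, the records at offsets 1 and 2 also survive even though offsets 3 and 4 supersede them, because the map says nothing about the unmapped tail.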

To avoid repeating compactions, we keep track of where to begin building the offset map: we update _last_compaction_window_start_offset with the start of the offset map, so that a subsequent compaction can pick up deduplicating from the next-highest segment below this point.

When the map is too small to index even a single segment, we can't allow compaction to proceed below this point, so compaction gets stuck retrying and failing at the same segment, wasting IO and CPU.
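The window bookkeeping and the stuck case can be modeled roughly as follows. This is a sketch under assumptions, not the actual storage code: `build_window`, `OffsetMapFull`, `max_keys`, and the segment layout are hypothetical, and the map's memory limit is approximated by a simple key count.

```python
class OffsetMapFull(Exception):
    """Hypothetical error: not even one whole segment fit in the map."""


def build_window(segments, window_end, max_keys):
    """Index whole segments, newest first, that start below window_end.

    segments: oldest first; each is a list of (key, offset) in offset order.
    Returns the base offset of the oldest fully indexed segment, which the
    next compaction would use as its window_end.
    """
    candidates = [s for s in segments if s[0][1] < window_end]
    if not candidates:
        return window_end  # nothing left below the previous window
    latest = {}
    window_start = None
    for seg in reversed(candidates):
        staged = dict(latest)
        for key, offset in seg:
            staged[key] = max(offset, staged.get(key, -1))
        if len(staged) > max_keys:
            break  # this segment does not fit as a whole
        latest, window_start = staged, seg[0][1]
    if window_start is None:
        raise OffsetMapFull("could not index a single segment")
    return window_start


segments = [
    [("a", 0), ("b", 1)],            # older segment, base offset 0
    [("c", 2), ("d", 3), ("e", 4)],  # newer segment, base offset 2
]

# A map with room for three keys makes progress window by window.
first = build_window(segments, window_end=5, max_keys=3)
second = build_window(segments, window_end=first, max_keys=3)

# A map with room for two keys never gets past the newest segment.
attempts = []
for _ in range(3):
    try:
        attempts.append(build_window(segments, window_end=5, max_keys=2))
    except OffsetMapFull:
        attempts.append("stuck")
```

In the second scenario every retry re-reads the same segment and fails in the same place, because the window start never moves.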

Instead, we should track additional metadata when we've failed to map a single segment, and allow the next compaction to proceed by reading the latter part of the segment.
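One possible reading of this proposal, sketched in Python: remember the offset at which indexing stopped inside the segment, and let the next pass start from there. The issue does not specify the metadata or the API, so `index_from`, the resume offset, and `max_keys` are all assumptions made for illustration.

```python
def index_from(records, resume_offset, max_keys):
    """Index records at or after resume_offset until the map is full.

    records: (key, offset) pairs in offset order for one segment.
    Returns (latest, max_indexed, next_resume); next_resume is None once
    the end of the segment has been reached.
    """
    latest = {}
    max_indexed = None
    for key, offset in records:
        if offset < resume_offset:
            continue  # already handled by an earlier pass
        if key not in latest and len(latest) == max_keys:
            # Map is full: record where the next pass should pick up.
            return latest, max_indexed, offset
        latest[key] = offset
        max_indexed = offset
    return latest, max_indexed, None


segment = [("a", 0), ("b", 1), ("a", 2), ("c", 3), ("b", 4), ("d", 5)]

# Each pass stands in for one compaction run with a two-key map.
passes = []
resume = 0
while resume is not None:
    latest, max_indexed, resume = index_from(segment, resume, max_keys=2)
    passes.append((latest, max_indexed, resume))
```

Each pass indexes a different slice of the segment, so successive compactions advance through it instead of failing at the same point. How each partial map would then be applied during deduplication is left open here, as it is in the issue.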

JIRA Link: CORE-1862

andrwng commented 8 months ago

Just some notes on a couple of instances we've seen here. The log line to watch out for is:

...failed to build offset map. Stopping compaction: std::runtime_error

In one case, we're seeing the vectorized_reactor_fstream_reads metric jump up and stay consistently high after the upgrade.

[Screenshot: 2024-03-08 at 2:14:22 PM]

The p50 produce latencies show a noticeable increase after this point.

[Screenshot: 2024-03-08 at 2:13:45 PM]

Observing some other clusters, it doesn't look like windowed compaction always has this effect, but there are at least a couple of cases that show a noticeable bump in produce latency and also have this same failure mode of being unable to build the offset map.

andrwng commented 7 months ago

Marking sev/medium. It doesn't seem like there's danger of cluster instability, but the latency hit can be alarming.

github-actions[bot] commented 1 month ago

This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.

github-actions[bot] commented 1 month ago

This issue was closed due to lack of activity. Feel free to reopen if it's still relevant.