cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

kv,storage,ui: add better Compaction metrics #46389

Open irfansharif opened 4 years ago

irfansharif commented 4 years ago

...and possibly delete the Compactor Queue as it exists today.

tl;dr: The Compaction Queue chart we expose in our UI is not very useful to look at, and we could do better.

The Compactor is the mechanism we have in place today that allows us to suggest compactions, on demand, to the underlying storage engine. We typically make use of this when we know we are generating a lot of garbage (for instance, when a store accepts a bunch of new replicas that overlap with existing ones during a decommissioning process). The Compactor periodically goes through received suggestions and instructs RocksDB/Pebble to compact data on disk, as appropriate. Note that this is not strictly necessary: RocksDB/Pebble will carry out compactions over time as needed; the Compactor exists to proactively reclaim space when possible.
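To make the shape of this concrete, here's a minimal sketch of the suggest-then-process flow described above. All names (`Suggestion`, `Compactor`, the `compact` callback) are illustrative stand-ins, not the actual CockroachDB types; the real processing loop is timer-driven, whereas here it's an explicit method.

```go
package main

import "fmt"

// Suggestion is a hypothetical, simplified compaction suggestion: a key
// span plus the number of bytes we expect to be reclaimable there.
type Suggestion struct {
	StartKey, EndKey string
	Bytes            int64
}

// Compactor collects suggestions on demand and periodically hands them
// to the storage engine.
type Compactor struct {
	suggestions []Suggestion
}

// Suggest records a compaction suggestion for later processing.
func (c *Compactor) Suggest(s Suggestion) {
	c.suggestions = append(c.suggestions, s)
}

// Process asks the engine to compact each suggested span, then clears
// the queue. The compact callback stands in for a RocksDB/Pebble
// manual-compaction call over a key range.
func (c *Compactor) Process(compact func(start, end string)) int {
	n := len(c.suggestions)
	for _, s := range c.suggestions {
		compact(s.StartKey, s.EndKey)
	}
	c.suggestions = nil
	return n
}

func main() {
	var c Compactor
	c.Suggest(Suggestion{StartKey: "a", EndKey: "m", Bytes: 1 << 20})
	c.Suggest(Suggestion{StartKey: "m", EndKey: "z", Bytes: 2 << 20})
	processed := c.Process(func(start, end string) {
		fmt.Printf("compacting [%s, %s)\n", start, end)
	})
	fmt.Println("processed", processed, "suggestions")
}
```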

The Compaction Queue graph, perhaps confusingly, records the view of the world as seen by the Compactor, not as seen by RocksDB/Pebble. A "suggestion" to the Compactor is recorded in queued bytes (as seen in the UI at the time of writing), and only when the Compactor processes what was suggested to it does it decrement the queued bytes metric. It does not periodically poll the underlying storage engine to reflect what RocksDB/Pebble thinks this value should be (say, "estimated reclaimable space"); it only records the state of the suggestions received thus far. This does not seem to be a useful metric to track. It also only updates on demand, when new compaction suggestions arrive, and it does not react to changes to the cluster settings pertaining to the Compactor (compactor.{max_record_age,threshold_{bytes,{available,used}_fraction}}).
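The failure mode falls out of that design: the gauge only moves when a suggestion arrives or is processed, never from the engine's side. A hedged sketch of that update discipline (names are illustrative, not the actual metric plumbing):

```go
package main

import "fmt"

// queueMetric mimics the Compaction Queue gauge: it tracks only the
// Compactor's own bookkeeping, never the engine's view of reclaimable
// space.
type queueMetric struct {
	queuedBytes int64
}

// onSuggestion is the only code path that moves the gauge up...
func (m *queueMetric) onSuggestion(bytes int64) { m.queuedBytes += bytes }

// ...and onProcessed the only one that moves it down. If queued
// suggestions are too small or fractured to act on, neither fires, and
// the gauge sits at a stale high value indefinitely.
func (m *queueMetric) onProcessed(bytes int64) { m.queuedBytes -= bytes }

func main() {
	var m queueMetric
	m.onSuggestion(512 << 20) // suggestions arrive: gauge climbs
	// Nothing is actionable, so onProcessed never runs; the gauge
	// reports 512 MiB queued regardless of what the engine reclaims.
	fmt.Println(m.queuedBytes>>20, "MiB queued")
}
```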

In https://github.com/cockroachlabs/support/issues/385 we observed a supposedly "wedged" compaction graph that was in fact simply out of date, not updating itself because it hadn't received any compaction suggestions for some time. Because all the suggested compactions were fractured/small, and thus unactionable, the graph persistently displayed a high queued bytes amount.

For the reasons above, I think what we want is closer to https://github.com/cockroachdb/cockroach/issues/41265 and https://github.com/cockroachdb/cockroach/issues/43965, possibly exposing rocksdb.estimated-pending-compaction as a first-class UI citizen instead (and/or the Pebble equivalent). The Compaction Queue graph, as it stands today, offers no visibility into anything we would be interested in (and is also usually out of date).
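The key difference with that approach is that the UI metric would be read from the engine on each scrape rather than maintained as a separate counter. A minimal sketch, assuming a hypothetical `Engine` interface whose `EstimatedPendingCompactionBytes` stands in for the rocksdb.estimated-pending-compaction property (or its Pebble equivalent):

```go
package main

import "fmt"

// Engine abstracts the storage engine. EstimatedPendingCompactionBytes
// is a hypothetical accessor for the engine's own estimate of bytes
// awaiting compaction.
type Engine interface {
	EstimatedPendingCompactionBytes() int64
}

// fakeEngine is a stub for demonstration.
type fakeEngine struct{ pending int64 }

func (e fakeEngine) EstimatedPendingCompactionBytes() int64 { return e.pending }

// pollPendingCompaction is what a first-class UI gauge would do: ask
// the engine directly on every scrape, so the value can never go stale
// the way a suggestion-driven counter can.
func pollPendingCompaction(e Engine) int64 {
	return e.EstimatedPendingCompactionBytes()
}

func main() {
	e := fakeEngine{pending: 64 << 20}
	fmt.Println(pollPendingCompaction(e)>>20, "MiB pending compaction")
}
```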


As for the removal of the Compactor Queue in its entirety: I think it was introduced as an attempt to reclaim garbage on demand and to control RocksDB compaction behavior, but I'm not sure whether (a) we need such a thing, and (b) it's effective at doing said thing. Seems to me that if we have problems around garbage reclamation, we should be addressing them at the storage layer, not at KV.

We currently persist received suggestions if we're unable to act on them immediately, in the hope that future suggestions over larger intervals can be merged with them. I'm unsure if this happens often, or if it does, when. Suggestions are also deleted after 24hrs (which comes back to (b), the effectiveness of it all).
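For illustration, here is one way such interval merging could work: coalesce overlapping or adjacent suggested key spans into maximal intervals, so several small, unactionable suggestions can become one larger, actionable one. This is a sketch of the assumed behavior, not the actual Compactor code.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a simplified suggested key interval.
type span struct{ start, end string }

// mergeSpans sorts spans by start key, then coalesces any that overlap
// or touch into single wider intervals.
func mergeSpans(spans []span) []span {
	if len(spans) == 0 {
		return nil
	}
	sort.Slice(spans, func(i, j int) bool { return spans[i].start < spans[j].start })
	out := []span{spans[0]}
	for _, s := range spans[1:] {
		last := &out[len(out)-1]
		if s.start <= last.end { // overlapping or adjacent: extend
			if s.end > last.end {
				last.end = s.end
			}
		} else { // disjoint: start a new interval
			out = append(out, s)
		}
	}
	return out
}

func main() {
	merged := mergeSpans([]span{{"a", "c"}, {"b", "f"}, {"x", "z"}})
	fmt.Println(merged) // [{a f} {x z}]
}
```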

Jira issue: CRDB-5091

irfansharif commented 4 years ago

@petermattis: How much of this work do you reckon falls under storage? We're thinking of this as a potential starter intern project for KV, and the "rip out the compactor queue" portion of it all is doable enough, but I'm not sure if we'll end up putting something back into Pebble. Thoughts?

petermattis commented 4 years ago

I think this almost entirely lands on the storage team. @jbowens is actively doing the work inside of Pebble (which is significant in size), and I was expecting to leave the glory of ripping out the Compactor Queue to him.

jbowens commented 4 years ago

@petermattis Seems like the actual removal of the compactor queue would need to wait until the removal of RocksDB? Is that true?

petermattis commented 4 years ago

> @petermattis Seems like the actual removal of the compactor queue would need to wait until the removal of RocksDB? Is that true?

Correct, though we could arrange for the compactor queue to only be enabled for RocksDB. It is probably also worthwhile to get rid of the compactor queue metrics from the admin UI. Those are usually a source of confusion as people conflate them with the Pebble/RocksDB compaction metrics.

petermattis commented 4 years ago

The Compactor queue is disabled for Pebble, and the graphs have been removed regardless of storage engine in 20.2. One bit left to do here is to figure out if we should add additional graphs around compactions. For example, @itsbilal's newly added metric around in-progress compaction disk usage.

mwang1026 commented 2 years ago

@bananabrick this popped on our radar as something you might be interested in taking on / thinking about as a part of the compaction work you have in progress. happy to chat more offline