cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.85k stars 3.77k forks source link

gc: latchless garbage collection breaks MVCCStats `KeyBytes` accounting #92117

Open aliher1911 opened 1 year ago

aliher1911 commented 1 year ago

Describe the problem

Latchless garbage collection introduced in 22.2 removes interference between GC and readers and writers by ensuring that readers never read below GC threshold and writers always write above it.

Unfortunately we have another point of contention - MVCCStats. And currently changes to the stats are not commutative. We only count the size of a key once for the whole history (because we want to reflect prefix compression) and count the size of timestamp for every key. To maintain this invariant, when key is added for the first time, its size is added to MVCCStats and when last version is removed by GC (MVCC tombstone was placed and age of tombstone exceeds gc.ttl) then key size is subtracted from MVCCStats.

Since we removed latches from GC requests, they could now be evaluated in parallel to writes for the same key. GC request would iterate key versions and determines that it removed last version of the key. Put request on the other hand still sees previous tombstone and decides that it is not a first key in history. GC request would then decrease MVCCStats values, but Put request would not increment them.

To Reproduce

It is not easy to reproduce in the live system, but test might be possible using TestingKnobs that would pause request executions it correct points. Issue seems like a classic race condition.

Expected behavior A clear and concise description of what you expected to happen.

Environment:

Additional context The effect of incorrect stats is not significant. Since it is quite an unlikely event, difference would not skew things like GC or range split decisions significantly. Consistency checker will flag it, but won't cause nodes to panic.

Jira issue: CRDB-21586

erikgrinaker commented 1 year ago

The preferred solution here is to account for the key contribution for each individual version, rather than once per user key: https://github.com/cockroachdb/cockroach/issues/92254.

erikgrinaker commented 1 year ago

For reference, latchless GC was introduced in https://github.com/cockroachdb/cockroach/pull/83213.