cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

kv: perturbation test for intent resolution #135937

Open andrewbaptist opened 14 hours ago

andrewbaptist commented 14 hours ago

Is your feature request related to a problem? Please describe. In a customer case, we saw almost 10 billion intents created across multiple ranges over a 5-hour window by a single INSERT INTO ... SELECT FROM statement that was ultimately aborted. We should create a perturbation test that simulates this behavior and validates whether the intents are cleaned up by either a conflicting higher-priority transaction or the mvcc gc queue.

Describe the solution you'd like We have seen large availability outages due to LSM inversion when the mvcc gc queue cleaned up these intents.

Describe alternatives you've considered There is an existing test, registerIntentResolutionOverload, which checks that the LSM does not become overloaded. We can replace this test with a perturbation test.

Jira issue: CRDB-44796

kvoli commented 14 hours ago

Removing the O-roachtest label; otherwise this ends up in the roachtest triage queue.