cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.13k stars 3.81k forks source link

crosscluster/logical: improve retry behavior during batch ingestion #135272

Open msbutler opened 4 hours ago

msbutler commented 4 hours ago

Currently, if a single kv in a batch loses last write wins or had an unexpected previous value, the whole batch is retried, where each row (i.e. pk kv + secondary index kvs) is committed in their own txn. To improve LDR ingestion throughput, we need to avoid this debatching behavior either by:

  1. Improve heuristics that decide whether to batch a set of kvs. In other words, given a set of kvs, the heuristics can decide not to batch them if the batch is doomed to fail. The code already attempts to prevent patching if it detects duplicate keys in the batch, and conducts a form of backoff to prevent future batching after a debatching event. We suspect these heuristics can only go so far.
  2. Teach the kvTableWriter (and the stack below), to better handle errors within a batch. Here are some possible ideas that build on top of each other: a. If a whole row (i.e. the pk kv and its secondary index kvs) failed, then commit the batch. If a row partially succeeded, then abort the whole batch and use the slow single row batch fallback. If the row fully failed, send the row back to the client to retry. We believe this improved policy will gracefully handle duplicate keys that the rangefeed emits randomaly throughout the replicating key space. b. Don't abort the txn if we observed a partially completed row. Instead, "undo" the successful cputs on the partially successful row, and commit the txn. Retry the unsuccessful rows separately.

Two more ideas discussed that I don't quite fully follow:

Other thoughts in the back of my mind: staged mode stinks.

Jira issue: CRDB-44431

blathers-crl[bot] commented 4 hours ago

cc @cockroachdb/disaster-recovery