cockroachdb / cockroach


workload: handle ambiguous errors #107571

Open renatolabs opened 1 year ago

renatolabs commented 1 year ago

Clients are expected to handle ambiguous errors; for instance, see our own documentation:

Despite that, our very own workload does not handle these errors, leading to occasional roachtest failures that should not have happened.

We want workload to handle these errors whenever possible, retrying when it is safe to do so. It may not be possible to stop every such error from bubbling up to the caller, and each workload should be updated independently.
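As a rough illustration of "retrying when safe to do so", here is a minimal Go sketch. It is not taken from the workload code. It assumes that ambiguous results surface with SQLSTATE `40003` (statement_completion_unknown) and that the driver error exposes a `SQLState() string` method. The names `sqlStateError`, `fakeErr`, `isAmbiguous`, and `runOp` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// sqlStateError is a hypothetical interface. The sketch assumes the driver's
// server error type exposes its SQLSTATE through a method of this shape.
type sqlStateError interface {
	error
	SQLState() string
}

// ambiguousSQLState is an assumption: statement_completion_unknown.
const ambiguousSQLState = "40003"

// fakeErr is a stand-in for a driver error, used only for illustration.
type fakeErr struct{ code string }

func (e fakeErr) Error() string    { return "sqlstate " + e.code }
func (e fakeErr) SQLState() string { return e.code }

// isAmbiguous reports whether err (or an error it wraps) carries the
// ambiguous-result SQLSTATE.
func isAmbiguous(err error) bool {
	var se sqlStateError
	return errors.As(err, &se) && se.SQLState() == ambiguousSQLState
}

// runOp runs op and retries it on an ambiguous error only when the caller
// declares the operation idempotent. Any other error, or an ambiguous error
// on a non-idempotent operation, is returned to the caller unchanged.
func runOp(op func() error, idempotent bool, maxRetries int) error {
	var err error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		err = op()
		if err == nil || !isAmbiguous(err) || !idempotent {
			return err
		}
	}
	return fmt.Errorf("giving up after %d retries: %w", maxRetries, err)
}
```

The `idempotent` flag is the point of the sketch: an ambiguous error means the statement may or may not have committed, so a blind retry is only safe for operations whose repeated application has no additional effect. Each workload would have to decide this per operation.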

Jira issue: CRDB-30113

blathers-crl[bot] commented 1 year ago

cc @cockroachdb/test-eng

srosenberg commented 1 year ago

Since this has come up in the context of performance benchmarks, we should be careful not to retry excessively. In some sense, a benchmark becomes tainted, since retries could lead to performance degradation.
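One way to keep retries from silently skewing a benchmark is to cap them per operation and count them, so that a run can be flagged when the retry ratio is too high. The Go sketch below shows that idea only. The type `retryBudget`, the sentinel `errAmbiguous`, and the ratio threshold are hypothetical and do not reflect how workload or roachtest actually report results.

```go
package main

import (
	"errors"
	"sync/atomic"
)

// errAmbiguous is a stand-in sentinel for an ambiguous-result error.
var errAmbiguous = errors.New("result is ambiguous")

// retryBudget caps retries per operation and counts all retries performed,
// so the benchmark can report them alongside its latency numbers.
type retryBudget struct {
	maxPerOp int
	retries  atomic.Int64 // safe for concurrent workers (Go 1.19+)
}

// do runs op, retrying at most maxPerOp times while it returns errAmbiguous.
func (b *retryBudget) do(op func() error) error {
	err := op()
	for i := 0; i < b.maxPerOp && errors.Is(err, errAmbiguous); i++ {
		b.retries.Add(1)
		err = op()
	}
	return err
}

// tainted reports whether the retries-to-operations ratio exceeds maxRatio,
// in which case the run's numbers should be treated with suspicion.
func (b *retryBudget) tainted(totalOps int64, maxRatio float64) bool {
	if totalOps == 0 {
		return false
	}
	return float64(b.retries.Load())/float64(totalOps) > maxRatio
}
```

Counting retries instead of hiding them leaves the decision to the benchmark owner: the run still completes, but its results can be discarded or annotated when the budget was exceeded.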

andrewbaptist commented 2 months ago

At least one type of "incorrect ambiguous" error is described here: https://github.com/cockroachdb/cockroach/issues/129427. The short-term fix for that issue is to tolerate the errors, but that isn't a great general fix.