cockroachdb / cockroach


kvcoord: DistSender circuit breaker cancellation won't work on local requests #121210

Open erikgrinaker opened 5 months ago

erikgrinaker commented 5 months ago

The DistSender circuit breakers cancel in-flight requests when the breaker trips, so that they can be retried on a different replica instead of getting stuck. However, context cancellation may be ineffective for requests that are processed locally, since it is not respected by disk IO, syscalls, mutex acquisition, etc.

We should see if there is a way to fix this -- a straightforward but likely too expensive option is to spawn a separate goroutine for the request processing, where the client selects on a result channel along with the context channel. Another option is to use e.g. async IO, or lobby the Go team to add context support for various APIs such as IO and mutexes.

We likely can't do anything about this for 24.1, but I'm marking it as a GA blocker for visibility.

Jira issue: CRDB-37140

Epic: CRDB-39897

arulajmani commented 5 months ago

> We likely can't do anything about this for 24.1

Removing the GA-blocker label as a result.