[DocDB] Multi-tablet fairness issue: Advance read time in reads/writes transparently at the tserver in READ COMMITTED

yugabyte / yugabyte-db

[DocDB] Multi-tablet fairness issue: Advance read time in reads/writes transparently at the tserver in READ COMMITTED #16154

Open robertsami opened 1 year ago

robertsami commented 1 year ago

Jira Link: DB-5588

Description

If a READ COMMITTED request encounters a conflict at the tserver, the tserver returns a kConflict error to the client, which then picks a new read time and retries the request. This is undesirable for a couple of reasons:

The additional round trip increases latency.
We cannot take advantage of request-level fairness at the tserver under contention, so we may see higher statement-level p99 latency.

Instead, we should allow the tserver to advance a request's read time itself when it encounters a conflict.
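To make the round-trip difference concrete, here is a minimal toy model of the two behaviors. It is only a sketch: `ToyTablet`, `client_retry`, and `tserver_advance` are hypothetical names, not YugabyteDB code, and the integer clock and the "last commit is newer than the read time" conflict check are simplifying assumptions. Only the kConflict error name comes from the description above.

```python
from dataclasses import dataclass, field

K_CONFLICT = "kConflict"  # error name from the issue; all other names are hypothetical


@dataclass
class ToyTablet:
    """Toy stand-in for a tablet: an integer clock plus last commit time per key."""
    now: int = 0
    last_commit: dict = field(default_factory=dict)

    def commit(self, key):
        # A concurrent writer commits, moving the clock forward.
        self.now += 1
        self.last_commit[key] = self.now

    def handle(self, key, read_time, advance_on_conflict=False):
        """Return (status, read_time_used) for one request."""
        if self.last_commit.get(key, 0) > read_time:
            if not advance_on_conflict:
                return K_CONFLICT, read_time
            # Proposed behavior: pick a fresh read time locally and continue.
            read_time = self.now
        return "ok", read_time


def client_retry(tablet, key, read_time):
    """Current behavior: every kConflict costs one more round trip."""
    round_trips = 0
    while True:
        round_trips += 1
        status, used = tablet.handle(key, read_time)
        if status == "ok":
            return used, round_trips
        # Simplification: the client takes its new read time from the toy clock.
        read_time = tablet.now


def tserver_advance(tablet, key, read_time):
    """Proposed behavior: the conflict is resolved inside a single round trip."""
    _, used = tablet.handle(key, read_time, advance_on_conflict=True)
    return used, 1
```

In this model, a request that arrives with a stale read time needs two round trips under client-side retry and one when the tablet advances the read time itself. The model does not capture the fairness aspect, which depends on how the tserver queues waiting requests.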

robertsami commented 1 year ago

A general fix likely depends on https://github.com/yugabyte/yugabyte-db/issues/11573

However, in the special case where a statement is served by a single request to a single tablet, we could transparently advance the read time at the tserver. This may be sufficient for use cases such as single-row explicit locks, and may substantially benefit workloads with a large amount of contention of this form.
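The special case can be read as a simple eligibility check. The sketch below is hypothetical throughout: `Request`, `can_advance_at_tserver`, and `on_conflict` are illustrative names, and the model assumes each request targets exactly one tablet, so "one request" implies "one tablet".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical request descriptor; each request targets exactly one tablet."""
    tablet_id: str


def can_advance_at_tserver(requests):
    """True when the statement is served by a single request to a single tablet."""
    return len(requests) == 1


def on_conflict(requests):
    """Choose how a conflict is handled for a statement's requests."""
    if can_advance_at_tserver(requests):
        return "advance_read_time_at_tserver"
    # General case: fall back to returning kConflict so the client retries.
    return "return_kConflict_to_client"
```

For example, a single-row explicit lock that maps to one request takes the first branch, while a statement that fans out to several tablets keeps the existing client-side retry.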

See: https://github.com/yugabyte/yugabyte-db/issues/16418

robertsami commented 1 year ago

This depends on https://github.com/yugabyte/yugabyte-db/issues/11573, which is not planned for the short or medium term. We are therefore opting for a more surgical optimization in https://github.com/yugabyte/yugabyte-db/issues/18055 and moving this item to the backlog to track the broader improvement for the far future.