cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

catalog/replication: handle big description version changes more safely #130812

Closed fqazi closed 1 month ago

fqazi commented 1 month ago

Presently, with the PCR reader catalog, if a schema change happens on the source cluster, we can end up jumping multiple versions of a descriptor on the next update. On a normal SQL node, a schema change steps through one version at a time, which in the worst case means that you could end up mixing the previous and next versions of a descriptor.

The lease manager guarantees that transactions running with older timestamps will only see descriptor versions from around the time they started, by ensuring that the modification/MVCC timestamp of the descriptor and the read timestamp align. However, when new descriptors arrive, the range feed used by the lease manager may cause you to pick them up one at a time, effectively allowing windows in which you're mixing descriptor versions.
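To make the window concrete, here is a minimal Go sketch of the problem, not the actual lease manager code. `Update` and `MixedWindows` are hypothetical names invented for illustration, and a descriptor cache is modeled as a plain map from descriptor ID to version. The function applies range-feed style updates one at a time and counts the intermediate states that match neither the old snapshot nor the new one.

```go
package main

// Update is a hypothetical range-feed event: descriptor ID is now at Version.
type Update struct {
	ID      int
	Version int
}

// sameVersions reports whether two ID -> version maps hold identical entries.
func sameVersions(a, b map[int]int) bool {
	if len(a) != len(b) {
		return false
	}
	for id, v := range a {
		if w, ok := b[id]; !ok || w != v {
			return false
		}
	}
	return true
}

// cloneVersions returns a copy so the caller's map is never mutated.
func cloneVersions(m map[int]int) map[int]int {
	out := make(map[int]int, len(m))
	for id, v := range m {
		out[id] = v
	}
	return out
}

// MixedWindows applies updates one at a time and returns how many
// intermediate states equal neither the starting snapshot nor the final one.
// Each such state is a window in which a reader could mix descriptor versions.
func MixedWindows(start map[int]int, updates []Update) int {
	// Compute the fully caught-up snapshot first.
	final := cloneVersions(start)
	for _, u := range updates {
		final[u.ID] = u.Version
	}

	cur := cloneVersions(start)
	mixed := 0
	for _, u := range updates {
		cur[u.ID] = u.Version
		if !sameVersions(cur, start) && !sameVersions(cur, final) {
			mixed++
		}
	}
	return mixed
}
```

Under this model, two descriptors that both jump from version 1 to version 4 produce one mixed window (one descriptor at 4, the other still at 1), and the number of windows grows with the number of descriptors updated together.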

To address this, we will intentionally run read catalog queries with an AOST timestamp selected based on when the leasing subsystem is up to date. Additionally, the descriptor collection will assert that all timestamps match.
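A rough sketch of that idea in Go, under stated assumptions: `Timestamp`, `Descriptor`, `SafeReadTimestamp`, and `Collection` are hypothetical simplifications invented here, not CockroachDB APIs. It assumes the leasing subsystem can report, per descriptor, the timestamp up to which it has caught up; the AOST timestamp is then the minimum of those, and the collection rejects any descriptor read at a different timestamp.

```go
package main

import (
	"errors"
	"fmt"
)

// Timestamp is a simplified stand-in for an HLC timestamp (assumption).
type Timestamp int64

// Descriptor is a hypothetical, stripped-down catalog descriptor.
type Descriptor struct {
	ID      int
	Version int
}

// SafeReadTimestamp returns the newest timestamp at which every tracked
// descriptor is known to be up to date: the minimum of the per-descriptor
// caught-up timestamps reported by the (assumed) leasing subsystem.
func SafeReadTimestamp(caughtUp map[int]Timestamp) (Timestamp, error) {
	if len(caughtUp) == 0 {
		return 0, errors.New("no descriptors tracked")
	}
	first := true
	var min Timestamp
	for _, ts := range caughtUp {
		if first || ts < min {
			min = ts
			first = false
		}
	}
	return min, nil
}

// Collection holds descriptors that must all be read at one timestamp.
type Collection struct {
	readTS Timestamp
	pinned bool
	descs  map[int]Descriptor
}

// Add stores d and asserts it was read at the collection's timestamp.
// The first descriptor added pins the timestamp for the whole collection.
func (c *Collection) Add(d Descriptor, readTS Timestamp) error {
	if !c.pinned {
		c.readTS = readTS
		c.pinned = true
		c.descs = make(map[int]Descriptor)
	}
	if readTS != c.readTS {
		return fmt.Errorf("descriptor %d read at %d, collection pinned at %d",
			d.ID, readTS, c.readTS)
	}
	c.descs[d.ID] = d
	return nil
}
```

Taking the minimum is the conservative choice in this sketch: a query may read slightly stale data, but it never observes one descriptor from before a multi-version jump and another from after it.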

Jira issue: CRDB-42250

Epic CRDB-37521

blathers-crl[bot] commented 1 month ago

Hi @fqazi, please add branch-* labels to identify which branch(es) this C-bug affects.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.