cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.96k stars 3.79k forks source link

sql: use the table descriptor ADD state correctly #20714

Open vivekmenezes opened 6 years ago

vivekmenezes commented 6 years ago

A new table descriptor being added when referencing another table has to be added properly. An FK reference is implemented using two references (forward from new table to its FK reference, and backward from the FK referee to the new table). The forward reference is obvious, the backward reference on the other hand is used when updating the referee. Updating a referee table for example must ensure that it is not violating the FK reference. This becomes problematic if a node has cached a referee table without the backward reference and SQL is already putting data into the new table.

We solve this problem by placing a newly created table in the ADD state and then transition it into the PUBLIC state once all leased referee descriptors are using the backward link. A descriptor in the ADD state cannot be used to read/write to and so the table descriptor leasing logic disallows a node from holding a lease against such a table.

However one can have a situation where a referee sees a back reference to a table, the local node has a leased version of the new table in the ADD state, but the new table has transitioned to the PUBLIC state and another node has added data into the new table. Under the circumstances the referee must apply the backward FK checks it would normally apply.

This can be resolved by allowing the new table to be leased in the ADD state but ensure that only fk checks and cascade operations can use the descriptor. This can mean that the normal API still disallow returning descriptors in the ADD state while allowing them to be leased internally, and the FK operations use a separate API that allows grabbing references on leased descriptors in the ADD state.

Jira issue: CRDB-5914

vivekmenezes commented 5 years ago

This problem will be resolved by the new schema lease mechanism which will lease the entire schema at a timestamp, thereby ensuring that a node sees the reference and the back-reference in the PUBLIC state.

ajwerner commented 3 years ago

This seems quite real. See https://github.com/cockroachdb/cockroach/blob/025c53decd8eb00cb70a1d410692026c9f8614dc/pkg/sql/opt/optbuilder/mutation_builder_fk.go#L608-L621

ajwerner commented 3 years ago

My current thinking on how to solve this relates to the fact that we validate cross references at the present snapshot when we perform leasing. That means that when we lease a descriptor, we know the modification timestamps and versions of all of the referenced descriptors. If the leased descriptor tracked this, then we could enforce that if any of those referenced descriptors is leased by a transaction, that it must be the newer one. This might pick up false dependencies, but they should be short-lived.

ajwerner commented 2 years ago

The above analysis is barely relevant to the problem at hand. The above protocol might be useful in some other situations, but here I think we indeed need the optimizer to plan foreign key cascades to ADDing tables in order to avoid missing writes. I think we added the necessary pieces in the catalog to resolve the cascades, so really it seems like fixing this can be done from the optimizer.

ajwerner commented 2 years ago

cc @mgartner