cockroachdb / cockroach


sql: deal with dropping unique indexes in transactional schema changes #61084

Open ajwerner opened 3 years ago

ajwerner commented 3 years ago

Is your feature request related to a problem? Please describe.

The problems in this issue arise primarily in the context of online, transactional schema changes.

Today we enforce unique constraints using an index with a special encoding (most of the time; see the UNIQUE WITHOUT INDEX discussion below). In regular secondary indexes we make sure each key is unique by appending the primary key as a suffix. Primary indexes are interesting in that today they have the properties of both unique and non-unique indexes.
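For illustration only, here is a rough Go sketch of the structural difference described above. The key shapes are invented stand-ins, not CockroachDB's actual key encoding:

```go
// Illustrative only: readable strings in place of real binary key encodings.
package main

import "fmt"

func main() {
	// Non-unique secondary index on "city": the primary key ("id") is
	// appended to the index key itself, so two rows with the same city
	// still produce distinct keys.
	nonUniqueKey := fmt.Sprintf("/t/idx_city/%s/%d", "NYC", 42) // PK as key suffix

	// Unique secondary index on "email": the key holds only the indexed
	// value, so a second row with the same email collides on the same key
	// and the write fails; the primary key lives in the value instead.
	uniqueKey := fmt.Sprintf("/t/idx_email/%s", "a@example.com")
	uniqueValue := fmt.Sprintf("id=%d", 42)

	fmt.Println("non-unique entry:", nonUniqueKey)
	fmt.Println("unique entry:    ", uniqueKey, "->", uniqueValue)
}
```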

The wrinkle here comes up in the context of dropping such a constraint. We've learned time and time again that if we're going to avoid problems in schema changes, we need to be able to roll them back until they absolutely cannot fail. That means we need to continue to enforce the constraint, on some level, until the schema change commits.

The crux of the problem here is the two-version invariant.

--time-->
[   v1    ]
    [   v2     ]

It is possible for v1 to still be in use after v2 (which drops a unique constraint) has committed and values which violate that constraint have been written. We would also really like it to be the case that all of the secondary indexes of the table have the same length for a transaction (what would it even mean if they did not?!). So, that means that the writes under v2 that precede reads by v1 cannot utilize the unique index encoding. One really important thing to note here is that as soon as v1 is over, we can just drop that unique index outright; from the perspective of v2 that index is WRITE_ONLY.

In an ideal world, if v1 observes a unique constraint violation at a timestamp after v2 has committed, it would not report that violation. There's ample discussion of this complex scenario in the transactional schema change RFC. However, a more likely interim implementation is to just optionally enforce the unique constraint throughout the duration of v1.

An important piece of context is that the optimizer knows how to enforce unique constraints utilizing non-unique indexes. Another important piece of context is that we sort of have support for secondary indexes to have columns which are not yet public.
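As a loose sketch of that first point: uniqueness can be enforced on top of a non-unique index by checking, before the write, whether any existing entry has the same indexed value but a different primary key. The types and helper below are hypothetical scaffolding, not the optimizer's actual plan or any CockroachDB API:

```go
// Hypothetical sketch of enforcing uniqueness over a non-unique index.
package main

import (
	"errors"
	"fmt"
)

// indexEntry models a non-unique index entry: (indexed value, primary key).
type indexEntry struct {
	val string
	pk  int
}

var errUniqueViolation = errors.New("duplicate key value violates unique constraint")

// checkUnique fails if any existing entry carries the same indexed value as
// the row being written but a different primary key.
func checkUnique(existing []indexEntry, newVal string, newPK int) error {
	for _, e := range existing {
		if e.val == newVal && e.pk != newPK {
			return errUniqueViolation
		}
	}
	return nil
}

func main() {
	idx := []indexEntry{{"a@example.com", 1}, {"b@example.com", 2}}
	fmt.Println(checkUnique(idx, "a@example.com", 3)) // violation
	fmt.Println(checkUnique(idx, "c@example.com", 3)) // nil
}
```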

Describe the solution you'd like

We'd like a valid encoding of the secondary index such that both v1 and v2 can write to the old unique index and such that v1 can read from that index (though not necessarily enforce uniqueness with it). The proposal here is that we transition that unique index to being a non-unique index but teach the execution layer that the primary key suffix may or may not be present.
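A minimal sketch of what tolerating an optional primary-key suffix could look like on the read path, again using made-up string keys rather than the real encoding:

```go
// Hypothetical decoder that accepts both the old unique encoding (no suffix)
// and the new non-unique encoding (primary-key suffix present).
package main

import (
	"fmt"
	"strings"
)

// decodeEntry parses an illustrative index key of the form
// "idx/<value>" (old unique encoding) or "idx/<value>/<pk>"
// (non-unique encoding written once the index has transitioned).
func decodeEntry(key string) (val, pk string, hasSuffix bool) {
	parts := strings.Split(strings.TrimPrefix(key, "idx/"), "/")
	if len(parts) == 1 {
		return parts[0], "", false
	}
	return parts[0], parts[1], true
}

func main() {
	for _, k := range []string{"idx/a@example.com", "idx/a@example.com/42"} {
		v, pk, ok := decodeEntry(k)
		fmt.Printf("key=%q value=%q pk=%q suffixPresent=%v\n", k, v, pk, ok)
	}
}
```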

The biggest challenge here will be the bookkeeping related to keeping referential integrity in sync. Practically speaking, what we'd like is for the v1 implementation to continue to write using the unique index encoding while its reads are tolerant of the suffix. That sounds darn complex. A better approach is to recognize that if the v1 transaction does observe the suffix, then it knows that the schema change committed and that it can (and should) just restart above the timestamp of the write carrying that suffix. That, combined with code that intercepts the restart error and makes sure the lease manager refreshes the descriptor to at least that timestamp, would make for really graceful handling of this edge case.
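To make that restart idea concrete, here's a hand-wavy sketch of the control flow; every name and type below is hypothetical, not an existing CockroachDB API:

```go
// Hypothetical restart path: observing the suffix under v1 means the schema
// change committed, so return a retryable error instead of a spurious
// unique-constraint violation.
package main

import (
	"errors"
	"fmt"
)

// schemaChangedError is a hypothetical retryable error carrying the timestamp
// of the suffixed write that revealed the newer descriptor version.
type schemaChangedError struct{ writeTS int64 }

func (e *schemaChangedError) Error() string {
	return fmt.Sprintf("descriptor changed at ts=%d; restart transaction", e.writeTS)
}

// readUniqueIndex stands in for the v1 read path: if it sees the suffix, it
// signals a restart rather than reporting a constraint violation.
func readUniqueIndex(suffixPresent bool, writeTS int64) error {
	if suffixPresent {
		return &schemaChangedError{writeTS: writeTS}
	}
	return nil
}

func main() {
	err := readUniqueIndex(true, 105)
	var sce *schemaChangedError
	if errors.As(err, &sce) {
		// Hypothetical recovery: have the lease manager refresh its copy of
		// the descriptor to at least sce.writeTS, then retry the transaction
		// above that timestamp.
		fmt.Println("refresh descriptor lease to ts >=", sce.writeTS, "and retry")
	}
}
```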

What this means for work items:

Additional context

This is related to https://github.com/cockroachdb/cockroach/issues/59149. Both of these issues involve integration points with, and required changes to, query planning and execution to support transactional schema changes.

Jira issue: CRDB-3071

postamar commented 1 year ago

@ajwerner how much of this still applies as of today, knowing what we know now about how we expect the declarative schema changer to perform?

ajwerner commented 1 year ago

This applies to the transactional schema change project, but does not apply to the declarative schema change project which precedes it.