Open ajwerner opened 3 years ago
@ajwerner how much of this still applies as of today, knowing what we know now about how we expect the declarative schema changer to perform?
This applies for the transactional schema change project, but does not apply to the declarative schema change project which precedes it.
Is your feature request related to a problem? Please describe.
The problems in this issue arise primarily in the context of online, transactional schema changes.
Today we enforce unique constraints using an index with a special encoding (most of the time; see the `UNIQUE WITHOUT INDEX` discussion below). In regular secondary indexes we make sure each key is unique by putting the primary key as the suffix. Primary indexes are interesting in that they have the properties today of both the unique and non-unique index.

The wrinkle here comes up in the context of dropping such a constraint. We've learned time and time again that if we're going to avoid problems in schema changes, we need to be able to roll them back until they absolutely cannot fail. That means we need to continue to enforce them, on some level, until they commit.
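To make the two encodings concrete, here is a toy sketch of the difference between a unique and a non-unique secondary index key. This is purely illustrative; the real encoding lives in CockroachDB's `pkg/sql/rowenc` and is far more involved, and the function names here are hypothetical.

```go
package main

import "fmt"

// encodeUniqueIndexKey sketches a unique secondary index entry: the key
// holds only the indexed column values, so two rows with the same indexed
// value would collide on the same key (the primary key lives in the value).
func encodeUniqueIndexKey(indexID string, indexedCols []string) string {
	key := "/" + indexID
	for _, c := range indexedCols {
		key += "/" + c
	}
	return key
}

// encodeNonUniqueIndexKey appends the primary key as a suffix, so the key
// is unique per row even when indexed column values repeat.
func encodeNonUniqueIndexKey(indexID string, indexedCols, pkCols []string) string {
	key := encodeUniqueIndexKey(indexID, indexedCols)
	for _, c := range pkCols {
		key += "/" + c
	}
	return key
}

func main() {
	// Two rows that share the indexed value "bob": under the unique
	// encoding they map to the same key; under the non-unique encoding
	// the primary-key suffix keeps them distinct.
	fmt.Println(encodeUniqueIndexKey("idx", []string{"bob"}))                   // /idx/bob
	fmt.Println(encodeNonUniqueIndexKey("idx", []string{"bob"}, []string{"1"})) // /idx/bob/1
	fmt.Println(encodeNonUniqueIndexKey("idx", []string{"bob"}, []string{"2"})) // /idx/bob/2
}
```

The collision on the unique encoding is exactly what lets the KV layer enforce the constraint for free, and the suffix is what the proposal below has to make optional.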
The crux of the problem here is the two version invariant.
It is possible for `v1` to still be in use after `v2` (which drops a unique constraint) has committed and values which violate that constraint have been written. We would also really like it to be the case that all of the secondary indexes of the table have the same length for a transaction (what would it even mean if they did not?!). So, that means that the writes under `v2` that precede reads by `v1` cannot utilize the unique index encoding. One really important thing to note here is that as soon as `v1` is over, we can just straight up drop that unique index; from the perspective of `v2`, that index is `WRITE_ONLY`. In an ideal world, if `v1` observes a unique constraint violation at a timestamp after `v2` has committed, it would not enforce that violation. There's ample discussion of this complex scenario in the transactional schema change RFC. However, a more likely interim implementation is to just optionally enforce the unique constraint throughout the duration of `v1`.

An important piece of context is that the optimizer knows how to enforce unique constraints utilizing non-unique indexes. Another important piece of context is that we sort of have support for secondary indexes to have columns which are not yet public.
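The idea of enforcing uniqueness on top of a non-unique index can be sketched as a pre-write check: before inserting, scan for any existing entry with the same indexed-column prefix but a different primary-key suffix. This is a simplified stand-in for the optimizer's uniqueness checks, using a toy in-memory index; the type and method names are invented for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// index is a toy non-unique secondary index: each key has the shape
// "/<indexed value>/<primary key>".
type index map[string]struct{}

// insertUnique enforces uniqueness despite the non-unique encoding by
// scanning for any existing entry that shares the indexed-column prefix
// but carries a different primary-key suffix.
func (ix index) insertUnique(indexedVal, pk string) error {
	prefix := "/" + indexedVal + "/"
	for k := range ix {
		if strings.HasPrefix(k, prefix) && k != prefix+pk {
			return fmt.Errorf("duplicate key value: %q", indexedVal)
		}
	}
	ix[prefix+pk] = struct{}{}
	return nil
}

func main() {
	ix := index{}
	fmt.Println(ix.insertUnique("bob", "1")) // <nil>
	fmt.Println(ix.insertUnique("bob", "2")) // duplicate key value: "bob"
}
```

The point is that once such a check exists in the planner, the unique *encoding* stops being the only way to get the unique *constraint*, which is what makes the transition below possible.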
Describe the solution you'd like
We'd like a valid encoding of the secondary index such that both `v1` and `v2` can write to the old unique index and such that `v1` can read from that index (though not necessarily enforce uniqueness with it). The proposal here is that we transition that unique index to being a non-unique index but teach the execution layer that the primary key suffix may or may not be present.

The biggest challenge here will be the bookkeeping related to keeping referential integrity in sync. Practically speaking, what we'd like to happen is that the `v1` implementation can continue to write using the unique index encoding whereas the reads will need to be tolerant of the suffix. That sounds darn complex. A better approach is to recognize that if the `v1` transaction does observe the suffix, then it knows that the schema change committed and that it can (and should) just restart above the timestamp of the write with that suffix. That, combined with some code that intercepts that restart error and makes sure the lease manager refreshes that descriptor to at least that timestamp, would make for really graceful handling of this edge case.

What this means for work items:
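The restart-on-suffix handling described in the paragraph above can be sketched as follows. All names here are hypothetical, and the timestamps are simplified to integers; the real machinery would flow through CockroachDB's retriable-error and lease-manager plumbing.

```go
package main

import "fmt"

// entry is a decoded index entry; hasSuffix reports whether the key
// carried the primary-key suffix, i.e. was written under v2's
// non-unique encoding after the schema change committed.
type entry struct {
	hasSuffix bool
	writeTS   int64 // timestamp of the write, used for the restart
}

// schemaChangedRetryError asks the caller to refresh the descriptor
// lease to at least RestartAbove and retry the transaction there.
type schemaChangedRetryError struct{ RestartAbove int64 }

func (e *schemaChangedRetryError) Error() string {
	return fmt.Sprintf("schema changed; restart above timestamp %d", e.RestartAbove)
}

// readUnderV1 is the tolerant v1 read path: entries without the suffix
// decode as before, while an entry with the suffix proves v2 committed,
// so we surface a retriable error instead of a spurious uniqueness
// violation.
func readUnderV1(e entry) error {
	if e.hasSuffix {
		return &schemaChangedRetryError{RestartAbove: e.writeTS}
	}
	return nil // decode using the unique-index encoding as before
}

func main() {
	fmt.Println(readUnderV1(entry{hasSuffix: false}))            // <nil>
	fmt.Println(readUnderV1(entry{hasSuffix: true, writeTS: 7})) // schema changed; restart above timestamp 7
}
```

A layer that intercepts `schemaChangedRetryError` would then refresh the lease to at least `RestartAbove` before retrying, which is the graceful handling the paragraph above describes.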
Additional context
This is related to https://github.com/cockroachdb/cockroach/issues/59149. Both of these issues involve integration points and required changes with query planning and execution to support transactional schema changes.
Jira issue: CRDB-3071