cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.09k stars 3.8k forks source link

sql/backfill: prevent index backfill from evaluating new volatile key columns #81521

Open ajwerner opened 2 years ago

ajwerner commented 2 years ago

Is your feature request related to a problem? Please describe.

Follow-up from https://github.com/cockroachdb/cockroach/pull/81486. The index backfiller can happily evaluate default expressions for new columns which will be part of a new primary index. This is critical to enable https://github.com/cockroachdb/cockroach/issues/47989. However, if the expressions being evaluated are volatile and are part of the key, then restarts will no longer be idempotent and will cause a corrupt index.

Describe the solution you'd like

We should return an assertion failure error if we ever try to execute such an index backfill. We should know the properties of the expressions at backfill time, so it's not hard to add.

Additional context

Relates both to https://github.com/cockroachdb/cockroach/issues/81449 and https://github.com/cockroachdb/cockroach/issues/81448.

When we want to support the following, we'll need to be smarter in how we plan. Namely, we need to first add the column to a primary index but not as a key column, take that all the way to at least validated, then go and build the index with the now materialized k being read from the new source.

CREATE TABLE t (i INT);
ALTER TABLE t ADD COLUMN k UUID DEFAULT (gen_random_uuid()) PRIMARY KEY

Jira issue: CRDB-15105

ajwerner commented 2 years ago

I will fill in more details on how to solve this in code and then pass it along.

ajwerner commented 2 years ago

This issue is just about adding a check that we don't perform dangerous volatile expression evaluation in the index backfiller for key columns. Volatile expression evaluation should only be for stored columns. If we need to backfill a volatile new column into an index key, it must first be rendered into a stored column in some index and then that should be the source for the index backfiller.

postamar commented 1 year ago

The declarative schema changer should reliably evaluate this expression for a stored column so am I correct in understanding that this only concerns the legacy schema changer? In the legacy schema changer ADD COLUMN ... PRIMARY KEY won't work. Is ADD COLUMN ... UNIQUE affected here? I'm trying to think how we're going to test this, or whether to even bother.

Throwing the assertion error seems pretty straightforward, though.

ajwerner commented 1 year ago

This is about adding a safe-guard. This doesn't currently affect anything because we don't support adding a column. I think a unit test in the backfill logic would be straightforward enough.