cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.21k stars 3.82k forks source link

kv: audit usages of admin split #110934

Closed kvoli closed 1 year ago

kvoli commented 1 year ago

The AdminSplit db API is notoriously poor at ensuring that the provided split key argument is safe to actually split at. The split db command relies on the caller to specify a safe SplitKey argument.

This issue is to audit all usages of the AdminSplit command, and document:

  1. How are the SplitKey arguments derived.
  2. Is the SplitKey guaranteed to be safe?
  3. Are there protections in AdminSplit methods for possibly unsafe SplitKey's noted above, if not - could there be.

Jira issue: CRDB-31679

kvoli commented 1 year ago

Usages to audit are listed below. The usage were gathered by finding references to db.AdminSplit using an LSP, then filtering out test files.

Details - [x] https://github.com/cockroachdb/cockroach/blob/f9ccf8ec425e216852b0988782826df0d88691b1/pkg/ccl/backupccl/generative_split_and_scatter_processor.go#L154-L154 1. Using the `KeyRewriter` when changing the table id. Split key comes from the restore span start key. 2. Ignores the result of `EnsureSafeSplitKey` when returning an error, using the original key. 3. No protections against split in-between SQL column family. - [x] https://github.com/cockroachdb/cockroach/blob/f9ccf8ec425e216852b0988782826df0d88691b1/pkg/ccl/backupccl/restore_job.go#L3303-L3303 context: experimental online restore. 1. Created from either the start key of the file being restored, or the restore span within the file, or the start of the restore span. No rewriting modification. 2. Doesn't alter the key from the restored span, doesn't call `EnsureSafeSplitKey`. The split key is not guaranteed to be safe, however the backup/restore would need to contain unsafe split keys. 3. No protections against split in-between SQL column family.
Details - [x] https://github.com/cockroachdb/cockroach/blob/f9ccf8ec425e216852b0988782826df0d88691b1/pkg/ccl/streamingccl/streamingest/stream_ingestion_dist.go#L255 1. Key is taken from the original cluster (c2c) split point, rekeyed. 2. Doesn't call `EnsureSafeSplitKey`, can replicate bad splits. 3. If the original cluster key were between SQL column families, there's no protection.
Details - [x] https://github.com/cockroachdb/cockroach/blob/f9ccf8ec425e216852b0988782826df0d88691b1/pkg/sql/tenant_creation.go#L237-L237 context: Split point created for a new tenant. 1. Created using the tenant span start key and tenant span end key. https://github.com/cockroachdb/cockroach/blob/13d81cabaee7600315247e4855d503565d2a527b/pkg/sql/catalog/bootstrap/metadata.go#L245-L245 2. Yes, it will split at tenant boundaries. - [x] https://github.com/cockroachdb/cockroach/blob/f9ccf8ec425e216852b0988782826df0d88691b1/pkg/sql/index_split_scatter.go#L61-L61 context: split for index 1. Start key of the index 2. Yes, the split will occur on an index boundary. - [x] https://github.com/cockroachdb/cockroach/blob/f9ccf8ec425e216852b0988782826df0d88691b1/pkg/sql/schema_changer.go#L3176-L3176 1. split and scatter for backfills on sharded indexes 2. Partition key and shard - [x] https://github.com/cockroachdb/cockroach/blob/f9ccf8ec425e216852b0988782826df0d88691b1/pkg/sql/backfill.go#L2262-L2262 context index backfill 1. Index? 2. Yes, the split will occur on an index boundary. - [x] https://github.com/cockroachdb/cockroach/blob/f9ccf8ec425e216852b0988782826df0d88691b1/pkg/sql/split.go#L64-L64 1. Takes a row key 2. Yes, uses a table key to begin with. - [x] https://github.com/cockroachdb/cockroach/blob/f9ccf8ec425e216852b0988782826df0d88691b1/pkg/sql/importer/import_processor_planning.go#L362-L362 context: pre-splits on table boundaries 1. Takes each of the index spans for a table 2. Yes, uses the `table/index` prefix
Details - [x] https://github.com/cockroachdb/cockroach/blob/f9ccf8ec425e216852b0988782826df0d88691b1/pkg/kv/bulk/sst_batcher.go#L601-L601 1. Uses an existing key within an SST. 2. Yes, logs a warning if `EnsureSafeSplitKey` returns an error and doesn't split. - [x] https://github.com/cockroachdb/cockroach/blob/f9ccf8ec425e216852b0988782826df0d88691b1/pkg/kv/bulk/buffering_adder.go#L420-L420 1. Uses a key at (split stride) which has been added to the buffer. 2. Yes, logs a warning if `EnsureSafeSplitKey` returns an error and doesn't split.
kvoli commented 1 year ago

Edited above comment for a first pass at the admin split usages.

kvoli commented 1 year ago

What protections does AdminSplit have when a manual key is given, to protect against the key being a "bad" split which will cause queries to return incorrect data?