cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.
https://www.cockroachlabs.com

sql: change `system.table_statistics` to honor fine grained data domiciliation rules over indexed data #70915

Open knz opened 2 years ago

knz commented 2 years ago

Describe the problem

The table statistics stored in system.table_statistics contain a copy of table data in the histogram column.

When a table's data is constrained to certain regions (data domiciliation), system.table_statistics is not constrained in the same way, so the data "escapes" the region it should remain in.

This makes it impossible to do strict data sovereignty partitioning using multi-region CockroachDB when domiciled data is indexed. (The issue does not exist when domiciled data is not indexed.)
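To illustrate why a histogram amounts to a copy of table data, here is a minimal sketch of an equi-depth histogram in Python. It is a simplification for illustration only and not CockroachDB's actual collection code. The `Bucket` class, the `build_histogram` function, and the sample email values are hypothetical. The field names are loosely modeled on the bucket fields CockroachDB reports (`upper_bound`, `num_eq`, `num_range`).

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    upper_bound: str  # a verbatim value taken from the column
    num_eq: int       # rows equal to upper_bound
    num_range: int    # rows in the bucket that sort below upper_bound


def build_histogram(values, max_buckets):
    """Build a simplified equi-depth histogram (illustrative only)."""
    ordered = sorted(values)
    if not ordered:
        return []
    depth = max(1, len(ordered) // max_buckets)
    buckets = []
    i = 0
    while i < len(ordered):
        end = min(i + depth, len(ordered)) - 1
        upper = ordered[end]
        # Keep all duplicates of the upper bound in the same bucket.
        while end + 1 < len(ordered) and ordered[end + 1] == upper:
            end += 1
        chunk = ordered[i:end + 1]
        num_eq = chunk.count(upper)
        buckets.append(Bucket(upper, num_eq, len(chunk) - num_eq))
        i = end + 1
    return buckets


# Hypothetical indexed column values that should stay in their home regions.
emails = [
    "anna@example.de",
    "bob@example.com",
    "chloe@example.fr",
    "dmitri@example.de",
]
hist = build_histogram(emails, max_buckets=2)
leaked = [b.upper_bound for b in hist]
print(leaked)
```

Every bucket boundary is a value taken directly from the column, so wherever the histogram is stored, some of the original rows' data is stored too.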

Note: we already document this limitation in https://www.cockroachlabs.com/docs/stable/data-domiciling.html#limitations

Epic: CRDB-10287

To Reproduce

  1. create a multi-region table
  2. populate indexed data in each region
  3. create statistics for that table
  4. read the content of system.table_statistics and observe data from regions duplicated in the histogram column
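The steps above could be scripted roughly as follows. This is a sketch under assumptions, not a verified reproduction. The database name, table name, region names, sample values, and the statistics name `domicile_stats` are all made up, and the regions must match the localities the cluster was started with. The domiciling constraints themselves (placement policy or zone configs) are not shown. The helper only builds the SQL statements; run them with any SQL client against a multi-region cluster as a user allowed to read system tables.

```python
def repro_statements(database="domicile_demo", table="users",
                     regions=("us-east1", "europe-west1")):
    """Return SQL statements for the repro steps (all names are hypothetical)."""
    primary, *others = regions
    qualified = f"{database}.{table}"
    stmts = [
        # Step 1: create a multi-region database and table.
        f"CREATE DATABASE {database}",
        f'ALTER DATABASE {database} SET PRIMARY REGION "{primary}"',
    ]
    stmts += [f'ALTER DATABASE {database} ADD REGION "{r}"' for r in others]
    stmts.append(
        f"CREATE TABLE {qualified} ("
        "id UUID PRIMARY KEY DEFAULT gen_random_uuid(), "
        "email STRING NOT NULL, "
        "INDEX (email)"
        ") LOCALITY REGIONAL BY ROW"
    )
    # Step 2: populate indexed data in each region.
    stmts += [
        f"INSERT INTO {qualified} (crdb_region, email) "
        f"VALUES ('{r}', 'user@{r}.example')"
        for r in regions
    ]
    # Step 3: create statistics on the indexed column.
    stmts.append(f"CREATE STATISTICS domicile_stats ON email FROM {qualified}")
    # Step 4: inspect the stored histogram.
    stmts.append(
        "SELECT name, histogram FROM system.table_statistics "
        "WHERE name = 'domicile_stats'"
    )
    return stmts


statements = repro_statements()
for stmt in statements:
    print(stmt + ";")
```

The `histogram` column is stored in an encoded form, so `SHOW STATISTICS USING JSON FOR TABLE domicile_demo.users` may be an easier way to see the bucket values in readable form.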

Expected behavior

The table system.table_statistics should be split into different ranges so that the stats for each table are stored in the same region as the data that's constrained by zone configs.

Environment:

CockroachDB v21.2

Jira issue: CRDB-10286

knz commented 2 years ago

cc @awoods187 you may want to track this for GDPR compliance

github-actions[bot] commented 10 months ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!