cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.87k stars 3.77k forks source link

sql: incorrect subzone span indexes with non-unique partition names #128692

Open annrpom opened 1 month ago

annrpom commented 1 month ago

Whenever we generate subzone spans for RBR tables, we assume that partition names are unique in the following logic:

https://github.com/cockroachdb/cockroach/blob/56ce47516c071274e4ffda2cfaddbb798a52ba3e/pkg/sql/partition_utils.go#L97-L105

This part generates 2 maps: one for indexes, one for partitions. The purpose of the map is to be able to take our key (partition name, indexID) and get a value back for the index in the Subzone slice our key should be associated with (in other words which subzone config should this indexID/partition be associated with).

For a regional by row table, the partition names we generate for the regions our rows will be homed in are non-unique. Say we had a RBR table with 2 indexes in a database with a primary region in "us-east-1":

CREATE TABLE tbl2 (
  pk INT PRIMARY KEY,
  i INT,
  INDEX(i)
) LOCALITY REGIONAL BY ROW

In this case, we would have 2 partitions named "us-east-1": one for each index. This would make our generate subzone spans map look something like:

{"us-east-1":1}

Meaning later on during the creation of our subzone spans, we see 2 subzone spans getting created -- both with their indexes pointing to the value in our Subzone slice at index 1.

This implies that any zone configuration changes we make on our IndexID of 1 gets ignored and any zone configuration changes we make on our IndexID of 2 gets applied to IndexID 1.

The solution here seems to be geared more towards ensuring that the partition names we generate for RBR tables are unique: https://github.com/cockroachdb/cockroach/blob/91024190faeb1462e559471bef42ce601fc9bbbd/pkg/sql/region_util.go#L763-L769

Jira issue: CRDB-41152

Epic CRDB-40419

annrpom commented 1 month ago

Also this isn't just for regional by row tables -- since technically a user is able to do the following (but this situation is maybe more unique):

root@127.0.0.1:26257/demoapp/defaultdb> CREATE TABLE t (                                                   
                                     ->     id INT PRIMARY KEY,                                            
                                     ->     c1 INT,                                                        
                                     ->     c2 INT                                                         
                                     -> );                                                                 
CREATE TABLE

Time: 9ms total (execution 9ms / network 0ms)

root@127.0.0.1:26257/demoapp/defaultdb> CREATE INDEX i1 ON t (c1)                                          
                                     ->     PARTITION BY LIST (c1) (                                       
                                     ->         PARTITION p1 VALUES IN (1, 2, 3)                           
                                     ->     );                                                             
CREATE INDEX

Time: 216ms total (execution 216ms / network 0ms)

root@127.0.0.1:26257/demoapp/defaultdb> CREATE INDEX i2 ON t (c2)                                          
                                     ->     PARTITION BY LIST (c2) (                                       
                                     ->         PARTITION p1 VALUES IN (1, 2, 3)                           
                                     ->     );                                                             
CREATE INDEX

Time: 241ms total (execution 241ms / network 0ms)

root@127.0.0.1:26257/demoapp/defaultdb> ALTER PARTITION "p1" OF INDEX t@i1 CONFIGURE ZONE USING            
                                     -> num_replicas = 6;                                                  
CONFIGURE ZONE 1

Time: 11ms total (execution 11ms / network 0ms)

root@127.0.0.1:26257/demoapp/defaultdb> ALTER PARTITION "p1" OF INDEX t@i2 CONFIGURE ZONE USING            
                                     -> num_replicas = 7;                                                  
CONFIGURE ZONE 1

root@127.0.0.1:26257/demoapp/defaultdb> select                                                             
                                     -> jsonb_pretty(crdb_internal.pb_to_json('cockroach.config.zonepb.Zone
                                     -> Config', config))                                                  
                                     ->                                                                    
                                     -> from system.zones where id = 't'::regclass::oid;                   
                     jsonb_pretty
------------------------------------------------------
  {
      "numReplicas": 0,
      "subzoneSpans": [
          {
              "key": "iok=",
              "subzoneIndex": 1
          },
          {
              "key": "ioo=",
              "subzoneIndex": 1
          },
          {
              "key": "ios=",
              "subzoneIndex": 1
          },
          {
              "key": "jIk=",
              "subzoneIndex": 1
          },
          {
              "key": "jIo=",
              "subzoneIndex": 1
          },
          {
              "key": "jIs=",
              "subzoneIndex": 1
          }
      ],
      "subzones": [
          {
              "config": {
                  "inheritedConstraints": true,
                  "inheritedLeasePreferences": true,
                  "numReplicas": 6
              },
              "indexId": 2,
              "partitionName": "p1"
          },
          {
              "config": {
                  "inheritedConstraints": true,
                  "inheritedLeasePreferences": true,
                  "numReplicas": 7
              },
              "indexId": 4,
              "partitionName": "p1"
          }
      ]
  }
(1 row)