vitessio / vitess

Vitess is a database clustering system for horizontal scaling of MySQL.
http://vitess.io
Apache License 2.0
18.67k stars 2.1k forks source link

VReplication: Support automatically replacing auto_inc cols with sequences #16860

Closed mattlord closed 1 month ago

mattlord commented 1 month ago

Description

Since https://github.com/vitessio/vitess/pull/15679 we automatically strip any MySQL auto_increment clauses on tables that are being moved (using the MoveTables command) from an unsharded keyspace to a sharded one. And since https://github.com/vitessio/vitess/pull/13656 we support initializing any sequences in the target keyspace on SwitchTraffic.

In this PR, we build upon both of those items of work to support automatically creating a Vitess Sequence for each if those tables that HAD an auto_increment clause if one does not already exist. This requires the use of two flags:

  1. When creating the workflow, specify:
    • That we want to replace any removed MySQL auto_increment clauses with vschema AutoIncrement definitions using --sharded-auto-increment-handling=replace
    • An unsharded keyspace in --global-keyspace that we can then later use to create the sequence tables in if they don't already exist (they don't in the manual test below)
  2. When switching traffic, specify the --initialize-target-sequences flag, which will create a missing sequence table when possible

Manual test

alias vtctldclient='command vtctldclient --server=localhost:15999'

./101_initial_cluster.sh; mysql < ../common/insert_commerce_data.sql; ./201_customer_tablets.sh; ./202_move_tables.sh; ./203_switch_reads.sh; ./204_switch_writes.sh; ./205_clean_commerce.sh; ./301_customer_sharded.sh; ./302_new_shards.sh; ./303_reshard.sh; ./304_switch_reads.sh; ./305_switch_writes.sh; ./306_down_shard_0.sh; ./307_delete_shard_0.sh

vtctldclient ApplySchema --sql "alter table product drop primary key, add product_id int unsigned not null auto_increment primary key" commerce

vtctldclient ApplyVSchema --vschema='
{
  "sharded": true,
  "vindexes": {
    "hash": {
      "type": "hash"
    },
    "unicode_loose_xxhash": {
      "type": "unicode_loose_xxhash"
    }
  },
  "tables": {
    "corder": {
      "column_vindexes": [
        {
          "column": "customer_id",
          "name": "hash"
        }
      ],
      "auto_increment": {
        "column": "order_id",
        "sequence": "order_seq"
      }
    },
    "customer": {
      "column_vindexes": [
        {
          "column": "customer_id",
          "name": "hash"
        }
      ],
      "auto_increment": {
        "column": "customer_id",
        "sequence": "customer_seq"
      }
    },
    "product": {
      "column_vindexes": [
        {
          "column": "sku",
          "name": "unicode_loose_xxhash"
        }
      ]
    }
  }
}
' customer

sleep 5

vtctldclient MoveTables --workflow commerce2customer --target-keyspace customer create --source-keyspace commerce --tables product --sharded-auto-increment-handling=replace --global-keyspace commerce

vtctldclient GetVSchema customer --compact

mysql commerce -e "insert into product (sku, description, price) values ('SKU-1003', 'Mouse', 5)"

sleep 5

vtctldclient MoveTables --workflow commerce2customer --target-keyspace customer switchtraffic --initialize-target-sequences

vtctldclient GetVSchema customer --compact

mysql commerce -e "show tables"

mysql commerce -e "select * from product order by product_id"

mysql commerce -e "select * from product_seq"

mysql commerce -e "insert into product (sku, description, price) values ('SKU-1004', 'Table', 199), ('SKU-1005', 'Chair', 99)"

mysql commerce -e "select * from product order by product_id"

mysql commerce -e "select * from product_seq"

Results:

{
  "sharded": true,
  "vindexes": {
    "hash": {
      "type": "hash"
    }
  },
  "tables": {
    "corder": {
      "column_vindexes": [
        {
          "column": "customer_id",
          "name": "hash"
        }
      ],
      "auto_increment": {
        "column": "order_id",
        "sequence": "order_seq"
      }
    },
    "customer": {
      "column_vindexes": [
        {
          "column": "customer_id",
          "name": "hash"
        }
      ],
      "auto_increment": {
        "column": "customer_id",
        "sequence": "customer_seq"
      }
    },
    "product": {
      "column_vindexes": [
        {
          "column": "product_id",
          "name": "hash"
        }
      ],
      "auto_increment": {
        "column": "product_id",
        "sequence": "product_seq"
      }
    }
  }
}
+--------------------+
| Tables_in_commerce |
+--------------------+
| customer_seq       |
| order_seq          |
| product            |
| product_seq        |
+--------------------+
+----------+-------------+-------+------------+
+----------+-------------+-------+------------+
| sku      | description | price | product_id |
+----------+-------------+-------+------------+
| SKU-1001 | Monitor     |   100 |          1 |
| SKU-1002 | Keyboard    |    30 |          2 |
| SKU-1003 | Mouse       |     5 |          3 |
+----------+-------------+-------+------------+
+----+---------+-------+
| id | next_id | cache |
+----+---------+-------+
|  0 |       4 |  1000 |
+----+---------+-------+
+----------+-------------+-------+------------+
| sku      | description | price | product_id |
+----------+-------------+-------+------------+
| SKU-1001 | Monitor     |   100 |          1 |
| SKU-1002 | Keyboard    |    30 |          2 |
| SKU-1003 | Mouse       |     5 |          3 |
| SKU-1004 | Table       |   199 |          4 |
| SKU-1005 | Chair       |    99 |          5 |
+----------+-------------+-------+------------+
+----+---------+-------+
| id | next_id | cache |
+----+---------+-------+
|  0 |    1004 |  1000 |
+----+---------+-------+

The strip_sharded_auto_increment field was renamed to sharded_auto_increment_handling — which is upgrade/downgrade safe with protobufs as field indexes are used — and the type of the proto field was changed from bool to an enum/int32 to support the new REPLACE option. This is upgrade/downgrade safe because the wire format for bool is a varint with 0 being false and 1 being true — with the deserialization ending up 0 = false > 0 = true. To demonstrate:

git checkout vrepl_replace_auto_inc && make build 
cp bin/vtctldclient /tmp

git checkout main && make build
cd examples/local
./101_initial_cluster.sh; mysql < ../common/insert_commerce_data.sql; ./201_customer_tablets.sh

❯ vtctldclient MoveTables --workflow commerce2customer --target-keyspace customer create --source-keyspace commerce --tables product --remove-sharded-auto-increment=REPLACE --global-keyspace commerce
E0930 17:51:16.473295   29269 main.go:56] invalid argument "REPLACE" for "--remove-sharded-auto-increment" flag: strconv.ParseBool: parsing "REPLACE": invalid syntax

❯ /tmp/vtctldclient --server=localhost:15999  MoveTables --workflow commerce2customer --target-keyspace customer create --source-keyspace commerce --tables product --sharded-auto-increment-handling=REPLACE --global-keyspace commerce
The following vreplication streams exist for workflow customer.commerce2customer:

id=1 on customer/zone1-200: Status: Copying. VStream has not started.

Traffic State: Reads Not Switched. Writes Not Switched

❯ command mysql -u root --socket ${VTDATAROOT}/vt_0000000$(vtctldclient GetTablets --keyspace customer --tablet-type primary --shard "0" | awk '{print $1}' | cut -d- -f2 | bc)/mysql.sock vt_customer -e "select * from _vt.vreplication\G"
*************************** 1. row ***************************
                   id: 1
             workflow: commerce2customer
               source: keyspace:"commerce" shard:"0" filter:{rules:{match:"product" filter:"select * from product"}}
                  pos: MySQL56/9fec34ea-7f75-11ef-8280-bf546605e365:1-45
             stop_pos: NULL
              max_tps: 0
  max_replication_lag: 0
                 cell: zone1
         tablet_types: replica,primary
         time_updated: 1727733136
transaction_timestamp: 0
                state: Running
              message:
              db_name: vt_customer
          rows_copied: 0
                 tags:
       time_heartbeat: 1727733136
        workflow_type: 1
       time_throttled: 0
  component_throttled:
     reason_throttled:
    workflow_sub_type: 0
 defer_secondary_keys: 0
              options: {"strip_sharded_auto_increment": true}

And using the manual test case from above but using binaries built on main with a vtctldclient binary from the PR branch:

...

❯ /tmp/vtctldclient --server :15999 MoveTables --workflow commerce2customer --target-keyspace customer create --source-keyspace commerce --tables product --sharded-auto-increment-handling=replace --global-keyspace commerce
The following vreplication streams exist for workflow customer.commerce2customer:

id=2 on customer/zone1-300: Status: Copying. VStream has not started.
id=2 on customer/zone1-401: Status: Copying. VStream has not started.

Traffic State: Reads Not Switched. Writes Not Switched

❯ mysql commerce/0 -e "show create table product\G"
*************************** 1. row ***************************
       Table: product
Create Table: CREATE TABLE `product` (
  `sku` varbinary(128) NOT NULL,
  `description` varbinary(128) DEFAULT NULL,
  `price` bigint DEFAULT NULL,
  `product_id` int unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`product_id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

❯ mysql customer/-80 -e "show create table product\G"
*************************** 1. row ***************************
       Table: product
Create Table: CREATE TABLE `product` (
  `sku` varbinary(128) NOT NULL,
  `description` varbinary(128) DEFAULT NULL,
  `price` bigint DEFAULT NULL,
  `product_id` int unsigned NOT NULL,
  PRIMARY KEY (`product_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

So we can see that the auto_increment clauses were still removed.

Related Issue(s)

Checklist

vitess-bot[bot] commented 1 month ago

Review Checklist

Hello reviewers! :wave: Please follow this checklist when reviewing this Pull Request.

General

Tests

Documentation

New flags

If a workflow is added or modified:

Backward compatibility

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 68.27586% with 46 lines in your changes missing coverage. Please review.

Project coverage is 69.43%. Comparing base (945124a) to head (1d8601a). Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
go/vt/vtctl/workflow/traffic_switcher.go 56.41% 34 Missing :warning:
...ldclient/command/vreplication/movetables/create.go 0.00% 7 Missing :warning:
go/vt/vtctl/workflow/server.go 81.81% 2 Missing :warning:
go/vt/vtctl/workflow/utils.go 60.00% 2 Missing :warning:
go/vt/vtctl/workflow/materializer.go 96.15% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #16860 +/- ## ======================================== Coverage 69.42% 69.43% ======================================== Files 1571 1571 Lines 203304 203433 +129 ======================================== + Hits 141148 141249 +101 - Misses 62156 62184 +28 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

deepthi commented 1 month ago

We'll run the website docs update for the vtctld command reference once before release.