vitessio / vitess

Vitess is a database clustering system for horizontal scaling of MySQL.
http://vitess.io
Apache License 2.0
18.37k stars 2.08k forks source link

vreplication: support JSON columns without CONVERT #10084

Closed derekperkins closed 11 months ago

derekperkins commented 2 years ago

It's very unexpected when trying to vreplicate a json column that you get the error Error: Cannot create a JSON value from a string with CHARACTER SET 'binary'. There's a known workaround from the related issues that you can do CONVERT(json_col USING UTF8MB4), but that should be unnecessary.

Still encountering this on v13, no rush to fix since there's a workaround.

olyazavr commented 11 months ago

👋 Just a +1 here, we just moved over to Vitess online ddl and have encountered this issue in a migration (uses vreplication)

mattlord commented 11 months ago

Do either of you happen to have a test case, or are able to try and repeat this on v17/main? In v17 we moved vreplication to using the internal JSON parser and are no longer using CONVERT().

If this is still an issue on main then I'll prioritize this, but I suspect we can close this as resolved/fixed in v17+.

mattlord commented 11 months ago

@olyazavr I think your case may be different as I do see that OnlineDDL is still using it for JSON types: https://github.com/vitessio/vitess/blob/main/go/vt/vttablet/onlineddl/vrepl.go

I'd like to make this issue more concrete with a test case that shows the problem still exists on main. As noted, I'll then prioritize it.

Thanks!

olyazavr commented 11 months ago

Unfortunately the only data I have is for v14, I can try to see if I can repro in a unit test, but spinning up a v17+ tablet is a bit out of my reach at the moment. I did search around for any json-related changes in onlineddl since v14 but came up empty. Here's what I have for the moment:

The error: Cannot create a JSON value from a string with CHARACTER SET 'binary'. (errno 3144) (sqlstate 22032) during query: insert /*+ MAX_EXECUTION_TIME(3600000) */ into _767a8171_519c_11ee_bb0d_12771532d02f_20230912184455_vrepl(id,portalId,publicId,paymentsSessionId,lineItems,status,createdAt,updatedAt,discount,adjustments,orderPaymentMethod,currencyCode,liveMode,sourceId,sourceApp,processorType,checkoutSessionId) values (1,343,'8647518375','hsps_PrF0FEa44xZxj6KxY83CD3Xv0YABDrZJmtcjAUMOOTApBfud8beHOOYvJ7BOL1iY',convert('[{\"sku\": null, \"name\": \"End-to-End!\", \"price\": 20, \"amount\": 40, \"discount\": null, \"quantity\": 2, \"sourceId\": \"2251591\", \"description\": null, \"discountAmount\": null, \"sourceObjectType\": null, \"unadjustedAmount\": 40, \"recurringBillingTerms\": null}]' using utf8mb4),2,1608324359564,1678205197203,convert(null using utf8mb4),convert('[]' using utf8mb4),convert('{\"customerId\": \"cus_Ib08Y8EfG3vlMz\", \"paymentMethod\": 1, \"externalPaymentMethod\": 1, \"externalPaymentMethodId\": \"pm_1Hzo7pChThfTOgGjfsxbqSko\"}' using utf8mb4),'USD',1,'232',2,null,null), (2,104036150,'5868424440','hsps_test_DbJQ1VPIFGllXPq1M4Q7fTMmG10sXOMNdlWmwlgA7WKNHYsdwy8yzELuqSgc4viy',convert('[{\"sku\": null, \"name\": \"Monthly Consultation\", \"price\": 400, \"amount\": 400, \"discount\": null, \"quantity\": 1, \"sourceId\": \"5791308\", \"description\": null, \"discountAmount\": null, \"sourceObjectType\": null, \"unadjustedAmount\": 400, \"recurringBillingTerms\": null}]' using utf8mb4),3,1610484399051,1678292995196,convert(null using utf8mb4),convert('[]' using utf8mb4) (it goes on for a while)

The migration:

ALTER TABLE `orders`
    ADD COLUMN `originObjectReference` JSON DEFAULT NULL,
    ADD COLUMN `originObjectId` VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin GENERATED ALWAYS AS (JSON_UNQUOTE(JSON_EXTRACT(originObjectReference,'$.id'))) STORED,
    ADD COLUMN `originObjectType` TINYINT GENERATED ALWAYS AS (JSON_UNQUOTE(JSON_EXTRACT(originObjectReference,'$.type'))) STORED,
    LOCK=SHARED;}

Full schema:

CREATE TABLE `orders` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `portalId` int(10) unsigned NOT NULL,
  `publicId` varchar(32) NOT NULL,
  `paymentsSessionId` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  `lineItems` json NOT NULL,
  `status` tinyint(3) unsigned NOT NULL COMMENT 'OrderStatus enum value. Current options: PENDING(1), PAID(2), FAILED(3)',
  `createdAt` bigint(20) NOT NULL,
  `updatedAt` bigint(20) NOT NULL,
  `discount` json DEFAULT NULL,
  `adjustments` json DEFAULT NULL COMMENT 'Contains order level adjustments JSON to represent taxes and fees',
  `orderPaymentMethod` json DEFAULT NULL COMMENT 'Payment method associated with order',
  `stripePaymentMethod` tinyint(4) GENERATED ALWAYS AS (json_unquote(json_extract(`orderPaymentMethod`,'$.paymentMethod'))) STORED COMMENT 'StripePaymentMethod enum value: CARD(1), ACH_DEBIT(2), US_BANK_ACCOUNT(3)',
  `externalPaymentMethodId` varchar(255) GENERATED ALWAYS AS (json_unquote(json_extract(`orderPaymentMethod`,'$.externalPaymentMethodId'))) STORED COMMENT 'Stores payment method ID',
  `currencyCode` char(3) CHARACTER SET ascii COLLATE ascii_bin DEFAULT NULL COMMENT 'Currency for this order',
  `liveMode` tinyint(1) DEFAULT NULL COMMENT 'True for real orders',
  `sourceId` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  `sourceApp` tinyint(3) unsigned NOT NULL,
  `processorType` tinyint(3) unsigned DEFAULT NULL COMMENT 'PaymentsProcessorType enum value: HS_PAYMENTS(1), BYO_STRIPE(2)',
  `checkoutSessionId` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `portalId` (`portalId`,`publicId`),
  UNIQUE KEY `paymentsSessionId` (`paymentsSessionId`),
  KEY `idx_portalId_sourceApp_sourceId` (`portalId`,`sourceApp`,`sourceId`),
  KEY `checkoutSessionId` (`checkoutSessionId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8
olyazavr commented 11 months ago

I have a more concise test case with v14, and I'll try to get a tablet with code from main spun up to test:

CREATE TABLE `orders` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `orderPaymentMethod` json DEFAULT NULL,
  `stripePaymentMethod` tinyint(4) GENERATED ALWAYS AS (json_unquote(json_extract(`orderPaymentMethod`,'$.paymentMethod'))) STORED,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB
insert into orders(orderPaymentMethod) values (convert('{\"customerId\": \"cus_Il7WwbIODMoaE1\"}' using utf8mb4))
vtctl ApplySchema -- --skip_preflight --ddl_strategy "vitess" --sql "ALTER TABLE orders ENGINE=INNODB;" TestKeyspace

returns error:

Cannot create a JSON value from a string with CHARACTER SET 'binary'. (errno 3144) (sqlstate 22032) during query: insert into _34efe2fe_52fc_11ee_bbea_0e69fddf2eef_20230914124247_vrepl(id,orderPaymentMethod) values (1,convert('{\\\"customerId\\\": \\\"cus_Il7WwbIODMoaE1\\\"}' using utf8mb4))

If the json field is null, it works, and if the generated column is gone, it also works, so I think it's a combination of the two

olyazavr commented 11 months ago

ok yep, failed on v18: executor.go:4069] updateMigrationMessage: uuid=f3387b12_5323_11ee_8c7e_0a4b134e1d27, message=task error: failed inserting rows: Cannot create a JSON value from a string with CHARACTER SET 'binary'. (errno 3144) (sqlstate 22032) during query: insert into _f3387b12_5323_11ee_8c7e_0a4b134e1d27_20230914172717_vrepl(id,orderPaymentMethod) values (1,convert(JSON_OBJECT(_utf8mb4'customerId', _utf8mb4'cus_Il7WwbIODMoaE1') using utf8mb4))

executor.go:2082] cancelMigrations: cancelling f3387b12_5323_11ee_8c7e_0a4b134e1d27; reason: task error: failed inserting rows: Cannot create a JSON value from a string with CHARACTER SET 'binary'. (errno 3144) (sqlstate 22032) during query: insert into _f3387b12_5323_11ee_8c7e_0a4b134e1d27_20230914172717_vrepl(id,orderPaymentMethod) values (1,convert(JSON_OBJECT(_utf8mb4'customerId', _utf8mb4'cus_Il7WwbIODMoaE1') using utf8mb4))

controller.go:261] vreplication stream 5 going into error state due to Cannot create a JSON value from a string with CHARACTER SET 'binary'. (errno 3144) (sqlstate 22032) during query: insert into _f3387b12_5323_11ee_8c7e_0a4b134e1d27_20230914172717_vrepl(id,orderPaymentMethod) values (1,convert(JSON_OBJECT(_utf8mb4'customerId', _utf8mb4'cus_Il7WwbIODMoaE1') using utf8mb4))

@mattlord should I create a new issue?

olyazavr commented 11 months ago

new issue https://github.com/vitessio/vitess/issues/13986

mattlord commented 11 months ago

new issue #13986

Thank you, @olyazavr ! I'm going to close this issue then for now as I think this is only an issue in OnlineDDL in v17+ after https://github.com/vitessio/vitess/pull/12761. If anyone finds that to be untrue, please share the details here and we'll re-open it.

mattlord commented 11 months ago

Thanks again, @olyazavr ! Using a variation of your test case in the new issue...

Here's a test case that demonstrates it working with VReplication in v17+ — both in the copy and replication phases:

git checkout release-17.0
git pull
make build

pushd examples/local

./101_initial_cluster.sh

mysql -e "CREATE TABLE orders (
id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
orderPaymentMethod json DEFAULT NULL,
stripePaymentMethod tinyint(4) GENERATED ALWAYS AS (json_unquote(json_extract(orderPaymentMethod,'$.paymentMethod'))) STORED,
PRIMARY KEY (id)) ENGINE=InnoDB"

mysql commerce -e "insert into orders (orderPaymentMethod) values ('{\"name\":\"matt\", \"paymentMethod\":\"1\"}')"

./201_customer_tablets.sh

vtctlclient MoveTables -- --source commerce --tables 'orders' Create customer.commerce2customer

sleep 10

mysql commerce -e "insert into orders (orderPaymentMethod) values ('{\"name\":\"mattlord\", \"paymentMethod\":\"2\"}')"

sleep 1

vtctlclient Workflow -- customer.commerce2customer show

sleep 5

mysql customer:0 -e "select * from orders"

./401_teardown.sh
popd

The end results being:

❯ vtctlclient Workflow -- customer.commerce2customer show
{
    "Workflow": "commerce2customer",
    "SourceLocation": {
        "Keyspace": "commerce",
        "Shards": [
            "0"
        ]
    },
    "TargetLocation": {
        "Keyspace": "customer",
        "Shards": [
            "0"
        ]
    },
    "MaxVReplicationLag": 1,
    "MaxVReplicationTransactionLag": 1,
    "Frozen": false,
    "ShardStatuses": {
        "0/zone1-0000000201": {
            "PrimaryReplicationStatuses": [
                {
                    "Shard": "0",
                    "Tablet": "zone1-0000000201",
                    "ID": 1,
                    "Bls": {
                        "keyspace": "commerce",
                        "shard": "0",
                        "filter": {
                            "rules": [
                                {
                                    "match": "orders",
                                    "filter": "select * from orders"
                                }
                            ]
                        }
                    },
                    "Pos": "572ac4c4-532a-11ee-85f0-6933c132b4f3:1-54",
                    "StopPos": "",
                    "State": "Running",
                    "DBName": "vt_customer",
                    "TransactionTimestamp": 1694715608,
                    "TimeUpdated": 1694715638,
                    "TimeHeartbeat": 1694715638,
                    "TimeThrottled": 0,
                    "ComponentThrottled": "",
                    "Message": "",
                    "Tags": "",
                    "WorkflowType": "MoveTables",
                    "WorkflowSubType": "None",
                    "CopyState": null,
                    "RowsCopied": 1
                }
            ],
            "TabletControls": null,
            "PrimaryIsServing": true
        }
    },
    "SourceTimeZone": "",
    "TargetTimeZone": ""
}

❯ mysql customer:0 -e "select * from orders"
+----+--------------------------------------------+---------------------+
| id | orderPaymentMethod                         | stripePaymentMethod |
+----+--------------------------------------------+---------------------+
|  1 | {"name": "matt", "paymentMethod": "1"}     |                   1 |
|  2 | {"name": "mattlord", "paymentMethod": "2"} |                   2 |
+----+--------------------------------------------+---------------------+