Open Lily2025 opened 3 months ago
/remove-area dm /area ticdc
/assign sdojjy
As the logs below show, TiCDC executes the add index DDL asynchronously and waits 10s. The background execution had not finished after 10s, so TiCDC advanced the checkpoint ts. However, the DDL was never submitted to the downstream TiDB because of the network partition.
[2024/08/20 13:55:32.098 +08:00] [INFO] [async_ddl.go:51] ["async exec add index ddl start"] [changefeedID=default/ticdc-task1] [commitTs=451971110139330738] [ddl="ALTER TABLE `sbtest5` ADD INDEX `index_test_1724132823395`(`c`)"]
[2024/08/20 13:55:42.098 +08:00] [INFO] [async_ddl.go:87] ["async add index ddl is still running"] [changefeedID=default/ticdc-task1] [commitTs=451971110139330738] [ddl="ALTER TABLE `sbtest5` ADD INDEX `index_test_1724132823395`(`c`)"]
[2024/08/20 13:56:35.530 +08:00] [ERROR] [async_ddl.go:57] ["async exec add index ddl failed"] [changefeedID=default/ticdc-task1] [commitTs=451971110139330738] [ddl="ALTER TABLE `sbtest5` ADD INDEX `index_test_1724132823395`(`c`)"]
/severity major
Will further investigate the issue on the new arch ticdc https://github.com/pingcap/ticdc. Won't fix on the current repo.
What did you do?
1. Restore data for primary and secondary.
2. Create changefeed and set BDR role for primary and secondary.
3. Run sysbench on primary and secondary.
4. Add index a on primary.
5. Simulate network partition between upstream and downstream:
   faultType: network_partition
   selector: tc-ticdc(all)_to_cdc-downstream-tc-tidb(all)
   warmUpTime: 1m
   period: "@every 5m"
   faultDuration: 3m
   faultTotalRunTime: 30m
log: ticdc-0.zip
What did you expect to see?
After the fault recovers, the add index DDL is synced to downstream.
What did you see instead?
After the fault recovers, the add index DDL is not synced to downstream.
upstream:
downstream:
logs:
[2024/08/20 13:55:42.098 +08:00] [INFO] [ddl_sink.go:258] ["Execute DDL succeeded"] [namespace=default] [changefeed=ticdc-task1] [DDL="{\"StartTs\":451971109864079467,\"CommitTs\":451971110139330738,\"Query\":\"ALTER TABLE `sbtest5` ADD INDEX `index_test_1724132823395`(`c`)\",\"TableInfo\":{\"id\":325,\"name\":{\"O\":\"sbtest5\",\"L\":\"sbtest5\"},\"charset\":\"utf8mb4\",\"collate\":\"utf8mb4_bin\",\"cols\":[{\"id\":1,\"name\":{\"O\":\"id\",\"L\":\"id\"},\
[2024/08/20 13:56:35.529 +08:00] [WARN] [mysql_ddl_sink.go:152] ["Execute DDL with error, retry later"] [startTs=451971109864079467] [ddl="ALTER TABLE `sbtest5` ADD INDEX `index_test_1724132823395`(`c`)"] [namespace=default] [changefeed=ticdc-task1] [error="dial tcp 10.101.73.218:4000: operation was canceled"]
Versions of the cluster
./cdc version
Release Version: v8.3.0-alpha
Git Commit Hash: e3c75b756bedcaa39285ddd6d370b3877d7433d3
Git Branch: heads/refs/tags/v8.3.0-alpha
UTC Build Time: 2024-08-19 11:36:49
Go Version: go1.21.10
Failpoint Build: false
2024-08-20T10:25:02.344+0800
current status of DM cluster (execute `query-status <task-name>` in dmctl)
No response