pingcap / tiflow

This repo maintains DM (a data migration platform) and TiCDC (change data capture for TiDB)
Apache License 2.0
430 stars 286 forks

secondary cluster cdc panic #11670

Open Lily2025 opened 1 month ago

Lily2025 commented 1 month ago

What did you do?

1. Restore data on the primary and secondary clusters.
2. Create a changefeed and set the BDR role on the primary and secondary.
3. Run sysbench on the primary and secondary.
4. Add an index and then drop it on the primary.
5. Kill one of the TiKV nodes on the primary.

What did you expect to see?

no panic

What did you see instead?

Some time after one of the TiKV nodes on the primary was killed, the CDC on the secondary cluster panicked. Log: Explore-logs-2024-10-23 15_13_57.txt

2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915681716+08:00 stderr F \tgolang.org/x/sync@v0.8.0/errgroup/errgroup.go:78 +0x50 fp=0xc006f21fe0 sp=0xc006f21f78 pc=0x1ac9f50","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.91567794+08:00 stderr F golang.org/x/sync/errgroup.(*Group).Go.func1()","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915670897+08:00 stderr F \tgithub.com/pingcap/tiflow/cdc/kv/shared_stream.go:213 +0x27 fp=0xc006f21f78 sp=0xc006f21f40 pc=0x3d0ca67","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915666246+08:00 stderr F github.com/pingcap/tiflow/cdc/kv.(*requestedStream).run.func5()","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915662339+08:00 stderr F \tgithub.com/pingcap/tiflow/cdc/kv/shared_stream.go:379 +0x1729 fp=0xc006f21f40 sp=0xc006f218f0 pc=0x3d0f0e9","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.91565457+08:00 stderr F github.com/pingcap/tiflow/cdc/kv.(*requestedStream).send(0xc004ceab40, {0x5e62ab0, 0xc003895810}, 0xc0071dc000, 0xc0067286c0)","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915650163+08:00 stderr F \tgithub.com/pingcap/tiflow/cdc/kv/shared_stream.go:262 +0x19f fp=0xc006f218f0 sp=0xc006f21750 pc=0x3d0f8df","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915645961+08:00 stderr F github.com/pingcap/tiflow/cdc/kv.(*requestedStream).send.func1()","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915641861+08:00 stderr F \truntime/select.go:335 +0x7a5 fp=0xc006f21750 sp=0xc006f21628 pc=0x4516a5","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915637366+08:00 stderr F runtime.selectgo(0xc006f21830, 0xc006f21784, 0x0?, 0x0, 0xc006f217d8?, 0x1)","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915630131+08:00 stderr F \truntime/proc.go:424 +0xce fp=0xc006f21628 sp=0xc006f21608 pc=0x47604e","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915625382+08:00 stderr F runtime.gopark(0xc006f21830?, 0x3?, 0x70?, 0x16?, 0xc006f2178a?)","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915607012+08:00 stderr F goroutine 12359 gp=0xc00432e700 m=nil [select]:","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915602936+08:00 stderr F ","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915599237+08:00 stderr F \tgolang.org/x/sync@v0.8.0/errgroup/errgroup.go:75 +0x96","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915595505+08:00 stderr F created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 12333","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915591377+08:00 stderr F \truntime/asm_amd64.s:1700 +0x1 fp=0xc00b571fe8 sp=0xc00b571fe0 pc=0x47edc1","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915583169+08:00 stderr F runtime.goexit({})","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915578833+08:00 stderr F \tgolang.org/x/sync@v0.8.0/errgroup/errgroup.go:78 +0x50 fp=0xc00b571fe0 sp=0xc00b571f78 pc=0x1ac9f50","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915575155+08:00 stderr F golang.org/x/sync/errgroup.(*Group).Go.func1()","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.915570698+08:00 stderr F \tgithub.com/pingcap/tiflow/cdc/kv/shared_stream.go:189 +0x2e fp=0xc00b571f78 sp=0xc00b571f30 pc=0x3d0ccae","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.91556707+08:00 stderr F github.com/pingcap/tiflow/cdc/kv.(*requestedStream).run.func3()","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.9155629+08:00 stderr F \tgithub.com/pingcap/tiflow/cdc/kv/shared_stream.go:227 +0x75 fp=0xc00b571f30 sp=0xc00b571cb0 pc=0x3d0d255","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.91555365+08:00 stderr F github.com/pingcap/tiflow/cdc/kv.(*requestedStream).receive(0xc004ceab40, {0x5e62ab0, 0xc003895810}, 0xc0071dc000, 0xc0067286c0, 0x0?, 0x0)","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
2024-10-17 14:45:20 {"pod":"cdc-downstream-ticdc-0","container":"ticdc","log":"2024-10-17T14:44:56.91554985+08:00 stderr F \tgithub.com/pingcap/kvproto@v0.0.0-20240924080114-4a3e17f5e62d/pkg/cdcpb/cdcpb.pb.go:1683 +0x46 fp=0xc00b571cb0 sp=0xc00b571c80 pc=0x3cc4846","namespace":"endless-ha-test-bdr-ddl-tps-7635680-1-855"}
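
The export above lists entries newest first, and each entry wraps one line of the Go traceback in a JSON object, so the trace is hard to read as pasted. The following is a minimal sketch, not part of the original report, of how the entries could be put back into stack order. It assumes each JSON object is intact and that the "log" field has the shape "<RFC 3339 timestamp> <stream> <tag> <message>", as in the entries above. The names reorderStack, exportedEntry, and stackLine are hypothetical helpers introduced here for illustration.

```go
package main

import (
	"encoding/json"
	"sort"
	"strings"
	"time"
)

// exportedEntry mirrors the fields of the JSON objects in the pasted export
// that this sketch needs (hypothetical type, not from tiflow).
type exportedEntry struct {
	Pod string `json:"pod"`
	Log string `json:"log"`
}

// stackLine is one traceback line with the timestamp the container runtime
// attached to it.
type stackLine struct {
	at   time.Time
	text string
}

// reorderStack scans export for JSON objects, strips the
// "<timestamp> <stream> <tag> " prefix from each "log" field, and returns
// the remaining traceback lines sorted oldest first.
func reorderStack(export string) []string {
	var lines []stackLine
	for pos := 0; pos < len(export); {
		start := strings.IndexByte(export[pos:], '{')
		if start < 0 {
			break
		}
		pos += start
		dec := json.NewDecoder(strings.NewReader(export[pos:]))
		var e exportedEntry
		if err := dec.Decode(&e); err != nil {
			pos++ // no complete JSON object starts here; keep scanning
			continue
		}
		pos += int(dec.InputOffset()) // skip past the decoded object
		parts := strings.SplitN(e.Log, " ", 4)
		if len(parts) < 4 {
			continue // assumption violated: not a prefixed log line
		}
		at, err := time.Parse(time.RFC3339Nano, parts[0])
		if err != nil {
			continue
		}
		lines = append(lines, stackLine{at: at, text: parts[3]})
	}
	// Stable sort keeps the pasted order for lines with equal timestamps.
	sort.SliceStable(lines, func(i, j int) bool {
		return lines[i].at.Before(lines[j].at)
	})
	out := make([]string, len(lines))
	for i, l := range lines {
		out[i] = l.text
	}
	return out
}
```

To use it, call reorderStack from a main function with the contents of the attached export and print the returned lines one per line. The result should read like an ordinary Go goroutine dump, with each function line followed by its file:line frame.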

Versions of the cluster

./cdc version
Release Version: v8.4.0-alpha-40-g62d07b55e
Git Commit Hash: 62d07b55e71d0f217886624859bd40eccfab04c8
Git Branch: HEAD
UTC Build Time: 2024-10-16 17:13:21
Go Version: go1.23.2
Failpoint Build: false
2024-10-17T10:07:31.445+0800

Current status of DM cluster (execute query-status <task-name> in dmctl)

No response

hicqu commented 1 month ago

[Image] It seems to panic inside Go's internal libraries.

flowbehappy commented 1 week ago

We will investigate this issue further on the new-architecture TiCDC at https://github.com/pingcap/ticdc. It won't be fixed in the current repo.