pingcap / tiflow

This repo maintains DM (a data migration platform) and TiCDC (change data capture for TiDB)
Apache License 2.0

redo apply failed due to panic: runtime error: invalid memory address or nil pointer dereference #11323

Closed: fubinzh closed this issue 3 months ago

fubinzh commented 3 months ago

What did you do?

  1. Create a MySQL sink changefeed with eventual consistency enabled.
  2. Run the workload:
    /go-tpc tpcc --db workload --warehouses 200 -T 50 --host upstream-tidb.cdc-redo-long-duration-tps-7589946-1-17 --port 4000 --parts 1 prepare --ignore-error '2013,1213,1105,1205,8022,8028,9004,9007,1062'
    /go-tpc tpcc --db workload --warehouses 200 -T 50 --host upstream-tidb.cdc-redo-long-duration-tps-7589946-1-17 --port 4000 --parts 1 --time 24h0m0s run --ignore-error '2013,1213,1105,1205,8022,8028,9004,9007,1062'
  3. After running the workload for 30m, pause the changefeed and run redo apply:
    /cdc  redo  apply "--sink-uri=mysql://root:@downstream.cdc-redo-long-duration-tps-7590398-1-219:3306" "--storage=s3://tmp/test-infra-redolog/redo-apply-multiple-cpo1eikpcjimem5viqm0-2024-06-17 11:00:26.160182053 +0000 UTC m=+31.509143689?access-key=xxx&secret-access-key=xxx&endpoint=http://minio-peer:9000&force-path-style=true" "--tmp-dir=/tmp/nfs/redo-apply-multiple-cpo1eikpcjimem5viqm02024-06-17-11-00-26"

What did you expect to see?

redo apply should succeed.

What did you see instead?

redo apply failed.

[INFO] [file.go:113] ["succeed to download and sort redo logs"] [type=ddl] [duration=175.509756ms]
[2024/06/17 11:53:19.471 +00:00] [INFO] [file.go:283] ["ignore logs which commitTs is greater than resolvedTs"] [filename=upstream-ticdc-0.upstream-ticdc-peer.cdc-redo-long-duration-tps-7590398-1-219.svc:8301_redo-apply-multiple-cpo1eikpcjimem5viqm0_row_450527242397679753_3dac706f-c2d0-4ca4-96e4-2b19548354aa.log] [endTs=450527242384310384]
[2024/06/17 11:55:58.706 +00:00] [INFO] [file.go:283] ["ignore logs which commitTs is greater than resolvedTs"] [filename=upstream-ticdc-1.upstream-ticdc-peer.cdc-redo-long-duration-tps-7590398-1-219.svc:8301_redo-apply-multiple-cpo1eikpcjimem5viqm0_row_450527242384310548_44294131-06b2-42a6-b66a-755f78936780.log] [endTs=450527242384310384]
[2024/06/17 11:55:58.914 +00:00] [INFO] [file.go:113] ["succeed to download and sort redo logs"] [type=row] [duration=5m15.83491318s], stderr: panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x32fe94c]

goroutine 516 [running]:
github.com/pingcap/tiflow/cdc/sinkv2/eventsink/txn.(*worker).onEvent(0xc003068c00, 0xc0150cdf40, 0xc018f95d00)
	github.com/pingcap/tiflow/cdc/sinkv2/eventsink/txn/worker.go:177 +0x28c
github.com/pingcap/tiflow/cdc/sinkv2/eventsink/txn.(*worker).run(0xc003068c00, 0xc000b16f60)
	github.com/pingcap/tiflow/cdc/sinkv2/eventsink/txn/worker.go:124 +0xb9b
github.com/pingcap/tiflow/cdc/sinkv2/eventsink/txn.newSink.func1()
	github.com/pingcap/tiflow/cdc/sinkv2/eventsink/txn/txn_sink.go:116 +0x25
golang.org/x/sync/errgroup.(*Group).Go.func1()
	golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 +0x64
created by golang.org/x/sync/errgroup.(*Group).Go
	golang.org/x/sync@v0.7.0/errgroup/errgroup.go:75 +0xa5, ExitCode: 2"] [stack="github.com/pingcap/test-infra/caselib/pkg/host.(*TiCDCHost).Redo
	github.com/pingcap/test-infra/caselib/pkg/host/ticdc.go:107
github.com/pingcap/test-infra/caselib/pkg/steps.(*redo).Execute
	github.com/pingcap/test-infra/caselib/pkg/steps/redo.go:28
github.com/pingcap/test-infra/caselib/pkg/steps.withRecover
	github.com/pingcap/test-infra/caselib/pkg/steps/step.go:24
github.com/pingcap/test-infra/caselib/pkg/steps.(*Serial).Execute
	github.com/pingcap/test-infra/caselib/pkg/steps/step.go:45
main.main
	test-infra/caselib/ticdc/main.go:112
runtime.main
	runtime/proc.go:267"]
[2024/06/17 11:57:03.840 +00:00] [INFO] [redo.go:29] ["run redo cmd finished"] [result=null] [error="command terminated with exit code 2"] [errorVerbose="command terminated with exit code 2
github.com/pingcap/errors.AddStack
	github.com/pingcap/errors@v0.11.5-0.20221009092201-b66cddb77c32/errors.go:174
github.com/pingcap/errors.Trace
	github.com/pingcap/errors@v0.11.5-0.20221009092201-b66cddb77c32/juju_adaptor.go:15
github.com/pingcap/test-infra/sdk/resource/impl/k8s.(*TiDBCluster).Exec
	github.com/pingcap/test-infra/sdk@v0.0.0-20240306052032-3cff273d546b/resource/impl/k8s/tidbcluster.go:199
github.com/pingcap/test-infra/caselib/pkg/host.(*TiCDCHost).Exec.func1
	github.com/pingcap/test-infra/caselib/pkg/host/ticdc.go:303
runtime.goexit
	runtime/asm_amd64.s:1650"]
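
For anyone reading the "ignore logs which commitTs is greater than resolvedTs" entries above: the values are TiDB TSO timestamps, which pack a physical time in Unix milliseconds into the high bits and an 18-bit logical counter into the low bits. The Go sketch below decodes the two values from the first ignored entry and applies the comparison named in the log message. It is not tiflow code: decodeTSO and shouldIgnore are hypothetical helper names, and it assumes the number embedded in the redo file name is that file's commit timestamp and that endTs plays the role of the resolved timestamp.

```go
package main

import (
	"fmt"
	"time"
)

// logicalBits is the number of low bits a TiDB TSO reserves for the logical counter.
const logicalBits = 18

// decodeTSO (hypothetical helper) splits a TSO into its physical part
// (Unix milliseconds) and its logical counter.
func decodeTSO(ts uint64) (physicalMs int64, logical uint64) {
	return int64(ts >> logicalBits), ts & (1<<logicalBits - 1)
}

// shouldIgnore (hypothetical helper) applies the rule named in the log message:
// a log whose commit timestamp is greater than the resolved timestamp is skipped.
func shouldIgnore(commitTs, resolvedTs uint64) bool {
	return commitTs > resolvedTs
}

func main() {
	// Values copied from the first "ignore logs" entry in this report.
	const fileTs uint64 = 450527242397679753 // assumed: commit ts embedded in the file name
	const endTs uint64 = 450527242384310384  // endTs field of the same entry

	for _, ts := range []uint64{fileTs, endTs} {
		physicalMs, logical := decodeTSO(ts)
		wallClock := time.UnixMilli(physicalMs).UTC().Format(time.RFC3339Nano)
		fmt.Printf("%d -> %s logical=%d\n", ts, wallClock, logical)
	}
	fmt.Println("ignored:", shouldIgnore(fileTs, endTs))
}
```

Decoding the timestamps this way makes it easy to check that the ignored files sit only slightly past the resolved point, which is expected when a changefeed is paused mid-workload and is separate from the panic itself.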

Versions of the cluster

Release Version: v6.5.10
Git Commit Hash: ffbe6f8b033b97eaeac3f9ee364f21be2649858b

fubinzh commented 3 months ago

/assign @hongyunyan

fubinzh commented 3 months ago

/label affects-6.5

fubinzh commented 3 months ago

/severity critical

jebter commented 3 months ago

Can the issue be closed?

fubinzh commented 3 months ago

Closed, as the related PR was reverted.