matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0

[Bug]: mo_cdc: mo reported panic runtime error: invalid memory address or nil pointer dereference #19378

Open heni02 opened 1 week ago

heni02 commented 1 week ago

Is there an existing issue for the same bug?

Branch Name

main

Commit ID

b848638

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

Resuming a full sync of 700 million rows caused the CN to panic and restart:

{"level":"INFO","time":"2024/10/16 06:40:13.798010 +0000","caller":"cdc/reader.go:98","msg":"cdc tableReader(test_db(272585).test01(272586) -> back_ac1_db.test01).Run: end"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4db5f65]

goroutine 133637 gp=0xc0337a0540 m=0 mp=0xa050980 [running]:
panic({0x6589a00?, 0x9e7d9b0?})
    /usr/local/go/src/runtime/panic.go:804 +0x16f fp=0xc009cd7148 sp=0xc009cd7098 pc=0x48452f
runtime.panicmem()
    /usr/local/go/src/runtime/panic.go:262 +0x3e fp=0xc009cd7168 sp=0xc009cd7148 pc=0x447c1e
runtime.sigpanic()
    /usr/local/go/src/runtime/signal_unix.go:900 +0x245 fp=0xc009cd7198 sp=0xc009cd7168 pc=0x486c65
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.updateCNDataBatch(0x0, {0x0, 0x0, 0x0, 0x0, 0x33, 0xd4, 0x1f, 0x79, 0x61, ...}, ...)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:899 +0x65 fp=0xc009cd7260 sp=0xc009cd7198 pc=0x4db5f65
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*CNObjectHandle).Next(0xc021191b30, {0x77245b8, 0xc06c8cf410}, 0xc009cd75f8, 0xc0072e7880)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:192 +0x2e5 fp=0xc009cd74e8 sp=0xc009cd7260 pc=0x4dae705
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*CNObjectHandle).QuickNext(0xc021191b30, {0x77245b8, 0xc06c8cf410}, 0xc00411d5f8, 0xc0072e7880)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:227 +0x4b fp=0xc009cd7540 sp=0xc009cd74e8 pc=0x4daedeb
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*baseHandle).QuickNext(0xc02005fde0, {0x77245b8, 0xc06c8cf410}, 0xc00411d5f8, 0xc0072e7880)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:504 +0x1fd fp=0xc009cd75b0 sp=0xc009cd7540 pc=0x4db189d
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*ChangeHandler).quickNext(0xc02005fd80, {0x77245b8, 0xc06c8cf410}, 0xc0072e7880)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:631 +0x73 fp=0xc009cd7620 sp=0xc009cd75b0 pc=0x4db2cb3
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*ChangeHandler).Next(0xc02005fd80, {0x77245b8, 0xc06c8cf410}, 0xc0072e7880)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:644 +0xbf fp=0xc009cd7700 sp=0xc009cd7620 pc=0x4db2e7f
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).readTableWithTxn(0xc0254567e0, {0x77245b8, 0xc06c8cf410}, {0x77c22a8, 0xc020e16008}, 0xc022cbeff0, 0xc00eafca90)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:251 +0xa8c fp=0xc009cd7b90 sp=0xc009cd7700 pc=0x569958c
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).readTable(0xc0254567e0, {0x77245b8, 0xc06c8cf410}, 0xc00eafca90)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:153 +0x51c fp=0xc009cd7d88 sp=0xc009cd7b90 pc=0x569891c
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).Run(0xc0254567e0, {0x77245b8, 0xc06c8cf410}, 0xc00eafca90)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:112 +0x298 fp=0xc009cd7f88 sp=0xc009cd7d88 pc=0x5697f38
github.com/matrixorigin/matrixone/pkg/frontend.(*CdcTask).addExecPipelineForTable.gowrap1()
    /go/src/github.com/matrixorigin/matrixone/pkg/frontend/cdc.go:1393 +0x62 fp=0xc009cd7fe0 sp=0xc009cd7f88 pc=0x5792102
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc009cd7fe8 sp=0xc009cd7fe0 pc=0x48d101
created by github.com/matrixorigin/matrixone/pkg/frontend.(*CdcTask).addExecPipelineForTable in goroutine 1442
    /go/src/github.com/matrixorigin/matrixone/pkg/frontend/cdc.go:1393 +0x77a

goroutine 1 gp=0xc0000081c0 m=nil [select, 209 minutes]:
runtime.gopark(0x6ffe498, 0x0, 0x9, 0x3, 0x1)
    /usr/local/go/src/runtime/proc.go:424 +0xfc fp=0xc01de259a8 sp=0xc01de25978 pc=0x48493c
runtime.selectgo(0xc01de25c40, 0xc0032afb38, 0x2?, 0x0, 0x0?, 0x1)
    /usr/local/go/src/runtime/select.go:335 +0xa45 fp=0xc01de25af8 sp=0xc01de259a8 pc=0x45eb05
main.waitSignalToStop(0xc0002cfb00, 0xc000758b60)
    /go/src/github.com/matrixorigin/matrixone/cmd/mo-service/main.go:136 +0x23d fp=0xc01de25d10 sp=0xc01de25af8 pc=0x5bd615d
main.main()
    /go/src/github.com/matrixorigin/matrixone/cmd/mo-service/main.go:125 +0x368 fp=0xc01de25f78 sp=0xc01de25d10 pc=0x5bd5c48
runtime.main()
    /usr/local/go/src/runtime/proc.go:272 +0x247 fp=0xc01de25fe0 sp=0xc01de25f78 pc=0x44c567
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc01de25fe8 sp=0xc01de25fe0 pc=0x48d101

cn panic log: panic1016.log

mo log: http://10.222.6.1/explore?panes=%7B%22VCp%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-cdc-test%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221729060812932%22,%22to%22:%221729060813804%22%7D%7D%7D&schemaVersion=1&orgId=1

Expected Behavior

No response

Steps to Reproduce

create table test01(a int auto_increment primary key,b int);
./start.sh -h 10.222.6.6 -b test_db -c cases/ddl/
./mo_cdc task create --task-name "cdc_resume" --source-uri="mysql://dump:111@10.222.6.6:6001" --sink-type="mysql" --sink-uri="mysql://dump:111@10.222.1.129:3306" --tables='test_db.test01:back_ac1_db.test01' --level="account" --account="sys"
After a few minutes, pause the task: ./mo_cdc task pause --task-name "cdc_resume" --source-uri="mysql://dump:111@10.222.6.6:6001"
./mo_cdc task show --task-name "cdc_resume" --source-uri="mysql://dump:111@10.222.6.6:6001"
Wait 1 hour, then resume the task with 700 million rows of data: ./mo_cdc task resume --task-name "cdc_resume" --source-uri="mysql://dump:111@10.222.6.6:6001"

Additional information

No response

jiangxinmeng1 commented 6 days ago

fixed by #19385

heni02 commented 15 hours ago

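Verification: resuming with more than 500 million rows still reports the same panic, and the synced data is also inconsistent.

commit: 72b1061

Inconsistent synced data: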

[Screenshot: 企业微信截图_9175a8f7-efcc-4990-942a-5ec1d2e017ce]
[Screenshot: 企业微信截图_8003e172-662e-4f4c-8965-912b51648f2f]

panic error:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x33dda1f]

goroutine 395163 gp=0xc0286a2700 m=11 mp=0xc001980008 [running]:
panic({0x4697980?, 0x83709d0?})
    /usr/local/go/src/runtime/panic.go:804 +0x168 fp=0xc00ff77718 sp=0xc00ff77668 pc=0x47bb68
runtime.panicmem(...)
    /usr/local/go/src/runtime/panic.go:262
runtime.sigpanic()
    /usr/local/go/src/runtime/signal_unix.go:900 +0x359 fp=0xc00ff77778 sp=0xc00ff77718 pc=0x47e219
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*CNObjectHandle).prefetch(0xc0669cc180, {0x5814f18, 0xc0312f73b0})
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:238 +0x4bf fp=0xc00ff778f0 sp=0xc00ff77778 pc=0x33dda1f
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*CNObjectHandle).Next(0xc0669cc180, {0x5814f18?, 0xc0312f73b0?}, 0xc00ff77a38, 0xc00803c380)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:265 +0x7c fp=0xc00ff779d0 sp=0xc00ff778f0 pc=0x33ddd7c
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*CNObjectHandle).QuickNext(...)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:315
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*baseHandle).QuickNext(0xc06488a240, {0x5814f18, 0xc0312f73b0}, 0xc008c67a38, 0xc00803c380)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:642 +0x185 fp=0xc00ff77a08 sp=0xc00ff779d0 pc=0x33e0be5
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*ChangeHandler).quickNext(0xc02bdb5d60, {0x5814f18, 0xc0312f73b0}, 0xc00803c380)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:783 +0x4c fp=0xc00ff77a50 sp=0xc00ff77a08 pc=0x33e1e4c
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*ChangeHandler).Next(0xc02bdb5d60, {0x5814f18, 0xc0312f73b0}, 0xc00803c380)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:804 +0x238 fp=0xc00ff77b38 sp=0xc00ff77a50 pc=0x33e2178
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).readTableWithTxn(0xc0345e0b40, {0x5814f18, 0xc0312f73b0}, {0x58de5e8, 0xc02c51e008}, 0xc039a00720, 0xc03be1d3b0)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:251 +0x682 fp=0xc00ff77dd0 sp=0xc00ff77b38 pc=0x398e802
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).readTable(0xc0345e0b40, {0x5814f18, 0xc0312f73b0}, 0xc03be1d3b0)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:153 +0x226 fp=0xc00ff77ec0 sp=0xc00ff77dd0 pc=0x398e006
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).Run(0xc0345e0b40, {0x5814f18, 0xc0312f73b0}, 0xc03be1d3b0)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:111 +0x174 fp=0xc00ff77fb0 sp=0xc00ff77ec0 pc=0x398da94
github.com/matrixorigin/matrixone/pkg/frontend.(*CdcTask).addExecPipelineForTable.gowrap1()
    /go/src/github.com/matrixorigin/matrixone/pkg/frontend/cdc.go:1424 +0x31 fp=0xc00ff77fe0 sp=0xc00ff77fb0 pc=0x3a29891
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00ff77fe8 sp=0xc00ff77fe0 pc=0x484c21
created by github.com/matrixorigin/matrixone/pkg/frontend.(*CdcTask).addExecPipelineForTable in goroutine 394912
    /go/src/github.com/matrixorigin/matrixone/pkg/frontend/cdc.go:1424 +0x3e5

mo log: https://shanghai.idc.matrixorigin.cn:30001/explore?panes=%7B%22M2q%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-cdc-test%5C%22%7D%20%7C%3D%20%60panic%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221729605651000%22,%22to%22:%221729607451000%22%7D%7D%7D&schemaVersion=1&orgId=1