matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0

[Bug]: cn crashed by panic: runtime error: invalid memory address or nil pointer dereference during tpcc and sysbench on daily regression TKE #19046

Closed aressu1985 closed 1 month ago

aressu1985 commented 2 months ago

Is there an existing issue for the same bug?

Branch Name

main

Commit ID

49668ee5dd2e89379d436bcff5921d34bacdb2de

Other Environment Information

- Hardware parameters:
3*CN: 16C 64G
1*DN: 16C 64G
3*LOG: 4C 16G
3*PROXY: 3C 7G
- OS type:
- Others:

Actual Behavior

job link: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/11054477372/job/30720931583

mo-log: https://grafana.ci.matrixorigin.cn/explore?panes=%7B%22IiS%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-main-nightly-49668ee5d-20240926%5C%22%7D%20%7C%3D%20%60panic%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%22now-12h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1

Expected Behavior

No response

Steps to Reproduce

DAILY REGRESSION

Additional information

No response

Ariznawlll commented 2 months ago

128 regression commit:ca577e56c97e464ba0ac6a89137470c62c94327f

job url: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/11072489250/job/30767057745

mo first restarted at 2024-09-28 04:00:56 (UTC+8).

Log before the restart: https://shanghai.idc.matrixorigin.cn:30001/explore?panes=%7B%22aRY%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bhost%3D%5C%2210-222-1-128%5C%22,%20filename%3D%5C%22%2Fdata1%2Frunners%2Faction-runner%2F_work%2Fmo-nightly-regression%2Fmo-nightly-regression%2Fhead%2Fmo-service-ca577e5-20240927-223037.log%5C%22%7D%20%7C%3D%20%60panic%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221727449200000%22,%22to%22:%221727467800000%22%7D%7D%7D&schemaVersion=1&orgId=1

Stack/heap dumps:

Before the restart (UTC+8 2024-09-28 04:00:00): LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_heap_01923511-f08e-7945-a70f-fd85f73d8158.gz

After the restart (UTC+8 2024-09-28 04:07:00): LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_heap_01923518-b559-7d1a-8230-166a42bd92bc.gz

xzxiong commented 2 months ago

The following is the relevant information:

https://grafana.ci.matrixorigin.cn/goto/geyLaWzNR?orgId=1

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x27258f5]

goroutine 246 gp=0xc0004856c0 m=31 mp=0xc0106d1c08 [running]:
panic({0x44a3a80?, 0x7ffc8f0?})
    /usr/local/go/src/runtime/panic.go:804 +0x168 fp=0xc00828cbd8 sp=0xc00828cb28 pc=0x47b268
runtime.panicmem(...)
    /usr/local/go/src/runtime/panic.go:262
runtime.sigpanic()
    /usr/local/go/src/runtime/signal_unix.go:900 +0x359 fp=0xc00828cc38 sp=0xc00828cbd8 pc=0x47d919
github.com/matrixorigin/matrixone/pkg/util/trace/impl/motrace.(*ItemSyncer).GetTable(0xc1af991f50?)
    /go/src/github.com/matrixorigin/matrixone/pkg/util/trace/impl/motrace/syncer.go:65 +0x15 fp=0xc00828cc50 sp=0xc00828cc38 pc=0x27258f5
github.com/matrixorigin/matrixone/pkg/util/trace/impl/motrace.(*ContentBuffer).Add(0xc32006a880, {0x5561ec0, 0xc1af991f50})
    /go/src/github.com/matrixorigin/matrixone/pkg/util/trace/impl/motrace/buffer_content.go:117 +0x52e fp=0xc00828cdd8 sp=0xc00828cc50 pc=0x271482e
github.com/matrixorigin/matrixone/pkg/util/export.(*bufferHolder).Add(0xc0005b4000, {0x5561ec0, 0xc1af991f50})
    /go/src/github.com/matrixorigin/matrixone/pkg/util/export/batch_processor.go:183 +0x123 fp=0xc00828ce20 sp=0xc00828cdd8 pc=0x3a1aa63
github.com/matrixorigin/matrixone/pkg/util/export.(*MOCollector).doCollect(0xc00836e000, 0x1)
    /go/src/github.com/matrixorigin/matrixone/pkg/util/export/batch_processor.go:590 +0x3e5 fp=0xc00828cfc0 sp=0xc00828ce20 pc=0x3a1cc05
github.com/matrixorigin/matrixone/pkg/util/export.(*MOCollector).Start.gowrap3()
    /go/src/github.com/matrixorigin/matrixone/pkg/util/export/batch_processor.go:511 +0x25 fp=0xc00828cfe0 sp=0xc00828cfc0 pc=0x3a1c625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00828cfe8 sp=0xc00828cfe0 pc=0x484321
created by github.com/matrixorigin/matrixone/pkg/util/export.(*MOCollector).Start in goroutine 244
    /go/src/github.com/matrixorigin/matrixone/pkg/util/export/batch_processor.go:511 +0x125
xzxiong commented 2 months ago

128 regression commit:ca577e56c97e464ba0ac6a89137470c62c94327f

job url: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/11072489250/job/30767057745

mo first restarted at 2024-09-28 04:00:56 (UTC+8).

Log before the restart: https://shanghai.idc.matrixorigin.cn:30001/explore?panes=%7B%22aRY%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bhost%3D%5C%2210-222-1-128%5C%22,%20filename%3D%5C%22%2Fdata1%2Frunners%2Faction-runner%2F_work%2Fmo-nightly-regression%2Fmo-nightly-regression%2Fhead%2Fmo-service-ca577e5-20240927-223037.log%5C%22%7D%20%7C%3D%20%60panic%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221727449200000%22,%22to%22:%221727467800000%22%7D%7D%7D&schemaVersion=1&orgId=1

Stack/heap dumps:

Before the restart (UTC+8 2024-09-28 04:00:00): LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_heap_01923511-f08e-7945-a70f-fd85f73d8158.gz

After the restart (UTC+8 2024-09-28 04:07:00): LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_heap_01923518-b559-7d1a-8230-166a42bd92bc.gz

To be confirmed; at first glance this looks unrelated to this issue (the panic is shown below).

企业微信截图_fa49d6f2-0264-47d1-8da2-32482d25436c
xzxiong commented 2 months ago

working on

xzxiong commented 1 month ago

working on

xzxiong commented 1 month ago

root cause: the item object was released prematurely while it was still in use, which caused the nil pointer panic. merged

aressu1985 commented 1 month ago

fixed