matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0

[Bug]: panic: cannot allocate memory during TPCC 1000-1000 on standalone mode but the real memory is not #19596

Open aressu1985 opened 2 days ago

aressu1985 commented 2 days ago

Is there an existing issue for the same bug?

Branch Name

main

Commit ID

57098bef4e96af0b66655e8eba28a012bbad7566

Other Environment Information

- Hardware parameters: 64C 256G
- OS type:
- Others:

Actual Behavior

job link: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/11501553064/job/32023695611

image

Memory:

image

dashboard link: https://shanghai.idc.matrixorigin.cn:30001/d/rYdddlPWk/node-exporter-full?orgId=1&var-datasource=prometheus-standalone&var-job=agents&var-node=10.222.1.128:9100&var-diskdevices=%5Ba-z%5D%2B%7Cnvme%5B0-9%5D%2Bn%5B0-9%5D%2B%7Cmmcblk%5B0-9%5D%2B&from=1729825200000&to=1729828799000

mo-log:
{"level":"INFO","time":"2024/10/25 11:13:59.099198 +0800","name":"cn-service","caller":"frontend/status_stmt.go:167","msg":"time of Exec.Run : 3.395500144s","service":"dd1dccb4-4d3c-41f8-b482-5251dc7a41bf","uuid":"dd1dccb4-4d3c-41f8-b482-5251dc7a41bf","session_info":"connectionId 91913|127.0.0.1:44866|account sys:dump|goRoutineId 86634325|migrate-goRoutineId 0|0192c1a1-1e90-7925-bf2e-e04076ec0706","role":"moadmin","session_id":"0192c1a1-1e90-7925-bf2e-e04076ec0706","statement_id":"0192c1aa-8759-7699-b8d6-b145d3c08836","txn_id":"70f70f4866a9b62a180169d93abc543f","span":{"trace_id":"a8b1cb32-a70a-0a5b-a1cb-3dbd6b6fb04e","span_id":"0c54cd7d94512d4e"}}
{"level":"INFO","time":"2024/10/25 11:13:59.331773 +0800","caller":"disttae/txn.go:1239","msg":"Transaction.Rollback","txn":"70f70f4866a9b62a180169d93abc5496"}
panic: cannot allocate memory

goroutine 86634604 [running]:
github.com/matrixorigin/matrixone/pkg/common/malloc.NewFixedSizeMmapAllocator.func1(0x521c46?, 0xc10cf76990)
    /data1/runners/action-runner/_work/mo-nightly-regression/mo-nightly-regression/head/pkg/common/malloc/fixed_size_mmap_allocator.go:110 +0x195
github.com/matrixorigin/matrixone/pkg/common/malloc.NewFixedSizeMmapAllocator.NewClosureDeallocatorPool[...].func2.1(0xc657b64c60?)
    /data1/runners/action-runner/_work/mo-nightly-regression/mo-nightly-regression/head/pkg/common/malloc/closure_deallocator.go:59 +0x33
github.com/matrixorigin/matrixone/pkg/common/malloc.(ClosureDeallocator[...]).Deallocate(...)
    /data1/runners/action-runner/_work/mo-nightly-regression/mo-nightly-regression/head/pkg/common/malloc/closure_deallocator.go:34
github.com/matrixorigin/matrixone/pkg/common/malloc.(chainDeallocator).Deallocate(0xc992c97188, 0x0)
    /data1/runners/action-runner/_work/mo-nightly-regression/mo-nightly-regression/head/pkg/common/malloc/chain_deallocator.go:27 +0x43
github.com/matrixorigin/matrixone/pkg/common/malloc.(managedAllocatorShard).deallocate(0xc023bbc208, 0x7fbb16111000, 0x0)
    /data1/runners/action-runner/_work/mo-nightly-regression/mo-nightly-regression/head/pkg/common/malloc/managed_allocator.go:77 +0xa3
github.com/matrixorigin/matrixone/pkg/common/malloc.(ManagedAllocator[...]).Deallocate(0x50ab14d?, {0x7fbb16111000?, 0xc000275440?, 0x59cdea0?}, 0xc367de7bf0?)
    /data1/runners/action-runner/_work/mo-nightly-regression/mo-nightly-regression/head/pkg/common/malloc/managed_allocator.go:61 +0x70
github.com/matrixorigin/matrixone/pkg/common/mpool.(MPool).Free(0xc444fc4fc0, {0x7fbb16111018, 0xc0263cd538?, 0x31750cb?})
    /data1/runners/action-runner/_work/mo-nightly-regression/mo-nightly-regression/head/pkg/common/mpool/mpool.go:647 +0x1ca
github.com/matrixorigin/matrixone/pkg/container/vector.(Vector).Free(0xc37e6ac780, 0xc444fc4fc0)
    /data1/runners/action-runner/_work/mo-nightly-regression/mo-nightly-regression/head/pkg/container/vector/vector.go:583 +0x3e
github.com/matrixorigin/matrixone/pkg/container/vector.(FunctionResult[...]).Free(...)
    /data1/runners/action-runner/_work/mo-nightly-regression/mo-nightly-regression/head/pkg/container/vector/functionTools.go:622
github.com/matrixorigin/matrixone/pkg/sql/colexec.(FunctionExpressionExecutor).Free(0xc2647700d0)
    /data1/runners/action-runner/_work/mo-nightly-regression/mo-nightly-regression/head/pkg/sql/colexec/evalExpression.go:633 +0x3c

oom_panic.tar.gz

Expected Behavior

No response

Steps to Reproduce

RUN TPCC 1000-10000

Additional information

No response

volgariver6 commented 2 days ago

@reusee will help with this issue

reusee commented 1 day ago

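Judging from the graph, there is a slow leak; my guess is that the minio SDK is the cause. Once this PR is merged, let's check whether it still reproduces: https://github.com/matrixorigin/matrixone/pull/19573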
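
When re-running the test after the PR, one way to check for a slow leak without reading it off the dashboard is to sample the process's resident memory and fit a growth rate. The following is only a minimal sketch, not part of MatrixOne or its tooling: it assumes a Linux host where `/proc/<pid>/status` is readable, and the helper names (`parse_vm_rss_kib`, `read_rss_kib`, `growth_per_second`) are hypothetical.

```python
import re


def parse_vm_rss_kib(status_text):
    """Return VmRSS in KiB from the text of /proc/<pid>/status, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid):
    """Read the current resident set size of a process (Linux only; assumption)."""
    with open(f"/proc/{pid}/status", encoding="ascii", errors="replace") as f:
        return parse_vm_rss_kib(f.read())


def growth_per_second(samples):
    """Least-squares slope of (seconds, value) samples.

    A slope that stays positive over a long run is consistent with a slow
    leak; it does not by itself identify which component is leaking.
    """
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    denom = sum((t - mean_t) ** 2 for t, _ in samples)
    if denom == 0:
        return 0.0
    numer = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    return numer / denom
```

A caller would collect `(elapsed_seconds, read_rss_kib(pid))` pairs at a fixed interval during the TPCC run and pass the list to `growth_per_second`; comparing the slope before and after the PR gives a rough signal of whether the leak is still present.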