Open aressu1985 opened 11 months ago
In standalone mode this happens because of heavy disk IO usage.
No progress
No progress
On 12.18 the standalone regression reproduced this problem. Job: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/7249177604/job/19746773876
mo log:
The log is too large, so I will send it privately. CPU and memory usage at the time: memory was almost completely used up.
profile: profile (3).tar.gz
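Changing the severity to s1 for now; will handle this issue when there is time.
Changed to s0, because some users are running the standalone version.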
No progress
No progress
No progress
No progress
No progress
No progress
No progress
No progress
No progress
pengzhen@pengzhen:~/Documents/temp/matrixone-temp$ ./mo-service -debug-http 127.0.0.1:6060 -launch etc/launch/launch.toml > log.txt
2024/05/08 15:15:45 maxprocs: Leaving GOMAXPROCS=16: CPU quota undefined
[mysql] 2024/05/08 15:40:08 packets.go:37: read tcp 127.0.0.1:58362->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:40:25 packets.go:37: read tcp 127.0.0.1:55816->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:41:08 packets.go:37: read tcp 127.0.0.1:38246->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:41:10 packets.go:37: read tcp 127.0.0.1:38252->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:41:20 packets.go:37: read tcp 127.0.0.1:50096->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:41:40 packets.go:37: read tcp 127.0.0.1:32986->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:42:08 packets.go:37: read tcp 127.0.0.1:59418->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:42:10 packets.go:37: read tcp 127.0.0.1:59422->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:42:20 packets.go:37: read tcp 127.0.0.1:34220->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:42:30 packets.go:37: read tcp 127.0.0.1:60932->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:42:40 packets.go:37: read tcp 127.0.0.1:47034->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:42:50 packets.go:37: read tcp 127.0.0.1:59082->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:43:13 packets.go:37: read tcp 127.0.0.1:45292->127.0.0.1:6001: i/o timeout
[mysql] 2024/05/08 15:43:15 packets.go:37: read tcp 127.0.0.1:36676->127.0.0.1:6001: i/o timeout
panic: internal error: driver info: retry time out [recovered]
    panic: internal error: driver info: retry time out
goroutine 66734 [running]:
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/logstore/driver/logservicedriver.NewLogServiceDriver.func1({0x3db80a0?, 0xc058ad0010?})
    /home/pengzhen/Documents/temp/matrixone-temp/pkg/vm/engine/tae/logstore/driver/logservicedriver/driver.go:89 +0x1d
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
    /home/pengzhen/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.4/worker.go:54 +0x6d
panic({0x3db80a0?, 0xc058ad0010?})
    /usr/local/go/src/runtime/panic.go:914 +0x21f
go.uber.org/zap/zapcore.CheckWriteAction.OnWrite(0x0?, 0x77609c0?, {0x0?, 0x0?, 0xc04f728980?})
    /home/pengzhen/go/pkg/mod/go.uber.org/zap@v1.24.0/zapcore/entry.go:198 +0x54
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc046bb1e10, {0x0, 0x0, 0x0})
    /home/pengzhen/go/pkg/mod/go.uber.org/zap@v1.24.0/zapcore/entry.go:264 +0x3ec
go.uber.org/zap.(*Logger).Panic(0xc01bb70720?, {0xc000127680?, 0x0?}, {0x0, 0x0, 0x0})
    /home/pengzhen/go/pkg/mod/go.uber.org/zap@v1.24.0/logger.go:258 +0x51
github.com/matrixorigin/matrixone/pkg/logutil.Panic({0xc000127680?, 0x21?}, {0x0?, 0x1?, 0x1?})
    /home/pengzhen/Documents/temp/matrixone-temp/pkg/logutil/api.go:41 +0x85
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/logstore/driver/logservicedriver.(*driverAppender).append(0xc050a44380, 0xc0082fdfa8?, 0x2540be400)
    /home/pengzhen/Documents/temp/matrixone-temp/pkg/vm/engine/tae/logstore/driver/logservicedriver/appender.go:104 +0x84f
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/logstore/driver/logservicedriver.(*LogServiceDriver).onAppendQueue.func1()
    /home/pengzhen/Documents/temp/matrixone-temp/pkg/vm/engine/tae/logstore/driver/logservicedriver/append.go:67 +0x27
github.com/panjf2000/ants/v2.(*goWorker).run.func1()
    /home/pengzhen/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.4/worker.go:67 +0x8d
created by github.com/panjf2000/ants/v2.(*goWorker).run in goroutine 1730
    /home/pengzhen/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.4/worker.go:48 +0x5c
TPC-C with 10 warehouses and 10 concurrent connections. It ran for about half an hour.
Is there an existing issue for the same bug?
Environment
Actual Behavior
the panic log:
{"level":"ERROR","time":"2023/10/20 01:36:06.449989 +0800","name":"hakeeper-client-backend","caller":"morpc/backend.go:545","msg":"read loop stopped","remote":"127.0.0.1:32001","backend-id":"d52bceb7-0b5e-4328-9d7b-f5079de2ead7"}
{"level":"INFO","time":"2023/10/20 01:36:06.442867 +0800","name":"rpc-client[hakeeper-client([connectToHAKeeper])]","caller":"morpc/client.go:343","msg":"gc idle backends task started"}
{"level":"INFO","time":"2023/10/20 01:36:06.431010 +0800","caller":"disttae/txn.go:668","msg":"transaction commit: 1cf17d9075444cd6ae64da4386842d1a/Active/S:1697730556552701185-1\n"}
{"level":"WARN","time":"2023/10/20 01:36:06.371792 +0800","name":"gossip","caller":"registry/gossip_logger.go:44","msg":"memberlist: Failed to push local state: write tcp 127.0.0.1:32002->127.0.0.1:46712: i/o timeout from=127.0.0.1:46712"}
{"level":"WARN","time":"2023/10/20 01:36:06.478544 +0800","name":"dragonboat","caller":"v4@v4.0.0-20230426084722-d189534f8004/node.go:1398","msg":"[00000:31072] had 12 LocalTick msgs in one batch"}
panic: internal error: driver info: retry time out [recovered]
    panic: internal error: driver info: retry time out
goroutine 5103827 [running]:
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/logstore/driver/logservicedriver.NewLogServiceDriver.func1({0x31376a0?, 0xc1ec735010?})
    /mnt/datadisk0/actions-runner/_work/mo-nightly-regression/mo-nightly-regression/matrixone/pkg/vm/engine/tae/logstore/driver/logservicedriver/driver.go:89 +0x25
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
    /home/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.4/worker.go:54 +0x75
panic({0x31376a0, 0xc1ec735010})
    /usr/local/go/src/runtime/panic.go:884 +0x213
go.uber.org/zap/zapcore.CheckWriteAction.OnWrite(0x0?, 0x5899e80?, {0x0?, 0x0?, 0xc0a36e80e0?})
    /home/go/pkg/mod/go.uber.org/zap@v1.24.0/zapcore/entry.go:198 +0x65
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc331846680, {0x0, 0x0, 0x0})
    /home/go/pkg/mod/go.uber.org/zap@v1.24.0/zapcore/entry.go:264 +0x3ec
go.uber.org/zap.(*Logger).Panic(0xc2a48a9400?, {0xc00064cab0?, 0x0?}, {0x0, 0x0, 0x0})
    /home/go/pkg/mod/go.uber.org/zap@v1.24.0/logger.go:258 +0x59
github.com/matrixorigin/matrixone/pkg/logutil.Panic({0xc00064cab0?, 0x21?}, {0x0?, 0x1?, 0x1?})
    /mnt/datadisk0/actions-runner/_work/mo-nightly-regression/mo-nightly-regression/matrixone/pkg/logutil/api.go:41 +0x8b
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/logstore/driver/logservicedriver.(*driverAppender).append(0xc183b05e80, 0xc00bd80fa8?, 0x2540be400)
    /mnt/datadisk0/actions-runner/_work/mo-nightly-regression/mo-nightly-regression/matrixone/pkg/vm/engine/tae/logstore/driver/logservicedriver/appender.go:101 +0x7e5
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/logstore/driver/logservicedriver.(*LogServiceDriver).onAppendQueue.func1()
    /mnt/datadisk0/actions-runner/_work/mo-nightly-regression/mo-nightly-regression/matrixone/pkg/vm/engine/tae/logstore/driver/logservicedriver/append.go:67 +0x2d
github.com/panjf2000/ants/v2.(*goWorker).run.func1()
    /home/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.4/worker.go:67 +0x97
created by github.com/panjf2000/ants/v2.(*goWorker).run
    /home/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.4/worker.go:48 +0x65
the whole log: mo-service-panic.tar.gz
Expected Behavior
No response
Steps to Reproduce
Additional information
No response