Closed hulucc closed 1 year ago
I believe this is a misuse on my part, but I'm not sure how to solve it. I have three NodeHosts and one shard. After a reboot, the program calls StartCluster to recover the shard and then panics. Any ideas would be appreciated.
StartCluster
shard config
return config.Config{
    NodeID:              nodeID,
    ClusterID:           shardID,
    CheckQuorum:         true,
    ElectionRTT:         10,
    HeartbeatRTT:        2,
    SnapshotEntries:     100,
    CompactionOverhead:  0,
    OrderedConfigChange: false,
}
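A side note on the timing values: the startup log in this report contains the warning "ElectionRTT is not a magnitude larger than HeartbeatRTT". The sketch below shows the kind of check that would flag the posted values. It is only an illustration: `shardConfig` and `electionRTTLooksSmall` are hypothetical names, not Dragonboat APIs, and the factor of ten is an assumption about what "a magnitude larger" means. The warning is probably unrelated to the panic itself.

```go
package main

import "fmt"

// shardConfig is a hypothetical stand-in for the two timing fields of the
// config.Config value shown above. It is not the Dragonboat type.
type shardConfig struct {
	ElectionRTT  uint64
	HeartbeatRTT uint64
}

// electionRTTLooksSmall reports whether ElectionRTT is less than ten times
// HeartbeatRTT. The factor of ten is an assumption about what the warning
// means by "a magnitude larger".
func electionRTTLooksSmall(c shardConfig) bool {
	return c.ElectionRTT < 10*c.HeartbeatRTT
}

func main() {
	// Values from the posted shard config.
	posted := shardConfig{ElectionRTT: 10, HeartbeatRTT: 2}
	fmt.Println(electionRTTLooksSmall(posted))

	// A hypothetical adjustment that keeps HeartbeatRTT and raises ElectionRTT.
	adjusted := shardConfig{ElectionRTT: 20, HeartbeatRTT: 2}
	fmt.Println(electionRTTLooksSmall(adjusted))
}
```

Under that assumption, the posted values (10 and 2) trigger the check, while an ElectionRTT of 20 with the same HeartbeatRTT does not.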
Dragonboat version
v3.3.5
Expected behavior
no panics
Actual behavior
random panics after restart
[INFO] DataVersion: 60abf29917f0816141585d494a856b8aae121c1cd26aa1c81b5ec55147c85f0e
[INFO] CodeVersion: 60abf29917f0816141585d494a856b8aae121c1cd26aa1c81b5ec55147c85f0e
2022-11-28 06:29:13.197228 I | dragonboat: go version: go1.19, linux/amd64
2022-11-28 06:29:13.197348 I | dragonboat: dragonboat version: 3.3.5 (Rel)
2022-11-28 06:29:13.197400 I | config: using default EngineConfig
2022-11-28 06:29:13.197571 I | config: using default LogDBConfig
2022-11-28 06:29:13.197770 I | dragonboat: DeploymentID set to 1
2022-11-28 06:29:13.207918 I | dragonboat: LogDB info received, shard 0, busy false
2022-11-28 06:29:13.218901 I | dragonboat: LogDB info received, shard 1, busy false
2022-11-28 06:29:13.228163 I | dragonboat: LogDB info received, shard 2, busy false
2022-11-28 06:29:13.237630 I | dragonboat: LogDB info received, shard 3, busy false
2022-11-28 06:29:13.246292 I | dragonboat: LogDB info received, shard 4, busy false
2022-11-28 06:29:13.254084 I | dragonboat: LogDB info received, shard 5, busy false
2022-11-28 06:29:13.268796 I | dragonboat: LogDB info received, shard 6, busy false
2022-11-28 06:29:13.279942 I | dragonboat: LogDB info received, shard 7, busy false
2022-11-28 06:29:13.288599 I | dragonboat: LogDB info received, shard 8, busy false
2022-11-28 06:29:13.301422 I | dragonboat: LogDB info received, shard 9, busy false
2022-11-28 06:29:13.311992 I | dragonboat: LogDB info received, shard 10, busy false
2022-11-28 06:29:13.326616 I | dragonboat: LogDB info received, shard 11, busy false
2022-11-28 06:29:13.344879 I | dragonboat: LogDB info received, shard 12, busy false
2022-11-28 06:29:13.363999 I | dragonboat: LogDB info received, shard 13, busy false
2022-11-28 06:29:13.374470 I | dragonboat: LogDB info received, shard 14, busy false
2022-11-28 06:29:13.385537 I | logdb: using plain logdb
2022-11-28 06:29:13.385605 I | dragonboat: LogDB info received, shard 15, busy false
2022-11-28 06:29:13.386022 I | dragonboat: logdb memory limit: 8192 MBytes
2022-11-28 06:29:13.386047 I | dragonboat: NodeHost ID: nhid-14008783967022962185
2022-11-28 06:29:13.386051 I | dragonboat: using regular node registry
2022-11-28 06:29:13.386058 I | dragonboat: filesystem error injection mode enabled: false
2022-11-28 06:29:13.386509 I | transport: transport type: go-tcp-transport
2022-11-28 06:29:13.388798 I | dragonboat: transport type: go-tcp-transport
2022-11-28 06:29:13.388811 I | dragonboat: logdb type: sharded-pebble
2022-11-28 06:29:13.388816 I | dragonboat: nodehost address: moxa-2.moxa-headless.temp.svc.cluster.local:63000
[INFO] CodeVersion match with DataVersion, skip migration
[INFO] shardmanager: decided to recovery shard 0
2022-11-28 06:29:13.394448 I | dragonboat: [00000:62185] replaying raft logs
2022-11-28 06:29:13.395031 I | dragonboat: [00000:62185] has logdb entries size 0 commit 27 term 75
2022-11-28 06:29:13.395161 I | raft: [00000:62185] created, initial: false, new: false
2022-11-28 06:29:13.395247 W | config: ElectionRTT is not a magnitude larger than HeartbeatRTT
2022-11-28 06:29:13.395346 I | raft: [00000:62185] raft log rate limit enabled: false, 0
2022-11-28 06:29:13.395445 I | raft: [f:28,l:27,t:74,c:27,a:27] [00000:62185] t75 became follower
2022-11-28 06:29:13.402036 I | dragonboat: [00000:62185] recovered from <00000:62185:27>
2022-11-28 06:29:13.402199 I | dragonboat: [00000:62185] initialized using <00000:62185:27>
2022-11-28 06:29:13.402214 I | dragonboat: [00000:62185] initial index set to 27
2022-11-28 06:29:14.397591 W | dragonboat: StaleRead called, linearizability not guaranteed for stale read
[INFO] OnSubShardsUpdating shards version 0 -> 2
[INFO] ShardSpecChangingWorker 0 shard changes detected
2022-11-28 06:29:14.511861 W | raft: [f:28,l:27,t:74,c:27,a:27] [00000:62185] t75 received Heartbeat with higher term (76) from n97714
2022-11-28 06:29:14.511985 W | raft: [f:28,l:27,t:74,c:27,a:27] [00000:62185] t75 become follower after receiving higher term from n97714
2022-11-28 06:29:14.512013 I | raft: [f:28,l:27,t:74,c:27,a:27] [00000:62185] t76 became follower
2022-11-28 06:29:14.512020 C | raft: invalid commitTo index 28, lastIndex() 27
panic: invalid commitTo index 28, lastIndex() 27
goroutine 730 [running]:
github.com/lni/goutils/logutil/capnslog.(*PackageLogger).Panicf(0xc000204000?, {0xe1edcf?, 0xc00019e880?}, {0xc0010f6ba0?, 0xc00021a8d0?, 0xc000731e30?})
    github.com/lni/goutils@v1.3.0/logutil/capnslog/pkg_logger.go:88 +0xbb
github.com/lni/dragonboat/v3/logger.(*capnsLog).Panicf(0xc00021a8d0?, {0xe1edcf?, 0x40d947?}, {0xc0010f6ba0?, 0xca5220?, 0x1?})
    github.com/lni/dragonboat/v3@v3.3.5/logger/capnslogger.go:74 +0x26
github.com/lni/dragonboat/v3/logger.(*dragonboatLogger).Panicf(0xc000192410?, {0xe1edcf, 0x29}, {0xc0010f6ba0, 0x2, 0x2})
    github.com/lni/dragonboat/v3@v3.3.5/logger/logger.go:132 +0x57
github.com/lni/dragonboat/v3/internal/raft.(*entryLog).commitTo(0xc00036d2d0, 0x1c)
    github.com/lni/dragonboat/v3@v3.3.5/internal/raft/logentry.go:328 +0x102
github.com/lni/dragonboat/v3/internal/raft.(*raft).handleHeartbeatMessage(_, {0x11, 0xc26932cbd9975609, 0xfe93de3932274132, 0x0, 0x4c, 0x0, 0x0, 0x1c, 0x0, ...})
    github.com/lni/dragonboat/v3@v3.3.5/internal/raft/raft.go:1317 +0x45
github.com/lni/dragonboat/v3/internal/raft.(*raft).handleFollowerHeartbeat(_, {0x11, 0xc26932cbd9975609, 0xfe93de3932274132, 0x0, 0x4c, 0x0, 0x0, 0x1c, 0x0, ...})
    github.com/lni/dragonboat/v3@v3.3.5/internal/raft/raft.go:1933 +0x85
github.com/lni/dragonboat/v3/internal/raft.defaultHandle(_, {0x11, 0xc26932cbd9975609, 0xfe93de3932274132, 0x0, 0x4c, 0x0, 0x0, 0x1c, 0x0, ...})
    github.com/lni/dragonboat/v3@v3.3.5/internal/raft/raft.go:2098 +0x95
github.com/lni/dragonboat/v3/internal/raft.(*raft).Handle(_, {0x11, 0xc26932cbd9975609, 0xfe93de3932274132, 0x0, 0x4c, 0x0, 0x0, 0x1c, 0x0, ...})
    github.com/lni/dragonboat/v3@v3.3.5/internal/raft/raft.go:1483 +0x27f
github.com/lni/dragonboat/v3/internal/raft.(*Peer).Handle(_, {0x11, 0xc26932cbd9975609, 0xfe93de3932274132, 0x0, 0x4c, 0x0, 0x0, 0x1c, 0x0, ...})
    github.com/lni/dragonboat/v3@v3.3.5/internal/raft/peer.go:195 +0x185
github.com/lni/dragonboat/v3.(*node).handleReceivedMessages(0xc000310200)
    github.com/lni/dragonboat/v3@v3.3.5/node.go:1275 +0x358
github.com/lni/dragonboat/v3.(*node).handleEvents(0xc000310200)
    github.com/lni/dragonboat/v3@v3.3.5/node.go:1133 +0x73
github.com/lni/dragonboat/v3.(*node).stepNode(_)
    github.com/lni/dragonboat/v3@v3.3.5/node.go:1111 +0x150
github.com/lni/dragonboat/v3.(*engine).processSteps(0xc000329360, 0xc000733da8?, 0xc000733e38?, 0xc0012c5920, {0x1533328, 0x1?, 0x0}, 0xc0000d9bc0?)
    github.com/lni/dragonboat/v3@v3.3.5/engine.go:1279 +0x265
github.com/lni/dragonboat/v3.(*engine).stepWorkerMain(0xc000329360, 0x1)
    github.com/lni/dragonboat/v3@v3.3.5/engine.go:1215 +0x2be
github.com/lni/dragonboat/v3.newExecEngine.func1()
    github.com/lni/dragonboat/v3@v3.3.5/engine.go:1017 +0x68
github.com/lni/goutils/syncutil.(*Stopper).runWorker.func1()
    github.com/lni/goutils@v1.3.0/syncutil/stopper.go:80 +0xc5
created by github.com/lni/goutils/syncutil.(*Stopper).runWorker
    github.com/lni/goutils@v1.3.0/syncutil/stopper.go:75 +0xea
Steps to reproduce the behavior
restart the program over and over
Wrong use of tools.ImportSnapshot, never mind.
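For anyone who lands here with the same message, here is a rough model of the invariant that fired, as far as it can be read from the log. The replica restarted from snapshot index 27 with no log entries after it (`f:28,l:27`), and the first heartbeat from the new leader carried commit index 28 (`0x1c` in the stack trace). A follower cannot commit an entry it does not hold, so `commitTo` refuses. The `entryLog` type below is a hypothetical, heavily simplified stand-in and not Dragonboat's internal implementation. It returns an error where the real code panics.

```go
package main

import "fmt"

// entryLog is a hypothetical, simplified model of a follower's Raft log.
// It is not Dragonboat's internal type.
type entryLog struct {
	lastIndex uint64 // index of the last entry (or snapshot) held locally
	committed uint64 // highest index known to be committed
}

// commitTo advances the commit index. It rejects an index beyond lastIndex,
// because the follower would be committing an entry it does not have.
// This sketch returns an error for that case instead of panicking.
func (l *entryLog) commitTo(index uint64) error {
	if index <= l.committed {
		return nil // the commit index never moves backwards
	}
	if index > l.lastIndex {
		return fmt.Errorf("invalid commitTo index %d, lastIndex() %d", index, l.lastIndex)
	}
	l.committed = index
	return nil
}

func main() {
	// State read from the log: recovered from snapshot index 27, no entries after it.
	follower := &entryLog{lastIndex: 27, committed: 27}

	// The leader's heartbeat asked the follower to commit index 28.
	if err := follower.commitTo(28); err != nil {
		fmt.Println(err)
	}
}
```

In other words, the leader appears to believe this replica already holds index 28 while its local storage ends at 27. That would fit local state having been replaced by an older snapshot, although the log alone does not prove it.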