etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0

etcd3.5.0: panic: tocommit(458) is out of range [lastIndex(3)]. Was the raft log corrupted, truncated, or lost? #13509

Closed. redriverhong closed this issue 2 years ago

redriverhong commented 2 years ago

I got a panic with version 3.5.0 when I stopped a member, removed its data directory, and started it again:

panic: tocommit(458) is out of range [lastIndex(3)]. Was the raft log corrupted, truncated, or lost?

goroutine 163 [running]:
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc0001663c0, 0x0, 0x0, 0x0)
    /opt/buildtools/go_workpace/pkg/mod/go.uber.org/zap@v1.17.0/zapcore/entry.go:234 +0x58d
go.uber.org/zap.(*SugaredLogger).log(0xc0004981e0, 0xc0006ba104, 0x55a3ccd20fe1, 0x5d, 0xc00007c4c0, 0x2, 0x2, 0x0, 0x0, 0x0)
    /opt/buildtools/go_workpace/pkg/mod/go.uber.org/zap@v1.17.0/sugar.go:227 +0x115
go.uber.org/zap.(*SugaredLogger).Panicf(...)
    /opt/buildtools/go_workpace/pkg/mod/go.uber.org/zap@v1.17.0/sugar.go:159
go.etcd.io/etcd/server/v3/etcdserver.(*zapRaftLogger).Panicf(0xc000682b80, 0x55a3ccd20fe1, 0x5d, 0xc00007c4c0, 0x2, 0x2)
    /usr1/3.5.0/server/etcdserver/zap_raft.go:101 +0x7f
go.etcd.io/etcd/raft/v3.(*raftLog).commitTo(0xc0006e4310, 0x1ca)
    /usr1/3.5.0/raft/log.go:237 +0x135
go.etcd.io/etcd/raft/v3.(*raft).handleHeartbeat(0xc0004bcf20, 0x8, 0x5e92d99e003cce4, 0x507df051d12df981, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /usr1/3.5.0/raft/raft.go:1513 +0x56
go.etcd.io/etcd/raft/v3.stepFollower(0xc0004bcf20, 0x8, 0x5e92d99e003cce4, 0x507df051d12df981, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /usr1/3.5.0/raft/raft.go:1439 +0x498
go.etcd.io/etcd/raft/v3.(*raft).Step(0xc0004bcf20, 0x8, 0x5e92d99e003cce4, 0x507df051d12df981, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /usr1/3.5.0/raft/raft.go:980 +0xa55
go.etcd.io/etcd/raft/v3.(*node).run(0xc00069cf00)
    /usr1/3.5.0/raft/node.go:356 +0x798
created by go.etcd.io/etcd/raft/v3.RestartNode
    /usr1/3.5.0/raft/node.go:244 +0x330

Reproduction procedure:

  1. Start etcd on the three nodes using static bootstrapping. (The key configuration items were attached as a screenshot.)
  2. Stop a member and remove its data directory.
  3. Start this member again; it panics.

I think this is a normal operation for a three-node cluster. In 3.4.x and earlier versions, this method was often used to recover a node in a cluster whose data had been corrupted. However, it no longer works in 3.5.0.

ahrtr commented 2 years ago

It's expected behavior. If an etcd node crashes and the local data is completely gone, then the operator needs to remove the member from the cluster and then add it back again. The commands are roughly as below:

etcdctl member remove <memberId>
etcdctl member add <name> --peer-urls=https://x.x.x.x:2380

Finally, start the member again; note that you need to set --initial-cluster-state to "existing" in this case. It makes sense to require operator/human intervention in such an extreme scenario.

If a member crashes but the local data is still there, then it should be OK to start the member again directly.
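For anyone doing this programmatically instead of with etcdctl, the same remove-then-add flow is exposed by the Go client (go.etcd.io/etcd/client/v3). A minimal sketch; the endpoint, the member name "infra1", and the peer URL are placeholders for your own values:

package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Connect to the remaining healthy members (placeholder endpoint).
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://x.x.x.x:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Remove the member whose data directory was wiped (looked up by name).
	list, err := cli.MemberList(ctx)
	if err != nil {
		log.Fatal(err)
	}
	for _, m := range list.Members {
		if m.Name == "infra1" {
			if _, err := cli.MemberRemove(ctx, m.ID); err != nil {
				log.Fatal(err)
			}
		}
	}

	// Re-add it with its peer URL. The node itself must then be started with
	// --initial-cluster-state=existing so it joins the existing cluster
	// instead of bootstrapping a new one.
	if _, err := cli.MemberAdd(ctx, []string{"https://x.x.x.x:2380"}); err != nil {
		log.Fatal(err)
	}
}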

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

zhangguanzhang commented 2 years ago

etcdctl member remove <memberId>
etcdctl member add <name> --peer-urls=https://x.x.x.x:2380
M1178475702 commented 1 year ago

I noticed that a panic can occur when a crashed node restarts and receives a heartbeat message from the leader despite having lost its Raft log. I understand that an entry cannot be committed if the last index of the local log (lastIndex) is less than the index to be committed (tocommit), but I am unclear why the node should panic and exit instead of rejecting the heartbeat message and waiting for further heartbeats carrying lower commit indexes.
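For context, the check that fires here is in the raft library: the follower's heartbeat handler feeds the leader's commit index straight into raftLog.commitTo, which panics rather than rejects when that index lies beyond the local log. Lightly abridged from go.etcd.io/etcd/raft/v3 (the handleHeartbeat and log.go frames in the traces above):

// Follower side: a heartbeat carries the leader's commit index for this
// follower, and it is applied unconditionally.
func (r *raft) handleHeartbeat(m pb.Message) {
	r.raftLog.commitTo(m.Commit)
	r.send(pb.Message{To: m.From, Type: pb.MsgHeartbeatResp, Context: m.Context})
}

// commitTo refuses to move the commit index past the end of the local log;
// under raft's invariants that should never happen, so it is treated as
// data loss rather than a condition to retry.
func (l *raftLog) commitTo(tocommit uint64) {
	// never decrease commit
	if l.committed < tocommit {
		if l.lastIndex() < tocommit {
			l.logger.Panicf("tocommit(%d) is out of range [lastIndex(%d)]. Was the raft log corrupted, truncated, or lost?",
				tocommit, l.lastIndex())
		}
		l.committed = tocommit
	}
}

Because a follower only acknowledges entries it has persisted, and the leader caps the commit index it sends in a heartbeat to what the follower has already acknowledged, seeing tocommit beyond lastIndex implies that previously persisted entries have vanished, which is presumably why the library treats it as corruption instead of waiting.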

M1178475702 commented 1 year ago

And there is a further problem: if the leader has committed some entries while the node was down, then restarting the node will always fail, no matter whether the logs were lost or not.

pzhp commented 1 month ago

@M1178475702 ,

Do you have any idea how to cope with this problem?
I agree with your idea that the follower should wait for entries to reach the tocommit index instead of exiting directly.

Steps to reproduce:

  1. Three etcd nodes run normally. Back up one node's data folder (e.g. node1) with "cp -rf".
  2. Kill node1's etcd process.
  3. Write some data to the etcd cluster to advance the index.
  4. Remove node1's data folder and copy the backup data back.
  5. Start node1's etcd process. At this point, the process can be launched successfully.
  6. Kill node1's process.
  7. Remove node1's data folder and copy the backup data back again.
  8. At this point, the etcd process fails to launch.

Does the peer etcd remember the committed index of the remote etcd? (See the note after the trace below.)

panic: tocommit(216) is out of range [lastIndex(114)]. Was the raft log corrupted, truncated, or lost?

goroutine 185 [running]:
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc0006280c0, {0x0, 0x0, 0x0})
    go.uber.org/zap@v1.17.0/zapcore/entry.go:234 +0x49b
go.uber.org/zap.(*SugaredLogger).log(0xc000012570, 0x4, {0x11944b4?, 0x40df67?}, {0xc001620140?, 0xffffffffffffffff?, 0x2c?}, {0x0, 0x0, 0x0})
    go.uber.org/zap@v1.17.0/sugar.go:227 +0xee
go.uber.org/zap.(*SugaredLogger).Panicf(...)
    go.uber.org/zap@v1.17.0/sugar.go:159
go.etcd.io/etcd/server/v3/etcdserver.(*zapRaftLogger).Panicf(0xc0006ce800?, {0x11944b4?, 0xc0001f6000?}, {0xc001620140?, 0x563392?, 0x4783be?})
    go.etcd.io/etcd/server/v3/etcdserver/zap_raft.go:101 +0x49
go.etcd.io/etcd/raft/v3.(*raftLog).commitTo(0xc0002fc310, 0xd8)
    go.etcd.io/etcd/raft/v3@v3.5.11/log.go:237 +0x103
go.etcd.io/etcd/raft/v3.(*raft).handleHeartbeat(_, {0x8, 0x9602ee38cb0e22c4, 0x406775421168bf2e, 0x3, 0x0, 0x0, {0x0, 0x0, 0x0}, ...})
    go.etcd.io/etcd/raft/v3@v3.5.11/raft.go:1508 +0x45
go.etcd.io/etcd/raft/v3.stepFollower(_, {0x8, 0x9602ee38cb0e22c4, 0x406775421168bf2e, 0x3, 0x0, 0x0, {0x0, 0x0, 0x0}, ...})
    go.etcd.io/etcd/raft/v3@v3.5.11/raft.go:1434 +0x3d8
go.etcd.io/etcd/raft/v3.(*raft).Step(_, {0x8, 0x9602ee38cb0e22c4, 0x406775421168bf2e, 0x3, 0x0, 0x0, {0x0, 0x0, 0x0}, ...})
    go.etcd.io/etcd/raft/v3@v3.5.11/raft.go:975 +0x1335
go.etcd.io/etcd/raft/v3.(*node).run(0xc0005a4c00)
    go.etcd.io/etcd/raft/v3@v3.5.11/node.go:356 +0x925
created by go.etcd.io/etcd/raft/v3.RestartNode
    go.etcd.io/etcd/raft/v3@v3.5.11/node.go:244 +0x24a
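On the question of whether a peer remembers the committed index: while it remains leader, the raft node keeps an in-memory Progress entry per follower and attaches min(Match, committed) to every heartbeat, where Match is the highest index that follower has ever acknowledged. A lightly abridged sketch of that path in go.etcd.io/etcd/raft/v3 (field names may differ slightly between versions):

// Leader side: the commit index sent in a heartbeat is capped by Match,
// the highest log index this follower has acknowledged so far.
func (r *raft) sendHeartbeat(to uint64, ctx []byte) {
	commit := min(r.prs.Progress[to].Match, r.raftLog.committed)
	r.send(pb.Message{
		To:      to,
		Type:    pb.MsgHeartbeat,
		Commit:  commit,
		Context: ctx,
	})
}

This would be consistent with the two outcomes above: at the first restart the leader's Match for node1 is still at or below the backup's last index, so the heartbeat's commit fits in the restored log; after node1 catches up, Match advances, and restoring the old folder a second time means the very first heartbeat carries commit(216) while the restored log ends at lastIndex(114), which is exactly the panic shown.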
zhangguanzhang commented 1 month ago

(quoting pzhp's comment above: the reproduction steps and the panic trace)

If the data is corrupted, a single node is done for; with multiple nodes it can be recovered as long as the other members are still healthy.

pzhp commented 1 month ago

(quoting the reproduction steps and panic trace above, and zhangguanzhang's reply)

Why does the data get corrupted?
The first time, I can start the etcd process with the backup data folder (the backup was made with "cp"). The second time, the etcd process fails with the error message above, using the same backup data folder.

Some background on why I am doing this: because of an IO bottleneck, I am trying to use tmpfs as the data folder, so the data needs to be backed up before the node restarts. But the official snapshot loses the membership information, so I have to back up the data folder with "cp".