tikv / pd

Placement driver for TiKV
Apache License 2.0

[PD:server:ErrCancelStartEtcd]etcd start canceled #6169

Open zeminzhou opened 1 year ago

zeminzhou commented 1 year ago

Bug Report

What did you do?

  1. Start 3 PD instances and 1 TiKV instance with tiup.
  2. After the cluster starts, send Ctrl-C to tiup.
  3. Restart the cluster.
  4. PD reports [PD:server:ErrCancelStartEtcd]etcd start canceled.

What did you expect to see?

The cluster starts.

What did you see instead?

The cluster can't start.

What version of PD are you using (pd-server -V)?

nightly

[2023/03/15 16:58:29.022 +08:00] [WARN] [server.go:2098] ["failed to publish local member to cluster through raft"] [local-member-id=3cc48049f4cb1420] [local-member-attributes="{Name:pd-0 ClientURLs:[http://172.16.5.32:8886]}"] [request-path=/0/members/3cc48049f4cb1420/attributes] [publish-timeout=11s] [error="etcdserver: request timed out"]
[2023/03/15 16:58:30.004 +08:00] [INFO] [raft.go:929] ["3cc48049f4cb1420 is starting a new election at term 2"]
[2023/03/15 16:58:30.004 +08:00] [INFO] [raft.go:735] ["3cc48049f4cb1420 became pre-candidate at term 2"]
[2023/03/15 16:58:30.004 +08:00] [INFO] [raft.go:830] ["3cc48049f4cb1420 received MsgPreVoteResp from 3cc48049f4cb1420 at term 2"]
[2023/03/15 16:58:30.004 +08:00] [INFO] [raft.go:817] ["3cc48049f4cb1420 [logterm: 2, index: 71] sent MsgPreVote request to bfc89e1e0ddd90cb at term 2"]
[2023/03/15 16:58:30.004 +08:00] [INFO] [raft.go:817] ["3cc48049f4cb1420 [logterm: 2, index: 71] sent MsgPreVote request to eea148a6a9b43fea at term 2"]
[2023/03/15 16:58:33.045 +08:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_RAFT_MESSAGE] [remote-peer-id=bfc89e1e0ddd90cb] [rtt=0s] [error="dial tcp 172.16.5.32:35640: connect: connection refused"]
[2023/03/15 16:58:33.045 +08:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_SNAPSHOT] [remote-peer-id=bfc89e1e0ddd90cb] [rtt=0s] [error="dial tcp 172.16.5.32:35640: connect: connection refused"]
[2023/03/15 16:58:33.045 +08:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_SNAPSHOT] [remote-peer-id=eea148a6a9b43fea] [rtt=0s] [error="dial tcp 172.16.5.32:42675: connect: connection refused"]
[2023/03/15 16:58:33.045 +08:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_RAFT_MESSAGE] [remote-peer-id=eea148a6a9b43fea] [rtt=0s] [error="dial tcp 172.16.5.32:42675: connect: connection refused"]
[2023/03/15 16:58:33.504 +08:00] [INFO] [raft.go:929] ["3cc48049f4cb1420 is starting a new election at term 2"]
[2023/03/15 16:58:33.504 +08:00] [INFO] [raft.go:735] ["3cc48049f4cb1420 became pre-candidate at term 2"]
[2023/03/15 16:58:33.504 +08:00] [INFO] [raft.go:830] ["3cc48049f4cb1420 received MsgPreVoteResp from 3cc48049f4cb1420 at term 2"]
[2023/03/15 16:58:33.504 +08:00] [INFO] [raft.go:817] ["3cc48049f4cb1420 [logterm: 2, index: 71] sent MsgPreVote request to bfc89e1e0ddd90cb at term 2"]
[2023/03/15 16:58:33.504 +08:00] [INFO] [raft.go:817] ["3cc48049f4cb1420 [logterm: 2, index: 71] sent MsgPreVote request to eea148a6a9b43fea at term 2"]
[2023/03/15 16:58:37.004 +08:00] [INFO] [raft.go:929] ["3cc48049f4cb1420 is starting a new election at term 2"]
[2023/03/15 16:58:37.004 +08:00] [INFO] [raft.go:735] ["3cc48049f4cb1420 became pre-candidate at term 2"]
[2023/03/15 16:58:37.004 +08:00] [INFO] [raft.go:830] ["3cc48049f4cb1420 received MsgPreVoteResp from 3cc48049f4cb1420 at term 2"]
[2023/03/15 16:58:37.004 +08:00] [INFO] [raft.go:817] ["3cc48049f4cb1420 [logterm: 2, index: 71] sent MsgPreVote request to bfc89e1e0ddd90cb at term 2"]
[2023/03/15 16:58:37.004 +08:00] [INFO] [raft.go:817] ["3cc48049f4cb1420 [logterm: 2, index: 71] sent MsgPreVote request to eea148a6a9b43fea at term 2"]
[2023/03/15 16:58:38.046 +08:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_RAFT_MESSAGE] [remote-peer-id=eea148a6a9b43fea] [rtt=0s] [error="dial tcp 172.16.5.32:42675: connect: connection refused"]
[2023/03/15 16:58:38.046 +08:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_RAFT_MESSAGE] [remote-peer-id=bfc89e1e0ddd90cb] [rtt=0s] [error="dial tcp 172.16.5.32:35640: connect: connection refused"]
[2023/03/15 16:58:38.046 +08:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_SNAPSHOT] [remote-peer-id=eea148a6a9b43fea] [rtt=0s] [error="dial tcp 172.16.5.32:42675: connect: connection refused"]
[2023/03/15 16:58:38.046 +08:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_SNAPSHOT] [remote-peer-id=bfc89e1e0ddd90cb] [rtt=0s] [error="dial tcp 172.16.5.32:35640: connect: connection refused"]
[2023/03/15 16:58:40.023 +08:00] [WARN] [server.go:2098] ["failed to publish local member to cluster through raft"] [local-member-id=3cc48049f4cb1420] [local-member-attributes="{Name:pd-0 ClientURLs:[http://172.16.5.32:8886]}"] [request-path=/0/members/3cc48049f4cb1420/attributes] [publish-timeout=11s] [error="etcdserver: request timed out"]
[2023/03/15 16:58:40.503 +08:00] [INFO] [raft.go:929] ["3cc48049f4cb1420 is starting a new election at term 2"]
[2023/03/15 16:58:40.504 +08:00] [INFO] [raft.go:735] ["3cc48049f4cb1420 became pre-candidate at term 2"]
[2023/03/15 16:58:40.504 +08:00] [INFO] [raft.go:830] ["3cc48049f4cb1420 received MsgPreVoteResp from 3cc48049f4cb1420 at term 2"]
[2023/03/15 16:58:40.504 +08:00] [INFO] [raft.go:817] ["3cc48049f4cb1420 [logterm: 2, index: 71] sent MsgPreVote request to bfc89e1e0ddd90cb at term 2"]
[2023/03/15 16:58:40.504 +08:00] [INFO] [raft.go:817] ["3cc48049f4cb1420 [logterm: 2, index: 71] sent MsgPreVote request to eea148a6a9b43fea at term 2"]
[2023/03/15 16:58:43.001 +08:00] [FATAL] [main.go:230] ["run server failed"] [error="[PD:server:ErrCancelStartEtcd]etcd start canceled"] [stack="main.start\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:230\nmain.createServerWrapper\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:145\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:846\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:56\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"]
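
The log repeats one pattern: pd-0 (3cc48049f4cb1420) sends pre-vote requests to two peers, and every probe of those peers ends in "connection refused". To make that easier to see in a longer log, here is a minimal sketch that tallies refused connections per peer. It assumes the bracketed log format shown above; the function name `refused_peers` and the regular expressions are illustrative and are not part of PD.

```python
import re
from collections import Counter

# Each PD log entry starts with a bracketed timestamp such as
# [2023/03/15 16:58:33.045 +08:00]. Splitting on a lookahead for it
# separates entries even when several were pasted onto one line.
ENTRY_START = re.compile(
    r'(?=\[\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{2}:\d{2}\])'
)
PEER = re.compile(r'\[remote-peer-id=([0-9a-f]+)\]')
# \s+ tolerates a line break between "connect:" and "connection refused".
REFUSED = re.compile(r'dial tcp (\S+?): connect:\s+connection refused')


def refused_peers(log_text):
    """Count "connection refused" probe failures per (peer ID, address)."""
    counts = Counter()
    for entry in ENTRY_START.split(log_text):
        if "prober detected unhealthy status" not in entry:
            continue
        peer = PEER.search(entry)
        refused = REFUSED.search(entry)
        if peer and refused:
            counts[(peer.group(1), refused.group(1))] += 1
    return counts


# Two entries copied from the log above, deliberately left on one line.
sample = (
    '[2023/03/15 16:58:33.045 +08:00] [WARN] [probing_status.go:70] '
    '["prober detected unhealthy status"] '
    '[round-tripper-name=ROUND_TRIPPER_RAFT_MESSAGE] '
    '[remote-peer-id=bfc89e1e0ddd90cb] [rtt=0s] '
    '[error="dial tcp 172.16.5.32:35640: connect: connection refused"] '
    '[2023/03/15 16:58:33.045 +08:00] [WARN] [probing_status.go:70] '
    '["prober detected unhealthy status"] '
    '[round-tripper-name=ROUND_TRIPPER_SNAPSHOT] '
    '[remote-peer-id=eea148a6a9b43fea] [rtt=0s] '
    '[error="dial tcp 172.16.5.32:42675: connect: \nconnection refused"]'
)

for (peer, addr), n in sorted(refused_peers(sample).items()):
    print(peer, addr, n)
```

Run against the full log, a tally like this should show whether both remote peers are unreachable for the whole startup window or only part of it.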

nolouch commented 1 year ago

@zeminzhou Does it continue to fail to start? It seems there are other members that have not started up.


[2023/03/15 16:58:38.046 +08:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_SNAPSHOT] [remote-peer-id=bfc89e1e0ddd90cb] [rtt=0s] [error="dial tcp 172.16.5.32:35640: connect: connection refused"]
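
"Connection refused" means nothing was listening on the peer address when pd-0 dialed it. In a 3-member cluster a Raft member needs 2 votes (its own plus one peer) to win an election, which is consistent with the log showing pd-0 receiving a pre-vote response only from itself. The following is a minimal sketch, assuming the two peer addresses from the log, that checks whether each peer port accepts TCP connections and whether a majority is reachable. The helper names `is_listening` and `has_quorum` are hypothetical and are not part of PD or etcd.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


def has_quorum(reachable_peers, cluster_size):
    """A Raft member needs a majority, counting itself, to win an election."""
    return reachable_peers + 1 >= cluster_size // 2 + 1


if __name__ == "__main__":
    # Peer addresses taken from the "dial tcp" errors in the log.
    peers = [("172.16.5.32", 35640), ("172.16.5.32", 42675)]
    up = sum(is_listening(host, port) for host, port in peers)
    quorum = has_quorum(up, len(peers) + 1)
    print(f"{up}/{len(peers)} peers reachable; quorum possible: {quorum}")
```

If neither peer port is listening after the restart, it may be worth comparing the peer URLs the other two PD processes actually bind to with the ones pd-0 is dialing.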

zeminzhou commented 1 year ago

> @zeminzhou Does it continue to fail to start? It seems there are other members that have not started up.
>
> [2023/03/15 16:58:38.046 +08:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_SNAPSHOT] [remote-peer-id=bfc89e1e0ddd90cb] [rtt=0s] [error="dial tcp 172.16.5.32:35640: connect: connection refused"]

Yes, it continues to fail to start. Can you reproduce this situation?

rleungx commented 1 year ago

Is there any progress for this issue?