openark / orchestrator

MySQL replication topology management and HA
Apache License 2.0

Leader cannot be elected at first startup: "Election timeout reached, restarting election" #1432

Open njuptlzf opened 2 years ago

njuptlzf commented 2 years ago

This is my first time using the operator on k3s to build a 3-node high-availability MySQL cluster, but the operator's raft group has never been able to elect a leader. Logs from the three orchestrator nodes follow.
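For context on the `Votes needed: 2` lines in the logs below: a raft candidate needs a majority of the group, so in a 3-node cluster a node that can only count its own vote stays stuck at a tally of 1. The sketch below shows that arithmetic. The function names are hypothetical, and it makes the simplifying assumption that every reachable peer grants its vote.

```python
def votes_needed(cluster_size: int) -> int:
    """Majority quorum: strictly more than half of the voting members."""
    return cluster_size // 2 + 1


def can_elect(cluster_size: int, reachable_peers: int) -> bool:
    """A candidate votes for itself, then needs reachable peers to make up a majority.

    Simplifying assumption: every reachable peer grants its vote.
    """
    return 1 + reachable_peers >= votes_needed(cluster_size)


if __name__ == "__main__":
    print(votes_needed(3))   # 2
    print(can_elect(3, 0))   # False: only the self-vote, tally 1 of 2
    print(can_elect(3, 1))   # True: one reachable peer is enough
```

Under that assumption, a single working connection between any two of the three nodes would be enough for an election to succeed.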

orchestrator0

2022/03/13 11:02:07 [INFO] raft: Node at 10.227.26.201:10008 [Follower] entering Follower state (Leader: "")
2022/03/13 11:02:09 [WARN] raft: Heartbeat timeout from "" reached, starting election
2022/03/13 11:02:09 [INFO] raft: Node at 10.227.26.201:10008 [Candidate] entering Candidate state
2022/03/13 11:02:09 [ERR] raft: Failed to make RequestVote RPC to 10.227.59.84:10008: dial tcp 10.227.59.84:10008: connect: connection refused
2022/03/13 11:02:09 [DEBUG] raft: Votes needed: 2
2022/03/13 11:02:09 [DEBUG] raft: Vote granted from 10.227.26.201:10008. Tally: 1
2022/03/13 11:02:10 [ERR] raft: Failed to make RequestVote RPC to 10.227.11.20:10008: dial tcp 10.227.11.20:10008: connect: connection refused
2022/03/13 11:02:10 [WARN] raft: Election timeout reached, restarting election

... ...

2022/03/13 11:06:16 [INFO] raft: Node at 10.227.26.201:10008 [Candidate] entering Candidate state
2022/03/13 11:06:16 [DEBUG] raft: Votes needed: 2
2022/03/13 11:06:16 [DEBUG] raft: Vote granted from 10.227.26.201:10008. Tally: 1
2022/03/13 11:06:17 [ERR] raft: Failed to make RequestVote RPC to 10.227.59.84:10008: dial tcp 10.227.59.84:10008: i/o timeout
2022/03/13 11:06:17 [ERR] raft: Failed to make RequestVote RPC to 10.227.11.20:10008: dial tcp 10.227.11.20:10008: i/o timeout
2022/03/13 11:06:17 [ERR] raft: Failed to make RequestVote RPC to 10.227.59.84:10008: dial tcp 10.227.59.84:10008: connect: connection refused
[martini] Started GET /api/cluster/mysql-cluster.lzf for 127.0.0.1:56170
2022-03-13 11:06:17 ERROR Unable to determine cluster name. clusterHint=mysql-cluster.lzf
[martini] Completed 500 Internal Server Error in 10.588676ms
[martini] Started GET /api/discover/mysql-cluster-mysql-0.mysql.lzf/3306 for 127.0.0.1:56170
[martini] Completed 500 Internal Server Error in 1.002687ms
[martini] Started GET /api/audit-recovery/mysql-cluster.lzf for 127.0.0.1:56170
[martini] Completed 200 OK in 2.371732ms
2022/03/13 11:06:17 [WARN] raft: Election timeout reached, restarting election

orchestrator1

2022/03/13 11:02:07 [INFO] raft: Node at 10.227.59.84:10008 [Follower] entering Follower state (Leader: "")
2022/03/13 11:02:09 [WARN] raft: Heartbeat timeout from "" reached, starting election
2022/03/13 11:02:09 [INFO] raft: Node at 10.227.59.84:10008 [Candidate] entering Candidate state
2022/03/13 11:02:09 [DEBUG] raft: Votes needed: 2
2022/03/13 11:02:09 [DEBUG] raft: Vote granted from 10.227.59.84:10008. Tally: 1
2022/03/13 11:02:10 [ERR] raft: Failed to make RequestVote RPC to 10.227.26.201:10008: dial tcp 10.227.26.201:10008: connect: connection refused
2022/03/13 11:02:10 [ERR] raft: Failed to make RequestVote RPC to 10.227.11.20:10008: dial tcp 10.227.11.20:10008: connect: connection refused
2022/03/13 11:02:10 [WARN] raft: Election timeout reached, restarting election

... ...

2022/03/13 11:06:08 [INFO] raft: Node at 10.227.59.84:10008 [Candidate] entering Candidate state
2022/03/13 11:06:08 [DEBUG] raft: Votes needed: 2
2022/03/13 11:06:08 [DEBUG] raft: Vote granted from 10.227.59.84:10008. Tally: 1
[martini] Started GET /api/raft-health for 192.168.56.124:58012
[martini] Completed 500 Internal Server Error in 845.658µs
[martini] Started GET /api/lb-check for 192.168.56.124:58014
[martini] Completed 200 OK in 641.39µs
2022/03/13 11:06:09 [ERR] raft: Failed to make RequestVote RPC to 10.227.26.201:10008: dial tcp 10.227.26.201:10008: connect: connection refused
2022/03/13 11:06:10 [WARN] raft: Election timeout reached, restarting election

orchestrator2

2022/03/13 11:02:06 [INFO] raft: Node at 10.227.11.20:10008 [Follower] entering Follower state (Leader: "")
2022/03/13 11:02:07 [WARN] raft: Heartbeat timeout from "" reached, starting election
2022/03/13 11:02:07 [INFO] raft: Node at 10.227.11.20:10008 [Candidate] entering Candidate state
2022/03/13 11:02:08 [DEBUG] raft: Votes needed: 2
2022/03/13 11:02:08 [DEBUG] raft: Vote granted from 10.227.11.20:10008. Tally: 1
2022/03/13 11:02:09 [ERR] raft: Failed to make RequestVote RPC to 10.227.59.84:10008: dial tcp 10.227.59.84:10008: connect: connection refused
2022/03/13 11:02:09 [ERR] raft: Failed to make RequestVote RPC to 10.227.26.201:10008: dial tcp 10.227.26.201:10008: connect: connection refused
2022/03/13 11:02:09 [WARN] raft: Election timeout reached, restarting election

... ...

2022/03/13 11:06:29 [INFO] raft: Node at 10.227.11.20:10008 [Candidate] entering Candidate state
2022/03/13 11:06:30 [DEBUG] raft: Votes needed: 2
2022/03/13 11:06:30 [DEBUG] raft: Vote granted from 10.227.11.20:10008. Tally: 1
2022/03/13 11:06:30 [ERR] raft: Failed to make RequestVote RPC to 10.227.26.201:10008: dial tcp 10.227.26.201:10008: i/o timeout
2022/03/13 11:06:30 [ERR] raft: Failed to make RequestVote RPC to 10.227.59.84:10008: dial tcp 10.227.59.84:10008: i/o timeout
[martini] Started GET /api/raft-health for 192.168.56.125:42922
[martini] Started GET /api/lb-check for 192.168.56.125:42920
[martini] Completed 500 Internal Server Error in 891.477µs
[martini] Completed 200 OK in 1.634152ms
2022/03/13 11:06:30 [ERR] raft: Failed to make RequestVote RPC to 10.227.26.201:10008: dial tcp 10.227.26.201:10008: connect: connection refused
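
Every RequestVote failure above is a TCP-level error (`connection refused` or `i/o timeout`), so it may be worth checking whether each pod can open a plain TCP connection to its peers on the raft port. The following is a minimal sketch, assuming Python is available in a debug container inside the cluster network. The `probe` helper is hypothetical, and the peer addresses are copied from the logs above.

```python
import socket

# Raft peer addresses as they appear in the logs above.
PEERS = [
    ("10.227.26.201", 10008),
    ("10.227.59.84", 10008),
    ("10.227.11.20", 10008),
]


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Attempt a TCP connect and return a short status string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "connection refused"
    except socket.timeout:
        # No answer within the timeout (for example, packets dropped).
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    for host, port in PEERS:
        print(f"{host}:{port} -> {probe(host, port)}")
```

A `connection refused` result would suggest that nothing is listening on that address and port, while a `timeout` would point more toward routing or network policy. Neither is a definitive diagnosis.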

jicki commented 2 years ago

Me too.