hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Get stale data from the consul with a destroyed cluster after restart #10989

Open tifling85 opened 3 years ago

tifling85 commented 3 years ago

Overview of the Issue

Hello. I have a broken Consul cluster (one of its two servers has failed). Stale reads still succeed on the surviving server, but after I restart that server the data becomes inaccessible. Is it possible to get the cached data from a Consul server in a broken cluster after a restart? Thanks.

Reproduction Steps

  1. Break the cluster (one of the two servers is down):

    [centos@tifweb-1 ~]$ consul members
    Node                Address             Status  Type    Build   Protocol  DC   Segment
    tifweb-1.novalocal  10.179.37.210:8301  alive   server  1.10.1  2         dc1  <all>
    tifweb-2.novalocal  10.179.37.156:8301  failed  server  1.10.1  2         dc1  <all>
  2. Check that stale data is available:

    [centos@tifweb-1 ~]$ consul kv get -stale test/test
    test
  3. Restart the service:

    [centos@tifweb-1 ~]$ sudo systemctl restart consul
    [centos@tifweb-1 ~]$ consul members
    Node                Address             Status  Type    Build   Protocol  DC   Segment
    tifweb-1.novalocal  10.179.37.210:8301  alive   server  1.10.1  2         dc1  <all>
  4. Try to get the data again (this fails):

    [centos@tifweb-1 ~]$ consul kv get -stale test/test
    Error querying Consul agent: Unexpected response code: 500

Consul info for both Client and Server

Client/Server info

```
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 1
    services = 1
build:
    prerelease =
    revision = db839f18
    version = 1.10.1
consul:
    acl = disabled
    bootstrap = false
    known_datacenters = 1
    leader = false
    leader_addr =
    server = true
raft:
    applied_index = 49162
    commit_index = 0
    fsm_pending = 0
    last_contact = never
    last_log_index = 49757
    last_log_term = 17401
    last_snapshot_index = 49162
    last_snapshot_term = 17278
    latest_configuration = [{Suffrage:Voter ID:4e373bb7-602a-c849-f6bf-270e1c990ac4 Address:10.179.37.210:8300} {Suffrage:Voter ID:96a36e0a-3f87-639d-c9cd-c8e75e829484 Address:10.179.37.156:8300}]
    latest_configuration_index = 0
    num_peers = 1
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Candidate
    term = 17426
runtime:
    arch = amd64
    cpu_count = 3
    goroutines = 104
    max_procs = 3
    os = linux
    version = go1.16.6
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 21
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 2958
    members = 1
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 1494
    members = 1
    query_queue = 0
    query_time = 1
```
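
For anyone triaging similar reports, here is a minimal sketch of pulling the leader-related fields out of `consul info` text. It assumes the usual layout of a `section:` header followed by `key = value` lines; the `SAMPLE` string is an abbreviated excerpt built from the values reported above, and the helper names `parse_consul_info` and `has_leader` are hypothetical, not part of any Consul tooling.

```python
from typing import Dict

# Abbreviated excerpt; values copied from the `consul info` output above.
SAMPLE = """\
consul:
    leader = false
    leader_addr =
    server = true
raft:
    applied_index = 49162
    commit_index = 0
    last_log_index = 49757
    num_peers = 1
    state = Candidate
"""


def parse_consul_info(text: str) -> Dict[str, Dict[str, str]]:
    """Group `key = value` lines under the `section:` header they follow."""
    sections: Dict[str, Dict[str, str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" not in line and line.endswith(":"):
            # A section header such as "raft:".
            current = line[:-1]
            sections[current] = {}
        elif "=" in line and current is not None:
            # Split on the first "=" only, so values may contain "=" or ":".
            key, _, value = line.partition("=")
            sections[current][key.strip()] = value.strip()
    return sections


def has_leader(info: Dict[str, Dict[str, str]]) -> bool:
    """True only when the server reports a non-empty leader address."""
    return bool(info.get("consul", {}).get("leader_addr"))


info = parse_consul_info(SAMPLE)
print("leader known:", has_leader(info))
print("raft state:", info["raft"]["state"])
```

On the excerpt above this reports no known leader and a Raft state of `Candidate`, which matches the "No cluster leader" errors in the logs.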

Operating system and Environment details

cat /etc/os-release
NAME="CentOS Linux"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Linux 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-8"
CENTOS_MANTISBT_PROJECT_VERSION="8"

Log Fragments

sep 06 05:46:38 tifweb-1.novalocal consul[4821]: 2021-09-06T05:46:38.227Z [ERROR] agent.http: Request error: method=GET url=/v1/kv/test/test?stale= from=127.0.0.1:57412 error="No cluster leader"
sep 06 05:46:38 tifweb-1.novalocal consul[4821]: 2021-09-06T05:46:38.227Z [DEBUG] agent.http: Request finished: method=GET url=/v1/kv/test/test?stale= from=127.0.0.1:57412 latency=7.009072774s

jkirschner-hashicorp commented 3 years ago

Hi @tifling85,

Can you share your Consul server agent configuration (with any sensitive parts removed) and the command you used to start the Consul server agents? In particular, I'm interested in the -bootstrap-expect value which determines how many servers are needed before the initial leader election is triggered.

Can you also share the output of consul operator raft list-peers in steps 2 and 3?

And is there any log information indicating that a leader election takes place after restarting the service in step 3?

The stale read mode docs do mention that stale reads work while a cluster is unavailable / there is no leader. However, if the cluster was never bootstrapped to begin with / never had an initial leader election, we're not sure stale reads work in that case. The information requested above should help us understand whether a leader was ever elected.
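
For reference, a minimal sketch of what `consul kv get -stale` corresponds to at the HTTP level, matching the `url=/v1/kv/test/test?stale=` request in the log fragments above. The agent address `http://127.0.0.1:8500` is an assumption taken from the default HTTP listener, the helper names are hypothetical, and `read_stale` needs a running agent, so only the URL building and response decoding are exercised here.

```python
import base64
import json
from typing import Optional, Tuple
from urllib.parse import quote
from urllib.request import urlopen


def stale_kv_url(key: str, agent: str = "http://127.0.0.1:8500") -> str:
    """Build the KV read URL with the `stale` consistency mode requested."""
    return f"{agent}/v1/kv/{quote(key)}?stale"


def decode_kv_response(body: bytes) -> str:
    """Decode the first entry of a KV read response.

    The KV endpoint returns a JSON array whose `Value` field is base64-encoded.
    """
    entries = json.loads(body)
    return base64.b64decode(entries[0]["Value"]).decode("utf-8")


def read_stale(key: str) -> Tuple[str, Optional[str]]:
    """Perform the stale read against a live agent (not run in this sketch).

    Returns the decoded value and the X-Consul-KnownLeader response header,
    which indicates whether the answering server knows of a leader.
    """
    with urlopen(stale_kv_url(key), timeout=10) as resp:
        return decode_kv_response(resp.read()), resp.headers.get("X-Consul-KnownLeader")


# Hypothetical response body for illustration only.
sample_body = json.dumps(
    [{"Key": "test/test", "Value": base64.b64encode(b"test").decode("ascii")}]
).encode("utf-8")
print(stale_kv_url("test/test"))
print(decode_kv_response(sample_body))
```

Checking `X-Consul-KnownLeader` alongside the value makes it easier to tell a successful stale read on a leaderless server apart from a normal read.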

tifling85 commented 3 years ago

Okay, I'll try again. Config file for the first server (/etc/consul.d/init.json):

{
  "server": true,
  "ui": true,
  "advertise_addr": "10.179.37.248",
  "bind_addr": "10.179.37.248",
  "bootstrap_expect": 2,
  "retry_join": ["10.179.37.214"],
  "enable_local_script_checks": true,
  "log_level": "trace"
}

Second server:

{
  "server": true,
  "ui": true,
  "advertise_addr": "10.179.37.214",
  "bind_addr": "10.179.37.214",
  "bootstrap_expect": 2,
  "retry_join": ["10.179.37.248"],
  "enable_local_script_checks": true,
  "log_level": "trace"
}

1. First start; the cluster was initialized:

[centos@tifweb-1 ~]$ consul members
Node                Address             Status  Type    Build   Protocol  DC   Segment
tifweb-1.novalocal  10.179.37.248:8301  alive   server  1.10.1  2         dc1  <all>
tifweb-2.novalocal  10.179.37.214:8301  alive   server  1.10.1  2         dc1  <all>
[centos@tifweb-1 ~]$ consul operator raft list-peers
Node                ID                                    Address             State     Voter  RaftProtocol
tifweb-2.novalocal  5e103976-5b77-b620-5e55-123bbcbb5884  10.179.37.214:8300  leader    true   3
tifweb-1.novalocal  bdc2ffbf-44d1-5eb4-4f0c-517a8368d983  10.179.37.248:8300  follower  true   3

Add a test key:

[centos@tifweb-1 ~]$ consul kv put test_key test_value
Success! Data written to: test_key
[centos@tifweb-1 ~]$ consul kv get test_key
test_value

2. Turn off the server tifweb-2:

[centos@tifweb-1 ~]$ consul operator raft list-peers
Error getting peers: Failed to retrieve raft configuration: Unexpected response code: 500 (No cluster leader)
[centos@tifweb-1 ~]$ consul members
Node                Address             Status  Type    Build   Protocol  DC   Segment
tifweb-1.novalocal  10.179.37.248:8301  alive   server  1.10.1  2         dc1  <all>
tifweb-2.novalocal  10.179.37.214:8301  failed  server  1.10.1  2         dc1  <all>

Check that stale data is available:

[centos@tifweb-1 ~]$ consul kv get -stale test_key
test_value

3. Restart the service on the current server (tifweb-1):

[centos@tifweb-1 ~]$ sudo systemctl restart consul

The stale key is no longer available:

[centos@tifweb-1 ~]$ consul members
Node                Address             Status  Type    Build   Protocol  DC   Segment
tifweb-1.novalocal  10.179.37.248:8301  alive   server  1.10.1  2         dc1  <all>
[centos@tifweb-1 ~]$ consul kv get -stale test_key
Error querying Consul agent: Unexpected response code: 500
[centos@tifweb-1 ~]$ consul operator raft list-peers
Error getting peers: Failed to retrieve raft configuration: Unexpected response code: 500 (No cluster leader)

Consul restart logs:

sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.859+0700 [WARN] agent: bootstrap_expect = 2: A cluster with 2 servers will provide no failure tolerance. See https://www.consul.io/docs/internals/consensus.html#deployment-table
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.859+0700 [WARN] agent: bootstrap_expect > 0: expecting 2 servers
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.859+0700 [TRACE] agent.tlsutil: Update: version=1
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.859+0700 [TRACE] agent.tlsutil: OutgoingRPCWrapper: version=1
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.859+0700 [TRACE] agent: parsed scheme: "consul"
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.859+0700 [TRACE] agent: ccResolverWrapper: sending update to cc: { [] }
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.862+0700 [WARN] agent.auto_config: skipping file /etc/consul.d/.init.json.swp, extension must be .hcl or .json, or config format must be set
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.862+0700 [WARN] agent.auto_config: skipping file /etc/consul.d/consul.env, extension must be .hcl or .json, or config format must be set
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.862+0700 [WARN] agent.auto_config: The 'ui' field is deprecated. Use the 'ui_config.enabled' field instead.
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.862+0700 [WARN] agent.auto_config: Node name "tifweb-1.novalocal" will not be discoverable via DNS due to invalid characters. Valid characters include all alpha-numerics and dashes.
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.862+0700 [WARN] agent.auto_config: bootstrap_expect = 2: A cluster with 2 servers will provide no failure tolerance. See https://www.consul.io/docs/internals/consensus.html#deployment-table
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.862+0700 [WARN] agent.auto_config: bootstrap_expect > 0: expecting 2 servers
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.862+0700 [TRACE] agent.tlsutil: Update: version=2
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.863+0700 [TRACE] agent.tlsutil: OutgoingRPCWrapper: version=2
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.994+0700 [INFO] agent.server.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:bdc2ffbf-44d1-5eb4-4f0c-517a8368d983 Address:10.179.37.248:8300} {Suffrage:Voter ID:5e103976-5b77-b620-5e55-123bbcbb5884 Address:10.179.37.214:8300}]"
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.994+0700 [INFO] agent.server.raft: entering follower state: follower="Node at 10.179.37.248:8300 [Follower]" leader=
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.994+0700 [INFO] agent.server.serf.wan: serf: EventMemberJoin: tifweb-1.novalocal.dc1 10.179.37.248
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [WARN] agent.server.serf.wan: serf: Failed to re-join any previously known node
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [INFO] agent.server.serf.lan: serf: EventMemberJoin: tifweb-1.novalocal 10.179.37.248
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [INFO] agent.router: Initializing LAN area manager
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [WARN] agent.server.serf.lan: serf: Failed to re-join any previously known node
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [TRACE] agent: ccResolverWrapper: sending update to cc: {[{10.179.37.248:8300 0 tifweb-1.novalocal }] }
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [TRACE] agent: ccResolverWrapper: sending update to cc: {[{10.179.37.248:8300 0 tifweb-1.novalocal }] }
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [TRACE] agent: addrConn: tryUpdateAddrs curAddr: { 0 }, addrs: [{10.179.37.248:8300 0 tifweb-1.novalocal }]
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [TRACE] agent: ccResolverWrapper: sending update to cc: {[{10.179.37.248:8300 0 tifweb-1.novalocal.dc1 }] }
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [TRACE] agent: addrConn: tryUpdateAddrs curAddr: { 0 }, addrs: [{10.179.37.248:8300 0 tifweb-1.novalocal.dc1 }]
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [TRACE] agent: ccResolverWrapper: sending update to cc: {[{10.179.37.248:8300 0 tifweb-1.novalocal.dc1 }] }
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [TRACE] agent: addrConn: tryUpdateAddrs curAddr: { 0 }, addrs: [{10.179.37.248:8300 0 tifweb-1.novalocal.dc1 }]
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [INFO] agent.server: Adding LAN server: server="tifweb-1.novalocal (Addr: tcp/10.179.37.248:8300) (DC: dc1)"
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [INFO] agent.server: Raft data found, disabling bootstrap mode
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [INFO] agent: Started DNS server: address=127.0.0.1:8600 network=udp
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [INFO] agent: Started DNS server: address=10.179.37.248:8600 network=tcp
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [INFO] agent: Started DNS server: address=127.0.0.1:8600 network=tcp
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [TRACE] agent: ccResolverWrapper: sending update to cc: {[{10.179.37.248:8300 0 tifweb-1.novalocal.dc1 }] }
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [TRACE] agent: addrConn: tryUpdateAddrs curAddr: { 0 }, addrs: [{10.179.37.248:8300 0 tifweb-1.novalocal.dc1 }]
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [INFO] agent: Started DNS server: address=10.179.37.248:8600 network=udp
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [WARN] agent: grpc: addrConn.createTransport failed to connect to {10.179.37.248:8300 0 tifweb-1.novalocal.dc1 }. Err :connection error: desc = "transport: Error while dialing dial tcp 10.179.37.248:8300: operation was canceled". Reconnecting...
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [TRACE] agent: ccResolverWrapper: sending update to cc: {[{10.179.37.248:8300 0 tifweb-1.novalocal.dc1 }] }
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [TRACE] agent: addrConn: tryUpdateAddrs curAddr: { 0 }, addrs: [{10.179.37.248:8300 0 tifweb-1.novalocal.dc1 }]
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [INFO] agent.server: Handled event for server in area: event=member-join server=tifweb-1.novalocal.dc1 area=wan
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.996+0700 [INFO] agent: Starting server: address=10.179.37.248:8500 network=tcp protocol=http
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.996+0700 [INFO] agent: Starting server: address=127.0.0.1:8500 network=tcp protocol=http
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.996+0700 [WARN] agent: DEPRECATED Backwards compatibility with pre-1.9 metrics enabled. These metrics will be removed in a future version of Consul. Set telemetry { disable_compat_1.9 = true } to disable them.
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.996+0700 [INFO] agent: Retry join is supported for the following discovery methods: cluster=LAN discovery_methods="aliyun aws azure digitalocean gce k8s linode mdns os packet scaleway softlayer tencentcloud triton vsphere"
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.996+0700 [INFO] agent: Joining cluster...: cluster=LAN
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.996+0700 [INFO] agent: (LAN) joining: lan_addresses=[10.179.37.214]
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.996+0700 [INFO] agent: started state syncer
sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.996+0700 [INFO] agent: Consul agent running!
sep 17 17:36:37 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:37.162+0700 [DEBUG] agent.http: Request finished: method=GET url=/v1/agent/members?segment=_all from=127.0.0.1:48472 latency=87.135µs
sep 17 17:36:39 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:39.418+0700 [ERROR] agent.anti_entropy: failed to sync remote state: error="No cluster leader"
sep 17 17:36:40 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:40.994+0700 [WARN] agent.server.raft: heartbeat timeout reached, starting election: last-leader=
sep 17 17:36:40 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:40.994+0700 [INFO] agent.server.raft: entering candidate state: node="Node at 10.179.37.248:8300 [Candidate]" term=144
sep 17 17:36:41 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:41.340+0700 [DEBUG] agent.server.raft: votes: needed=2
sep 17 17:36:41 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:41.340+0700 [DEBUG] agent.server.raft: vote granted: from=bdc2ffbf-44d1-5eb4-4f0c-517a8368d983 term=144 tally=1
sep 17 17:36:41 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:41.340+0700 [WARN] agent.server.raft: unable to get address for server, using fallback address: id=5e103976-5b77-b620-5e55-123bbcbb5884 fallback=10.179.37.214:8300 error="Could not find address for server id 5e103976-5b77-b620-5e55-123bbcbb5884"
sep 17 17:36:42 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:42.000+0700 [DEBUG] agent.server.memberlist.lan: memberlist: Failed to join 10.179.37.214: dial tcp 10.179.37.214:8301: i/o timeout
sep 17 17:36:42 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:42.000+0700 [WARN] agent: (LAN) couldn't join: number_of_nodes=0 error="1 error occurred:
sep 17 17:36:42 tifweb-1.novalocal consul[70258]: * Failed to join 10.179.37.214: dial tcp 10.179.37.214:8301: i/o timeout
sep 17 17:36:42 tifweb-1.novalocal consul[70258]: "
sep 17 17:36:42 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:42.000+0700 [WARN] agent: Join cluster failed, will retry: cluster=LAN retry_interval=30s error=
sep 17 17:36:50 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:50.560+0700 [WARN] agent.server.raft: Election timeout reached, restarting election
sep 17 17:36:50 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:50.560+0700 [INFO] agent.server.raft: entering candidate state: node="Node at 10.179.37.248:8300 [Candidate]" term=145
sep 17 17:36:50 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:50.865+0700 [DEBUG] agent.server.raft: votes: needed=2
sep 17 17:36:50 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:50.865+0700 [DEBUG] agent.server.raft: vote granted: from=bdc2ffbf-44d1-5eb4-4f0c-517a8368d983 term=145 tally=1
sep 17 17:36:50 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:50.865+0700 [WARN] agent.server.raft: unable to get address for server, using fallback address: id=5e103976-5b77-b620-5e55-123bbcbb5884 fallback=10.179.37.214:8300 error="Could not find address for server id 5e103976-5b77-b620-5e55-123bbcbb5884"
sep 17 17:36:51 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:51.341+0700 [ERROR] agent.server.raft: failed to make requestVote RPC: target="{Voter 5e103976-5b77-b620-5e55-123bbcbb5884 10.179.37.214:8300}" error="dial tcp 10.179.37.248:0->10.179.37.214:8300: i/o timeout"
sep 17 17:37:00 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:00.214+0700 [ERROR] agent: Coordinate update error: error="No cluster leader"
sep 17 17:37:00 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:00.335+0700 [WARN] agent.server.raft: Election timeout reached, restarting election
sep 17 17:37:00 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:00.335+0700 [INFO] agent.server.raft: entering candidate state: node="Node at 10.179.37.248:8300 [Candidate]" term=146
sep 17 17:37:00 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:00.458+0700 [INFO] agent: Newer Consul version available: new_version=1.10.2 current_version=1.10.1
sep 17 17:37:00 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:00.469+0700 [DEBUG] agent.server.raft: votes: needed=2
sep 17 17:37:00 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:00.469+0700 [DEBUG] agent.server.raft: vote granted: from=bdc2ffbf-44d1-5eb4-4f0c-517a8368d983 term=146 tally=1
sep 17 17:37:00 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:00.469+0700 [WARN] agent.server.raft: unable to get address for server, using fallback address: id=5e103976-5b77-b620-5e55-123bbcbb5884 fallback=10.179.37.214:8300 error="Could not find address for server id 5e103976-5b77-b620-5e55-123bbcbb5884"
sep 17 17:37:00 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:00.866+0700 [ERROR] agent.server.raft: failed to make requestVote RPC: target="{Voter 5e103976-5b77-b620-5e55-123bbcbb5884 10.179.37.214:8300}" error="dial tcp 10.179.37.248:0->10.179.37.214:8300: i/o timeout"
sep 17 17:37:02 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:02.042+0700 [ERROR] agent.anti_entropy: failed to sync remote state: error="No cluster leader"
sep 17 17:37:02 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:02.114+0700 [ERROR] agent.http: Request error: method=GET url=/v1/operator/raft/configuration from=127.0.0.1:48474 error="No cluster leader"
sep 17 17:37:02 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:02.114+0700 [DEBUG] agent.http: Request finished: method=GET url=/v1/operator/raft/configuration from=127.0.0.1:48474 latency=7.147320577s
sep 17 17:37:07 tifweb-1.novalocal consul[70258]: 2021-09-17T17:37:07.385+0700 [WARN] agent.server.raft: Election timeout reached, restarting election
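
The repeated `votes: needed=2` and `tally=1` lines in the logs above are consistent with Raft's majority rule: an election needs a strict majority of the configured voters. A minimal sketch of that arithmetic follows; the function names are hypothetical and this is not Consul's own code.

```python
def quorum(voters: int) -> int:
    """Votes needed to elect a leader: a strict majority of the voters."""
    return voters // 2 + 1


def failure_tolerance(voters: int) -> int:
    """How many voters can be lost while a majority is still reachable."""
    return voters - quorum(voters)


def can_elect(voters: int, reachable: int) -> bool:
    """True when the reachable voters (including self) form a majority."""
    return reachable >= quorum(voters)


# The cluster in this report: 2 configured voters, only tifweb-1 reachable.
print("votes needed:", quorum(2))
print("failure tolerance:", failure_tolerance(2))
print("can elect with 1 of 2:", can_elect(2, 1))
```

With two voters the quorum is also two, so the surviving server can vote for itself but can never win the election alone. This matches the startup warning that a cluster with 2 servers provides no failure tolerance.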

I see this line in the log:

sep 17 17:36:31 tifweb-1.novalocal consul[70258]: 2021-09-17T17:36:31.995+0700 [INFO] agent.server: Raft data found, disabling bootstrap mode

I guess Consul will not bootstrap the cluster again.

Thanks!