Closed srimandarbha closed 3 years ago
Hi, does the UI load? Can you show a screenshot of what you're seeing?
Hello @lkysow
I am seeing the below errors in the pod logs
2020/01/18 18:28:16 [WARN] agent/proxy: running as root, will not start managed proxies
2020/01/18 18:28:16 [WARN] memberlist: Failed to resolve hashicorp-consul-server-0.hashicorp-consul-server.default.svc: lookup hashicorp-consul-server-0.hashicorp-consul-server.default.svc on 169.254.25.10:53: no such host
2020/01/18 18:28:16 [WARN] memberlist: Failed to resolve hashicorp-consul-server-1.hashicorp-consul-server.default.svc: lookup hashicorp-consul-server-1.hashicorp-consul-server.default.svc on 169.254.25.10:53: no such host
2020/01/18 18:28:16 [WARN] memberlist: Failed to resolve hashicorp-consul-server-2.hashicorp-consul-server.default.svc: lookup hashicorp-consul-server-2.hashicorp-consul-server.default.svc on 169.254.25.10:53: no such host
2020/01/18 18:28:16 [WARN] agent: (LAN) couldn't join: 0 Err: 3 errors occurred:
* Failed to resolve hashicorp-consul-server-0.hashicorp-consul-server.default.svc: lookup hashicorp-consul-server-0.hashicorp-consul-server.default.svc on 169.254.25.10:53: no such host
* Failed to resolve hashicorp-consul-server-1.hashicorp-consul-server.default.svc: lookup hashicorp-consul-server-1.hashicorp-consul-server.default.svc on 169.254.25.10:53: no such host
* Failed to resolve hashicorp-consul-server-2.hashicorp-consul-server.default.svc: lookup hashicorp-consul-server-2.hashicorp-consul-server.default.svc on 169.254.25.10:53: no such host
2020/01/18 18:28:16 [WARN] agent: Join LAN failed: <nil>, retrying in 30s
2020/01/18 18:28:23 [WARN] raft: Heartbeat timeout from "" reached, starting election
2020/01/18 18:28:23 [WARN] raft: Unable to get address for server id 3142e90d-69d6-5d2c-0de8-067e69dfd262, using fallback address 10.233.96.55:8300: Could not find address for server id 3142e90d-69d6-5d2c-0de8-067e69dfd262
2020/01/18 18:28:23 [WARN] raft: Unable to get address for server id 582991d7-8a11-cd53-b6a6-2096e2508673, using fallback address 10.233.90.56:8300: Could not find address for server id 582991d7-8a11-cd53-b6a6-2096e2508673
2020/01/18 18:28:23 [ERR] agent: failed to sync remote state: No cluster leader
2020/01/18 18:28:30 [WARN] raft: Election timeout reached, restarting election
2020/01/18 18:28:30 [WARN] raft: Unable to get address for server id 3142e90d-69d6-5d2c-0de8-067e69dfd262, using fallback address 10.233.96.55:8300: Could not find address for server id 3142e90d-69d6-5d2c-0de8-067e69dfd262
2020/01/18 18:28:30 [WARN] raft: Unable to get address for server id 582991d7-8a11-cd53-b6a6-2096e2508673, using fallback address 10.233.90.56:8300: Could not find address for server id 582991d7-8a11-cd53-b6a6-2096e2508673
2020/01/18 18:28:33 [ERROR] raft: Failed to make RequestVote RPC to {Voter 3142e90d-69d6-5d2c-0de8-067e69dfd262 10.233.96.55:8300}: dial tcp <nil>->10.233.96.55:8300: i/o timeout
2020/01/18 18:28:33 [ERROR] raft: Failed to make RequestVote RPC to {Voter 582991d7-8a11-cd53-b6a6-2096e2508673 10.233.90.56:8300}: dial tcp <nil>->10.233.90.56:8300: i/o timeout
2020/01/18 18:28:35 [WARN] raft: Election timeout reached, restarting election
2020/01/18 18:28:35 [WARN] raft: Unable to get address for server id 3142e90d-69d6-5d2c-0de8-067e69dfd262, using fallback address 10.233.96.55:8300: Could not find address for server id 3142e90d-69d6-5d2c-0de8-067e69dfd262
2020/01/18 18:28:36 [WARN] raft: Unable to get address for server id 582991d7-8a11-cd53-b6a6-2096e2508673, using fallback address 10.233.90.56:8300: Could not find address for server id 582991d7-8a11-cd53-b6a6-2096e2508673
2020/01/18 18:28:40 [ERROR] raft: Failed to make RequestVote RPC to {Voter 3142e90d-69d6-5d2c-0de8-067e69dfd262 10.233.96.55:8300}: dial tcp <nil>->10.233.96.55:8300: i/o timeout
2020/01/18 18:28:40 [ERROR] raft: Failed to make RequestVote RPC to {Voter 582991d7-8a11-cd53-b6a6-2096e2508673 10.233.90.56:8300}: dial tcp <nil>->10.233.90.56:8300: i/o timeout
2020/01/18 18:28:44 [WARN] raft: Election timeout reached, restarting election
2020/01/18 18:28:45 [ERROR] raft: Failed to make RequestVote RPC to {Voter 3142e90d-69d6-5d2c-0de8-067e69dfd262 10.233.96.55:8300}: dial tcp <nil>->10.233.96.55:8300: i/o timeout
2020/01/18 18:28:46 [ERROR] raft: Failed to make RequestVote RPC to {Voter 582991d7-8a11-cd53-b6a6-2096e2508673 10.233.90.56:8300}: dial tcp <nil>->10.233.90.56:8300: i/o timeout
root@node1:~# k get pods
NAME READY STATUS RESTARTS AGE
hashicorp-consul-connect-injector-webhook-deployment-764c8t4v4t 1/1 Running 0 5m45s
hashicorp-consul-d72qg 1/1 Running 0 5m45s
hashicorp-consul-hn4m5 1/1 Running 0 5m45s
hashicorp-consul-l6lbx 1/1 Running 0 5m45s
hashicorp-consul-server-0 1/1 Running 0 5m44s
hashicorp-consul-server-1 1/1 Running 0 5m43s
hashicorp-consul-server-2 1/1 Running 0 5m43s
hashicorp-consul-sync-catalog-78cd697b7-wcdjx 1/1 Running 0 5m45s
root@node1:~#
The Consul Helm values file I am using:
global:
  enabled: true
  image: "consul:1.5.3"
  datacenter: bndev
server:
  enabled: true
client:
  enabled: true
  grpc: true
  extraConfig: |
    {
      "retry_join": ["provider=k8s tag_key=Consul-Auto-Join tag_value=bndev"]
    }
dns:
  enabled: true
ui:
  enabled: true
  service:
    enabled: "LoadBalancer"
syncCatalog:
  enabled: true
connectInject:
  enabled: true
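A values file like this would be applied roughly as follows (a sketch; the release name `hashicorp` is inferred from the pod names above, and the chart repo URL is an assumption):

```shell
# Hypothetical install sketch; release name and repo URL are assumptions.
helm repo add hashicorp https://helm.releases.hashicorp.com
helm install hashicorp hashicorp/consul -f values.yaml   # Helm 3
# Helm 2: helm install --name hashicorp hashicorp/consul -f values.yaml
```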
# consul members
Node Address Status Type Build Protocol DC Segment
hashicorp-consul-server-0 10.233.92.104:8301 alive server 1.5.3 2 bndev <all>
hashicorp-consul-server-1 10.233.90.58:8301 alive server 1.5.3 2 bndev <all>
hashicorp-consul-server-2 10.233.96.57:8301 alive server 1.5.3 2 bndev <all>
node1 10.233.90.57:8301 alive client 1.5.3 2 bndev <default>
node2 10.233.96.56:8301 alive client 1.5.3 2 bndev <default>
node3 10.233.92.102:8301 alive client 1.5.3 2 bndev <default>
* Connected to localhost (127.0.0.1) port 8500 (#0)
> GET /v1/catalog/nodes HTTP/1.1
> Host: localhost:8500
> User-Agent: curl/7.64.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/json
< Vary: Accept-Encoding
< X-Consul-Effective-Consistency: leader
< X-Consul-Index: 7
< X-Consul-Knownleader: true
< X-Consul-Lastcontact: 0
< Date: Sat, 18 Jan 2020 18:43:02 GMT
< Content-Length: 259
<
* Connection #0 to host localhost left intact
[{"ID":"582991d7-8a11-cd53-b6a6-2096e2508673","Node":"hashicorp-consul-server-2","Address":"10.233.90.56","Datacenter":"bndev","TaggedAddresses":{"lan":"10.233.90.56","wan":"10.233.90.56"},"Meta":{"consul-network-segment":""},"CreateIndex":7,"M/ #
I have CoreDNS configured on Kubernetes, but I see the DNS-resolution errors pasted above in the pod logs. I am able to resolve the DNS names manually.
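For example, a manual check like this succeeds from inside the pods (a sketch; assumes the consul image ships a busybox `nslookup`):

```shell
# Try to resolve each server's StatefulSet DNS name -- the same names
# that are failing in the memberlist warnings above.
for i in 0 1 2; do
  kubectl exec hashicorp-consul-server-$i -- \
    nslookup hashicorp-consul-server-$i.hashicorp-consul-server.default.svc
done
```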
Is there a chance that you have old PVCs from a previous installation? Are you able to do a helm delete
and also a kubectl delete pvc
? See https://www.consul.io/docs/platform/k8s/uninstalling.html.
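Roughly (a sketch; the exact PVC names depend on your release name and namespace):

```shell
# Remove the release, then the server data volumes so no old raft state survives.
helm delete hashicorp                 # Helm 2: helm delete --purge hashicorp
kubectl get pvc                       # expect data-default-hashicorp-consul-server-0..2
kubectl delete pvc data-default-hashicorp-consul-server-0 \
                   data-default-hashicorp-consul-server-1 \
                   data-default-hashicorp-consul-server-2
```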
Hi @lkysow, I have manually deleted the PVCs and PVs, along with removing the contents under the local-storage StorageClass, after the helm uninstall. I have tried the helm install using Helm 3 and Helm 2 as well, but still have the same issue.
After reinstalling for the 3rd time, I am not able to list any nodes using the list API. consul members lists the servers & clients, but the consul info command hangs.
Do you still see the:
2020/01/18 18:28:46 [ERROR] raft: Failed to make RequestVote RPC to {Voter 582991d7-8a11-cd53-b6a6-2096e2508673 10.233.90.56:8300}: dial tcp <nil>->10.233.90.56:8300: i/o timeout
Errors?
That's using the server pod IPs on port 8300. Can you make TCP connections to them? (nc -v 10.233.90.56 8300)?
I don't see those IPs; even in list-peers I don't see them, and I am seeing different IPs with an (unknown) tag. This might be one reason. I also observed that when the lookup fails on one of the three server pods during pod creation, that pod is not listed in v1/catalog/nodes. Even when I reinstall the Helm deployment (cleaning up the PVCs, PVs, and the files under local-storage), I still see this issue.
We are using Kubernetes 1.17 on a 3-node setup deployed through Kubespray, with Calico as the network plugin. Consul version is 1.5.3.
Let's try and simplify things by running with one server:
server:
  replicas: 1
  bootstrapExpect: 1
Also, why do you have this client config:
client:
  extraConfig: |
    {
      "retry_join": ["provider=k8s tag_key=Consul-Auto-Join tag_value=bndev"]
    }
The clients should automatically join with the servers without that. Is there a specific reason you have that?
When running with 1 server, please send me:
kubectl get pod
Hi @lkysow
Removing the extraConfig worked for me
root@node1:~# k exec -it hashicorp-consul-server-0 consul members
Node Address Status Type Build Protocol DC Segment
hashicorp-consul-server-0 10.233.92.171:8301 alive server 1.5.3 2 bn <all>
hashicorp-consul-server-1 10.233.96.90:8301 alive server 1.5.3 2 bn <all>
hashicorp-consul-server-2 10.233.90.91:8301 alive server 1.5.3 2 bn <all>
node1 10.233.90.92:8301 alive client 1.5.3 2 bn <default>
node2 10.233.96.91:8301 alive client 1.5.3 2 bn <default>
node3 10.233.92.170:8301 alive client 1.5.3 2 bn <default>
root@node1:~# k exec -it hashicorp-consul-server-0 curl localhost:8500/v1/catalog/nodes | jq .[].Node
"hashicorp-consul-server-0"
"hashicorp-consul-server-1"
"hashicorp-consul-server-2"
"k8s-sync"
"node1"
"node2"
"node3"
root@node1:~# consul operator raft list-peers
Node ID Address State Voter RaftProtocol
(unknown) 149b183d-e86e-9634-fe0b-c55c9bcdb58c 10.233.92.169:8300 follower true unknown
hashicorp-consul-server-1 46e3e0aa-a3f1-9291-9b3f-4bb4915d30d1 10.233.96.90:8300 follower false 3
hashicorp-consul-server-2 bbd1a40b-8e5c-a311-af21-e31e96f07a63 10.233.90.91:8300 follower false 3
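If that (unknown) peer at 10.233.92.169 is left over from the previous install, one possible cleanup (a sketch, not verified here) is to drop it from the raft configuration explicitly:

```shell
# Remove the stale raft peer by its address (taken from list-peers above).
# Only do this if that address really belongs to a dead server.
consul operator raft remove-peer -address=10.233.92.169:8300
```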
But I am still seeing i/o timeout errors from raft:
2020/01/21 06:23:57 [ERR] consul: failed to reconcile member: {hashicorp-consul-server-0 10.233.92.171 8301 map[acls:0 build:1.5.3:a42ded47 dc:bn expect:3 id:149b183d-e86e-9634-fe0b-c55c9bcdb58c port:8300 raft_vsn:3 role:consul segment: vsn:2 vsn_max:3 vsn_min:2 wan_join_port:8302] alive 1 5 2 2 5 4}: error removing server with duplicate ID "149b183d-e86e-9634-fe0b-c55c9bcdb58c": Need at least one voter in configuration: {[{Nonvoter 46e3e0aa-a3f1-9291-9b3f-4bb4915d30d1 10.233.96.90:8300} {Nonvoter bbd1a40b-8e5c-a311-af21-e31e96f07a63 10.233.90.91:8300}]}
2020/01/21 06:24:06 [ERROR] raft: Failed to AppendEntries to {Nonvoter 46e3e0aa-a3f1-9291-9b3f-4bb4915d30d1 10.233.96.90:8300}: read tcp 10.233.92.171:37200->10.233.96.90:8300: i/o timeout
2020/01/21 06:24:06 [ERROR] raft: Failed to AppendEntries to {Nonvoter bbd1a40b-8e5c-a311-af21-e31e96f07a63 10.233.90.91:8300}: read tcp 10.233.92.171:58092->10.233.90.91:8300: i/o timeout
2020/01/21 06:24:26 [ERROR] raft: Failed to AppendEntries to {Nonvoter 46e3e0aa-a3f1-9291-9b3f-4bb4915d30d1 10.233.96.90:8300}: read tcp 10.233.92.171:37416->10.233.96.90:8300: i/o timeout
2020/01/21 06:24:26 [ERROR] raft: Failed to AppendEntries to {Nonvoter bbd1a40b-8e5c-a311-af21-e31e96f07a63 10.233.90.91:8300}: read tcp 10.233.92.171:58302->10.233.90.91:8300: i/o timeout
I have checked the port connectivity manually; it's working as expected inside the pods.
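e.g. something along these lines (a sketch; assumes `nc` is available in the image):

```shell
# Check raw TCP reachability of the other servers' raft port (8300)
# from inside server-0; IPs taken from the consul members output above.
for ip in 10.233.96.90 10.233.90.91; do
  kubectl exec hashicorp-consul-server-0 -- nc -vz -w 2 $ip 8300
done
```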
Thanks
We are seeing more of these errors on the Consul servers and are often unable to reach the Consul UI; the UI is taking a long time to fetch details of nodes & services.
logs:
2020/01/21 13:51:20 [ERR] yamux: keepalive failed: i/o deadline reached
2020/01/21 13:51:20 [ERR] yamux: Failed to read stream data: read tcp 10.233.90.114:58284->10.233.92.210:8300: use of closed network connection
2020/01/21 13:51:20 [WARN] yamux: failed to send go away: session shutdown
2020/01/21 13:51:20 [ERR] agent: Coordinate update error: rpc error making call: EOF
output of kubectl get pods
NAME READY STATUS RESTARTS AGE
counting-service-deployment 2/2 Running 0 101m
dashboard-service 2/2 Running 0 100m
hashicorp-consul-72h47 1/1 Running 0 29m
hashicorp-consul-connect-injector-webhook-deployment-764c8npqzx 1/1 Running 0 36m
hashicorp-consul-fjccd 1/1 Running 0 28m
hashicorp-consul-g7hxs 1/1 Running 0 28m
hashicorp-consul-server-0 1/1 Running 0 27m
hashicorp-consul-server-1 1/1 Running 0 28m
hashicorp-consul-server-2 1/1 Running 0 29m
hashicorp-consul-sync-catalog-75976cbcb4-nvpw2 1/1 Running 2 36m
server logs
root@node1:~# k logs hashicorp-consul-server-0
bootstrap_expect > 0: expecting 3 servers
==> Starting Consul agent...
Version: 'v1.5.3'
Node ID: '28b2d3e6-735c-d9e0-1d45-ee5e5a841268'
Node name: 'hashicorp-consul-server-0'
Datacenter: 'bn' (Segment: '')
==> Log data will now stream in as it occurs:
2020/01/21 14:03:18 [INFO] raft: Initial configuration (index=1405): [{Suffrage:Voter ID:28b2d3e6-735c-d9e0-1d45-ee5e5a841268 Address:10.233.92.206:8300} {Suffrage:Nonvoter ID:56a95e71-0c36-27c1-1d04-398d641b32aa Address:10.233.96.116:8300} {Suffrage:Nonvoter ID:e9e2741d-0edc-33dc-7b4c-845bfdad74c8 Address:10.233.90.115:8300}]
2020/01/21 14:03:18 [INFO] raft: Node at 10.233.92.211:8300 [Follower] entering Follower state (Leader: "")
2020/01/21 14:03:18 [INFO] serf: EventMemberJoin: hashicorp-consul-server-0.bn 10.233.92.211
2020/01/21 14:03:18 [INFO] serf: EventMemberJoin: hashicorp-consul-server-0 10.233.92.211
2020/01/21 14:03:18 [INFO] consul: Adding LAN server hashicorp-consul-server-0 (Addr: tcp/10.233.92.211:8300) (DC: bn)
2020/01/21 14:03:18 [INFO] consul: Raft data found, disabling bootstrap mode
2020/01/21 14:03:18 [INFO] consul: Handled member-join event for server "hashicorp-consul-server-0.bn" in area "wan"
2020/01/21 14:03:18 [WARN] agent/proxy: running as root, will not start managed proxies
2020/01/21 14:03:18 [INFO] agent: Started DNS server 0.0.0.0:8600 (udp)
2020/01/21 14:03:18 [INFO] agent: Started DNS server 0.0.0.0:8600 (tcp)
2020/01/21 14:03:18 [INFO] agent: Started HTTP server on [::]:8500 (tcp)
2020/01/21 14:03:18 [INFO] agent: Retry join LAN is supported for: aliyun aws azure digitalocean gce k8s mdns os packet scaleway softlayer triton vsphere
2020/01/21 14:03:18 [INFO] agent: Joining LAN cluster...
2020/01/21 14:03:18 [INFO] agent: (LAN) joining: [hashicorp-consul-server-0.hashicorp-consul-server.default.svc hashicorp-consul-server-1.hashicorp-consul-server.default.svc hashicorp-consul-server-2.hashicorp-consul-server.default.svc]
2020/01/21 14:03:18 [INFO] agent: started state syncer
==> Consul agent running!
2020/01/21 14:03:18 [WARN] memberlist: Failed to resolve hashicorp-consul-server-0.hashicorp-consul-server.default.svc: lookup hashicorp-consul-server-0.hashicorp-consul-server.default.svc on 169.254.25.10:53: no such host
2020/01/21 14:03:18 [INFO] serf: EventMemberJoin: node3 10.233.92.209
2020/01/21 14:03:18 [INFO] serf: EventMemberJoin: node2 10.233.96.115
2020/01/21 14:03:18 [INFO] serf: EventMemberJoin: node1 10.233.90.113
2020/01/21 14:03:18 [INFO] serf: EventMemberJoin: hashicorp-consul-server-1 10.233.90.115
2020/01/21 14:03:18 [WARN] memberlist: Refuting a suspect message (from: hashicorp-consul-server-0)
2020/01/21 14:03:18 [INFO] serf: EventMemberJoin: hashicorp-consul-server-2 10.233.96.116
2020/01/21 14:03:18 [INFO] consul: Adding LAN server hashicorp-consul-server-1 (Addr: tcp/10.233.90.115:8300) (DC: bn)
2020/01/21 14:03:18 [INFO] consul: Adding LAN server hashicorp-consul-server-2 (Addr: tcp/10.233.96.116:8300) (DC: bn)
2020/01/21 14:03:18 [WARN] memberlist: Refuting a suspect message (from: hashicorp-consul-server-0.bn)
2020/01/21 14:03:18 [INFO] serf: EventMemberJoin: hashicorp-consul-server-2.bn 10.233.96.116
2020/01/21 14:03:18 [INFO] serf: EventMemberJoin: hashicorp-consul-server-1.bn 10.233.90.115
2020/01/21 14:03:18 [INFO] consul: Handled member-join event for server "hashicorp-consul-server-2.bn" in area "wan"
2020/01/21 14:03:18 [INFO] consul: Handled member-join event for server "hashicorp-consul-server-1.bn" in area "wan"
2020/01/21 14:03:18 [INFO] agent: (LAN) joined: 2
2020/01/21 14:03:18 [INFO] agent: Join LAN completed. Synced with 2 initial agents
2020/01/21 14:03:25 [ERR] agent: failed to sync remote state: No cluster leader
2020/01/21 14:03:27 [WARN] raft: Heartbeat timeout from "" reached, starting election
2020/01/21 14:03:27 [INFO] raft: Node at 10.233.92.211:8300 [Candidate] entering Candidate state in term 5
2020/01/21 14:03:27 [INFO] raft: Election won. Tally: 1
2020/01/21 14:03:27 [INFO] raft: Node at 10.233.92.211:8300 [Leader] entering Leader state
2020/01/21 14:03:27 [INFO] raft: Added peer 56a95e71-0c36-27c1-1d04-398d641b32aa, starting replication
2020/01/21 14:03:27 [INFO] raft: Added peer e9e2741d-0edc-33dc-7b4c-845bfdad74c8, starting replication
2020/01/21 14:03:27 [INFO] consul: cluster leadership acquired
2020/01/21 14:03:27 [INFO] consul: New leader elected: hashicorp-consul-server-0
2020/01/21 14:03:27 [WARN] raft: AppendEntries to {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300} rejected, sending older logs (next: 1)
2020/01/21 14:03:27 [WARN] raft: AppendEntries to {Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300} rejected, sending older logs (next: 1)
2020/01/21 14:03:27 [WARN] consul.fsm: EnsureRegistration failed: failed inserting node: Error while renaming Node ID: "b3d97f19-ebe0-f5da-d4b1-20145eede1b3": Node name node3 is reserved by node 5ed969fe-9b59-d190-92dd-9e0c8d91574d with name node3
2020/01/21 14:03:27 [WARN] consul.fsm: EnsureRegistration failed: failed inserting node: Error while renaming Node ID: "b3d97f19-ebe0-f5da-d4b1-20145eede1b3": Node name node3 is reserved by node 5ed969fe-9b59-d190-92dd-9e0c8d91574d with name node3
2020/01/21 14:03:27 [ERR] consul: failed to reconcile member: {hashicorp-consul-server-0 10.233.92.211 8301 map[acls:0 build:1.5.3:a42ded47 dc:bn expect:3 id:28b2d3e6-735c-d9e0-1d45-ee5e5a841268 port:8300 raft_vsn:3 role:consul segment: vsn:2 vsn_max:3 vsn_min:2 wan_join_port:8302] alive 1 5 2 2 5 4}: error removing server with duplicate ID "28b2d3e6-735c-d9e0-1d45-ee5e5a841268": Need at least one voter in configuration: {[{Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300} {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}]}
2020/01/21 14:03:27 [INFO] agent: Synced node info
2020/01/21 14:03:37 [ERROR] raft: Failed to AppendEntries to {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}: read tcp 10.233.92.211:44996->10.233.90.115:8300: i/o timeout
2020/01/21 14:03:37 [ERROR] raft: Failed to AppendEntries to {Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300}: read tcp 10.233.92.211:46696->10.233.96.116:8300: i/o timeout
==> Newer Consul version available: 1.6.2 (currently running: 1.5.3)
2020/01/21 14:03:47 [ERROR] raft: Failed to AppendEntries to {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}: read tcp 10.233.92.211:45000->10.233.90.115:8300: i/o timeout
2020/01/21 14:03:47 [ERROR] raft: Failed to AppendEntries to {Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300}: read tcp 10.233.92.211:46704->10.233.96.116:8300: i/o timeout
2020/01/21 14:03:57 [ERROR] raft: Failed to AppendEntries to {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}: read tcp 10.233.92.211:45092->10.233.90.115:8300: i/o timeout
2020/01/21 14:03:57 [ERROR] raft: Failed to AppendEntries to {Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300}: read tcp 10.233.92.211:46798->10.233.96.116:8300: i/o timeout
2020/01/21 14:04:07 [ERROR] raft: Failed to AppendEntries to {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}: read tcp 10.233.92.211:45188->10.233.90.115:8300: i/o timeout
2020/01/21 14:04:07 [ERROR] raft: Failed to AppendEntries to {Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300}: read tcp 10.233.92.211:46894->10.233.96.116:8300: i/o timeout
2020/01/21 14:04:17 [ERROR] raft: Failed to AppendEntries to {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}: read tcp 10.233.92.211:45284->10.233.90.115:8300: i/o timeout
2020/01/21 14:04:17 [ERROR] raft: Failed to AppendEntries to {Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300}: read tcp 10.233.92.211:46980->10.233.96.116:8300: i/o timeout
2020/01/21 14:04:27 [ERR] consul: failed to reconcile member: {hashicorp-consul-server-0 10.233.92.211 8301 map[acls:0 build:1.5.3:a42ded47 dc:bn expect:3 id:28b2d3e6-735c-d9e0-1d45-ee5e5a841268 port:8300 raft_vsn:3 role:consul segment: vsn:2 vsn_max:3 vsn_min:2 wan_join_port:8302] alive 1 5 2 2 5 4}: error removing server with duplicate ID "28b2d3e6-735c-d9e0-1d45-ee5e5a841268": Need at least one voter in configuration: {[{Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300} {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}]}
2020/01/21 14:04:27 [ERROR] raft: Failed to AppendEntries to {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}: read tcp 10.233.92.211:45374->10.233.90.115:8300: i/o timeout
2020/01/21 14:04:27 [ERROR] raft: Failed to AppendEntries to {Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300}: read tcp 10.233.92.211:47078->10.233.96.116:8300: i/o timeout
2020/01/21 14:04:37 [ERR] yamux: keepalive failed: i/o deadline reached
2020/01/21 14:04:37 [ERR] consul.rpc: multiplex conn accept failed: keepalive timeout from=10.233.90.115:54658
2020/01/21 14:04:37 [ERROR] raft: Failed to AppendEntries to {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}: read tcp 10.233.92.211:45468->10.233.90.115:8300: i/o timeout
2020/01/21 14:04:37 [ERROR] raft: Failed to AppendEntries to {Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300}: read tcp 10.233.92.211:47164->10.233.96.116:8300: i/o timeout
2020/01/21 14:04:48 [ERROR] raft: Failed to AppendEntries to {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}: read tcp 10.233.92.211:45584->10.233.90.115:8300: i/o timeout
2020/01/21 14:04:48 [ERROR] raft: Failed to AppendEntries to {Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300}: read tcp 10.233.92.211:47290->10.233.96.116:8300: i/o timeout
2020/01/21 14:04:58 [ERROR] raft: Failed to AppendEntries to {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}: read tcp 10.233.92.211:45676->10.233.90.115:8300: i/o timeout
2020/01/21 14:04:58 [ERROR] raft: Failed to AppendEntries to {Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300}: read tcp 10.233.92.211:47388->10.233.96.116:8300: i/o timeout
2020/01/21 14:05:06 [ERR] yamux: keepalive failed: i/o deadline reached
2020/01/21 14:05:06 [ERR] consul.rpc: multiplex conn accept failed: keepalive timeout from=10.233.96.116:51582
2020/01/21 14:05:10 [ERROR] raft: Failed to AppendEntries to {Nonvoter e9e2741d-0edc-33dc-7b4c-845bfdad74c8 10.233.90.115:8300}: read tcp 10.233.92.211:45782->10.233.90.115:8300: i/o timeout
2020/01/21 14:05:10 [ERROR] raft: Failed to AppendEntries to {Nonvoter 56a95e71-0c36-27c1-1d04-398d641b32aa 10.233.96.116:8300}: read tcp 10.233.92.211:47480->10.233.96.116:8300: i/o timeout
root@node1:~#
==> Log data will now stream in as it occurs:
2020/01/21 13:29:41 [INFO] serf: EventMemberJoin: node3 10.233.92.209
2020/01/21 13:29:41 [WARN] agent/proxy: running as root, will not start managed proxies
2020/01/21 13:29:41 [INFO] agent: Started DNS server 0.0.0.0:8600 (tcp)
2020/01/21 13:29:41 [INFO] agent: Started DNS server 0.0.0.0:8600 (udp)
2020/01/21 13:29:41 [INFO] agent: Started HTTP server on [::]:8500 (tcp)
2020/01/21 13:29:41 [INFO] agent: Started gRPC server on [::]:8502 (tcp)
2020/01/21 13:29:41 [INFO] agent: Retry join LAN is supported for: aliyun aws azure digitalocean gce k8s mdns os packet scaleway softlayer triton vsphere
2020/01/21 13:29:41 [INFO] agent: Joining LAN cluster...
2020/01/21 13:29:41 [INFO] agent: (LAN) joining: [hashicorp-consul-server-0.hashicorp-consul-server.default.svc hashicorp-consul-server-1.hashicorp-consul-server.default.svc hashicorp-consul-server-2.hashicorp-consul-server.default.svc]
2020/01/21 13:29:41 [WARN] manager: No servers available
2020/01/21 13:29:41 [ERR] agent: failed to sync remote state: No known Consul servers
2020/01/21 13:29:41 [INFO] agent: started state syncer
==> Consul agent running!
2020/01/21 13:29:41 [INFO] serf: EventMemberJoin: hashicorp-consul-server-2 10.233.96.114
2020/01/21 13:29:41 [INFO] consul: adding server hashicorp-consul-server-2 (Addr: tcp/10.233.96.114:8300) (DC: bn)
2020/01/21 13:29:41 [INFO] serf: EventMemberJoin: hashicorp-consul-server-0 10.233.92.207
2020/01/21 13:29:41 [INFO] consul: adding server hashicorp-consul-server-0 (Addr: tcp/10.233.92.207:8300) (DC: bn)
2020/01/21 13:29:41 [INFO] serf: EventMemberJoin: hashicorp-consul-server-1 10.233.90.111
2020/01/21 13:29:41 [INFO] consul: adding server hashicorp-consul-server-1 (Addr: tcp/10.233.90.111:8300) (DC: bn)
2020/01/21 13:29:41 [INFO] serf: EventMemberJoin: node2 10.233.96.113
2020/01/21 13:29:41 [WARN] memberlist: Refuting a suspect message (from: node3)
2020/01/21 13:29:41 [INFO] serf: EventMemberJoin: node1 10.233.90.112
2020/01/21 13:29:41 [INFO] agent: (LAN) joined: 3
2020/01/21 13:29:41 [INFO] agent: Join LAN completed. Synced with 3 initial agents
2020/01/21 13:29:43 [ERR] consul: "Catalog.Register" RPC failed to server 10.233.96.114:8300: rpc error making call: rpc error making call: failed inserting node: Error while renaming Node ID: "b3d97f19-ebe0-f5da-d4b1-20145eede1b3": Node name node3 is reserved by node 5ed969fe-9b59-d190-92dd-9e0c8d91574d with name node3
2020/01/21 13:29:43 [WARN] agent: Syncing node info failed. rpc error making call: rpc error making call: failed inserting node: Error while renaming Node ID: "b3d97f19-ebe0-f5da-d4b1-20145eede1b3": Node name node3 is reserved by node 5ed969fe-9b59-d190-92dd-9e0c8d91574d with name node3
2020/01/21 13:29:43 [ERR] agent: failed to sync remote state: rpc error making call: rpc error making call: failed inserting node: Error while renaming Node ID: "b3d97f19-ebe0-f5da-d4b1-20145eede1b3": Node name node3 is reserved by node 5ed969fe-9b59-d190-92dd-9e0c8d91574d with name node3
2020/01/21 13:29:45 [INFO] agent: Deregistered service "counting-proxy"
2020/01/21 13:29:45 [INFO] agent: Deregistered check "counting-proxy-ttl"
2020/01/21 13:29:45 [ERR] consul: "Catalog.Register" RPC failed to server 10.233.92.207:8300: rpc error making call: failed inserting node: Error while renaming Node ID: "b3d97f19-ebe0-f5da-d4b1-20145eede1b3": Node name node3 is reserved by node 5ed969fe-9b59-d190-92dd-9e0c8d91574d with name node3
2020/01/21 13:29:45 [WARN] agent: Syncing node info failed. rpc error making call: failed inserting node: Error while renaming Node ID: "b3d97f19-ebe0-f5da-d4b1-20145eede1b3": Node name node3 is reserved by node 5ed969fe-9b59-d190-92dd-9e0c8d91574d with name node3
2020/01/21 13:29:45 [ERR] agent: failed to sync remote state: rpc error making call: failed inserting node: Error while renaming Node ID: "b3d97f19-ebe0-f5da-d4b1-20145eede1b3": Node name node3 is reserved by node 5ed969fe-9b59-d190-92dd-9e0c8d91574d with name node3
2020/01/21 13:29:45 [INFO] serf: EventMemberFailed: node2 10.233.96.113
2020/01/21 13:29:48 [INFO] serf: EventMemberLeave: node1 10.233.90.112
2020/01/21 13:29:53 [INFO] serf: EventMemberLeave: hashicorp-consul-server-1 10.233.90.111
2020/01/21 13:29:53 [INFO] consul: removing server hashicorp-consul-server-1 (Addr: tcp/10.233.90.111:8300) (DC: bn)
==> Newer Consul version available: 1.6.2 (currently running: 1.5.3)
2020/01/21 13:30:07 [INFO] agent: Synced node info
2020/01/21 13:30:10 [ERR] memberlist: Conflicting address for hashicorp-consul-server-1. Mine: 10.233.90.111:8301 Theirs: 10.233.90.114:8301 Old state: 2
2020/01/21 13:30:10 [WARN] serf: Name conflict for 'hashicorp-consul-server-1' both 10.233.90.111:8301 and 10.233.90.114:8301 are claiming
2020/01/21 13:30:10 [ERR] memberlist: Conflicting address for hashicorp-consul-server-1. Mine: 10.233.90.111:8301 Theirs: 10.233.90.114:8301 Old state: 2
2020/01/21 13:30:10 [WARN] serf: Name conflict for 'hashicorp-consul-server-1' both 10.233.90.111:8301 and 10.233.90.114:8301 are claiming
2020/01/21 13:30:11 [INFO] serf: attempting reconnect to node2 10.233.96.113:8301
2020/01/21 13:30:29 [INFO] serf: EventMemberLeave (forced): node2 10.233.96.113
2020/01/21 13:30:50 [ERR] consul: "ConnectCA.Roots" RPC failed to server 10.233.96.114:8300: rpc error making call: rpc error making call: EOF
2020/01/21 13:30:50 [ERR] consul: "Catalog.Register" RPC failed to server 10.233.96.114:8300: rpc error making call: rpc error making call: EOF
2020/01/21 13:30:50 [WARN] agent: Syncing service "counting-proxy" failed. rpc error making call: rpc error making call: EOF
2020/01/21 13:30:50 [ERR] consul: "Intention.Match" RPC failed to server 10.233.96.114:8300: rpc error making call: rpc error making call: EOF
2020/01/21 13:30:50 [ERR] roots watch error: invalid type for roots response:
4 & 5. Output of consul members from server & client:
root@node1:~# k exec -it hashicorp-consul-server-0 consul members
Node Address Status Type Build Protocol DC Segment
hashicorp-consul-server-0 10.233.92.211:8301 alive server 1.5.3 2 bn
Any update?
Hi @srimandarbha, sorry, it appears we missed responding to this. Did you ever get it working?
Hi, we're going to close this because we haven't heard back in a while. If you do get back to us we're happy to re-open!
Hello
I have set up 3 Consul server pods on a three-node Kubernetes cluster, and I am able to list all the servers & clients in the "consul members" output, but I am not able to see all the node info in the Consul UI (using port-forward on the Consul svc to check the UI). Is this expected, or am I missing anything?
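For reference, the port-forward I'm using looks roughly like this (service name assumed from the chart defaults for a release named hashicorp):

```shell
# Forward the UI service locally, then browse http://localhost:8500/ui
kubectl port-forward svc/hashicorp-consul-ui 8500:80
```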
Thanks Sriman