Montbuet opened this issue 3 years ago
Hi @Montbuet, I'm going to transfer this over to our consul-k8s repo. Could you provide steps on how to reproduce the problem you are seeing? If you could provide a Helm config yaml file and some steps to reproduce, that would be very helpful.
Hi @david-yu ,
Helm config file:
consul:
  global:
    enabled: true
    name: "consul"
    datacenter: "K8S"
    tls:
      enabled: false
      httpsOnly: false
  client:
    enabled: true
    securityContext:
      runAsNonRoot: false
      runAsGroup: 0
      runAsUser: 0
      fsGroup: 0
  server:
    replicas: 3
    bootstrap_expect: 3
    enabled: true
    securityContext:
      runAsNonRoot: false
      runAsGroup: 0
      runAsUser: 0
      fsGroup: 0
  ui:
    enabled: true
Reproduce steps: deploy the cluster with the Helm config above, stop the Consul cluster, then start it again (roughly the commands below). Every pod gets a new IP address and the Raft leader election fails.
If you have any clue on how to handle that, I would be very grateful.
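What I run to trigger it, assuming the chart's default consul-server StatefulSet name and labels in the vault namespace (adjust if yours differ):

user@user:~$ kubectl -n vault scale statefulset consul-server --replicas=0   # take all servers down at once, so quorum is lost
user@user:~$ kubectl -n vault wait --for=delete pod -l component=server --timeout=120s
user@user:~$ kubectl -n vault scale statefulset consul-server --replicas=3   # servers come back with new pod IPs
user@user:~$ kubectl -n vault logs -f consul-server-0                        # election failures show up here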
Hey @Montbuet
What is the use case for that? Are you thinking of a disaster case when all pods go down? Right now, Consul (or rather Raft) cannot handle IP address change when the cluster does not have a quorum.
It looks like there was already some discussion on changing raft to support DNS addresses or in general accommodate the use case when the entire cluster goes down, but I don't think this was addressed yet.
@david-yu If the request is to make Consul support IP address change for the entire cluster, then I think this issue is more appropriate in hashicorp/consul as consul-helm behaves as expected.
Hey, and thank you for your answer. Yes, this is precisely for a disaster recovery use case. Doing it by hand is time consuming, and in case something bad happens, it would be great if everything started up again without human intervention.
Makes sense. I'll transfer it back to Consul so that it can be better tracked there. Sorry for all the back and forth!
For now, I think your best bet is to work around this manually or with scripting as this is likely a more involved change.
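If it helps with the scripting, the documented peers.json outage-recovery step is probably what such a script would automate. A rough sketch, assuming the default /consul/data data directory and Raft protocol version 3 (the IDs and addresses are placeholders you'd fill in from each server's node-id file and new pod IP):

# With all server pods stopped, write raft/peers.json on each server's data volume,
# listing every server's current ID and new pod IP:
cat > /consul/data/raft/peers.json <<'EOF'
[
  { "id": "<node-id-of-consul-server-0>", "address": "<new-ip-of-server-0>:8300", "non_voter": false },
  { "id": "<node-id-of-consul-server-1>", "address": "<new-ip-of-server-1>:8300", "non_voter": false },
  { "id": "<node-id-of-consul-server-2>", "address": "<new-ip-of-server-2>:8300", "non_voter": false }
]
EOF
# On the next start, each server ingests the file, rewrites its Raft peer set,
# and then deletes peers.json.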
@Montbuet : I've attempted to reword the title to reflect the enhancement that you're requesting. Please correct if needed!
Hello,
I spent some time investigating the issue and noticed something weird: the consul members command returns the correct consul-server IPs, but the old IPs are still used in the logs.
user@user:~$ kubectl -n vault get pods -o wide -w
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
consul-5tg6h 0/1 Running 0 64m 172.19.43.81 k8s-worker-003 <none> <none>
consul-br62r 0/1 Running 0 64m 172.19.52.222 k8s-worker-006 <none> <none>
consul-h9flv 0/1 Running 0 64m 172.19.58.141 k8s-worker-001 <none> <none>
consul-jpf7h 0/1 Running 0 64m 172.19.32.114 k8s-worker-010 <none> <none>
consul-nhtzd 0/1 Running 0 64m 172.19.43.227 k8s-worker-009 <none> <none>
consul-qwsqn 0/1 Running 0 64m 172.19.49.27 k8s-worker-005 <none> <none>
consul-server-0 0/1 Running 0 4m7s 172.19.44.132 k8s-worker-003 <none> <none>
consul-server-1 0/1 Running 0 4m33s 172.19.46.140 k8s-worker-002 <none> <none>
consul-server-2 0/1 Running 0 3m57s 172.19.59.33 k8s-worker-001 <none> <none>
consul-sv2nf 0/1 Running 0 64m 172.19.42.200 k8s-worker-008 <none> <none>
consul-wdf5p 0/1 Running 0 64m 172.19.43.153 k8s-worker-004 <none> <none>
consul-wwdfr 0/1 Running 0 64m 172.19.46.135 k8s-worker-002 <none> <none>
consul-xv2gk 0/1 Running 0 64m 172.19.44.3 k8s-worker-007 <none> <none>
Here are the correct server IPs: consul-server-0 172.19.44.132, consul-server-1 172.19.46.140, consul-server-2 172.19.59.33.
Now let's see what the consul members command returns:
user@user:~$ kubectl -n vault exec -it consul-server-0 -- bin/sh
/ # consul members
Node Address Status Type Build Protocol DC Segment
consul-server-0 172.19.44.132:8301 alive server 1.10.0 2 k8s <all>
consul-server-1 172.19.46.140:8301 alive server 1.10.0 2 k8s <all>
consul-server-2 172.19.59.33:8301 alive server 1.10.0 2 k8s <all>
k8s-worker-001 172.19.58.141:8301 alive client 1.10.0 2 k8s <default>
k8s-worker-002 172.19.46.135:8301 alive client 1.10.0 2 k8s <default>
k8s-worker-003 172.19.43.81:8301 alive client 1.10.0 2 k8s <default>
k8s-worker-004 172.19.43.153:8301 alive client 1.10.0 2 k8s <default>
k8s-worker-005 172.19.49.27:8301 alive client 1.10.0 2 k8s <default>
k8s-worker-006 172.19.52.222:8301 alive client 1.10.0 2 k8s <default>
k8s-worker-007 172.19.44.3:8301 alive client 1.10.0 2 k8s <default>
k8s-worker-008 172.19.42.200:8301 alive client 1.10.0 2 k8s <default>
k8s-worker-009 172.19.43.227:8301 alive client 1.10.0 2 k8s <default>
k8s-worker-010 172.19.32.114:8301 alive client 1.10.0 2 k8s <default>
The IPs of consul-server-0, consul-server-1 and consul-server-2 pods match the correct ones, but here is something weird:
user@user:~$ kubectl -n vault logs consul-server-0 | tail
2021-07-27T09:09:46.064Z [WARN] agent: Syncing node info failed.: error="No cluster leader"
2021-07-27T09:09:46.065Z [ERROR] agent.anti_entropy: failed to sync remote state: error="No cluster leader"
2021-07-27T09:09:50.469Z [INFO] agent.server.raft: duplicate requestVote for same term: term=581
2021-07-27T09:09:51.349Z [WARN] agent.server.raft: rejecting vote request since our last term is greater: candidate=172.19.59.33:8300 last-term=382 last-candidate-term=9
2021-07-27T09:09:51.349Z [INFO] agent.server.raft: entering follower state: follower="Node at 172.19.44.132:8300 [Follower]" leader=
2021-07-27T09:09:54.321Z [ERROR] agent.server.raft: failed to make requestVote RPC: target="{Voter 6deb73b7-e732-ee1b-2e98-9535ea4bd8f1 172.19.44.160:8300}" error="dial tcp <nil>->172.19.44.160:8300: i/o timeout"
2021-07-27T09:09:59.472Z [WARN] agent.server.raft: heartbeat timeout reached, starting election: last-leader=
2021-07-27T09:09:59.472Z [INFO] agent.server.raft: entering candidate state: node="Node at 172.19.44.132:8300 [Candidate]" term=583
2021-07-27T09:09:59.475Z [WARN] agent.server.raft: unable to get address for server, using fallback address: id=6deb73b7-e732-ee1b-2e98-9535ea4bd8f1 fallback=172.19.44.160:8300 error="Could not find address for server id 6deb73b7-e732-ee1b-2e98-9535ea4bd8f1"
2021-07-27T09:10:00.457Z [INFO] agent.server.raft: duplicate requestVote for same term: term=583
Full Log: https://pastebin.com/8ZP3zfSZ
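If it helps to confirm, the peer set Raft is still holding on to (with the old addresses) should be visible even without a leader by asking a server directly with the stale flag, something like:

/ # consul operator raft list-peers -stale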
EDIT: I think I found where the issue comes from. My Consul cluster is backed by 3 PVCs: one is mapped to an NFS volume (consul-0), and the other two are local volumes (consul-1/ and consul-2/) on the running node.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: consul
provisioner: kubernetes.io/no-provisioner
mountOptions:
  - uid=1000
  - gid=1000
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: consul-pv-volume-0
  labels:
    type: local
spec:
  storageClassName: consul
  capacity:
    storage: 15Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "{{ $.Values.global.nfs_storage }}consul-0"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-vault-consul-server-0
  labels:
    app: consul-storage-claim
spec:
  storageClassName: consul
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: consul-pv-volume-1
  labels:
    type: local
spec:
  storageClassName: consul
  capacity:
    storage: 15Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/opt/storage/consul-1"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-vault-consul-server-1
  labels:
    app: vault-storage-claim
spec:
  storageClassName: consul
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: consul-pv-volume-2
  labels:
    type: local
spec:
  storageClassName: consul
  capacity:
    storage: 15Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/opt/storage/consul-2"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-vault-consul-server-2
  labels:
    app: consul-storage-claim
spec:
  storageClassName: consul
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
When the consul-1 or consul-2 pod is deleted and recreated on a different node, /opt/storage/consul-1/ or consul-2/ is created from scratch on that node, and a new node ID is generated in this directory. So what I understand is: there is a mismatch between the old and new node IDs, and the Consul cluster refuses to start without being able to rejoin the old ones.
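To see the mismatch, the node ID sits in a node-id file at the root of the data directory (assuming the container mounts it at /consul/data, and using the host paths from the manifests above):

# inside the rescheduled pod: the fresh ID it just generated
user@user:~$ kubectl -n vault exec consul-server-1 -- cat /consul/data/node-id
# on the worker that hosted consul-server-1 before the restart: the old ID the cluster still expects
cat /opt/storage/consul-1/node-id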
Workaround: the Consul pod has to use a volume that already contains the existing node-id file. What I did is simply write the consul-1 and consul-2 node IDs (no need for consul-0, because it runs on NFS) into the consul-1/ and consul-2/ local volumes on every node in my k8s cluster, so a rescheduled pod finds the same ID wherever it lands. Right now I am working on a better fix.
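Roughly what I scripted, assuming SSH access to the workers (the worker list and IDs below are placeholders to adapt):

ID1="<node-id-of-consul-server-1>"
ID2="<node-id-of-consul-server-2>"
# pre-seed the local volumes on every worker so a rescheduled server reuses the same node ID
for node in k8s-worker-001 k8s-worker-002 k8s-worker-003; do
  ssh "$node" "mkdir -p /opt/storage/consul-1 /opt/storage/consul-2 \
    && printf '%s' '$ID1' > /opt/storage/consul-1/node-id \
    && printf '%s' '$ID2' > /opt/storage/consul-2/node-id"
done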
Hello,
I have a Consul cluster deployed in a Kubernetes cluster. When I stop the Consul cluster and start it again, every pod gets a new IP address and the Raft election fails:
[ERROR] agent.server.raft: failed to make requestVote RPC: target="{Voter b91......... 172.XX.XX.XX:8300}" error="dial tcp <nil>->172.XX.XX.XX:8300: i/o timeout"
(172.XX.XX.XX is no longer assigned to any pod)
The only thing I can do is delete every volume and restore a snapshot file; everything is good again after that, but it is obviously time-consuming.
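For reference, my current recovery looks roughly like this (assuming the consul-server StatefulSet, the PVC names from my manifests, and a backup.snap snapshot at hand):

user@user:~$ kubectl -n vault scale statefulset consul-server --replicas=0
user@user:~$ kubectl -n vault delete pvc data-vault-consul-server-0 data-vault-consul-server-1 data-vault-consul-server-2
# recreate the PVs/PVCs, then bring the servers back
user@user:~$ kubectl -n vault scale statefulset consul-server --replicas=3
user@user:~$ kubectl -n vault cp backup.snap consul-server-0:/tmp/backup.snap
user@user:~$ kubectl -n vault exec consul-server-0 -- consul snapshot restore /tmp/backup.snap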
How can I tell Consul that the IP address of each pod changes at every restart? Is there anything I can set in my Helm values to do that?
Thank you, and have a nice day.