I have been trying out multiple deployment modes for Consul. I have successfully deployed a multi-DC setup via WAN federation using Auto Scaling groups in AWS, and now I'm moving on to trying a deployment on top of K8s. The experience with ASGs was pretty smooth; in the end I got a stable setup where I could add/remove nodes at will and the datacenters reacted smoothly (e.g., consensus was impeccable).
However, I keep having a lot of stability issues with Raft consensus on top of K8s:
Question
- increasing from 3 replicas to 5 replicas for some reason makes Raft lose consensus. I don't understand this: why would Raft lose consensus when increasing the replica count?
- with a 5-node deployment, changing a Consul setting (e.g., log rotation) and then running helm upgrade once again causes consensus to be lost.
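For context, the scale-up in the first question amounts to a values change like the sketch below (key names are from the official hashicorp/consul chart; the `bootstrapExpect` line is my own guess about what might matter, not something I have confirmed):

```yaml
# Sketch of the 3 -> 5 scale-up against the official chart.
# bootstrapExpect is the number of servers Consul waits for before
# bootstrapping Raft; I wonder whether it needs to track replicas
# when scaling up -- unconfirmed assumption on my part.
server:
  replicas: 5
  bootstrapExpect: 5
```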
One of the issues I believe I detected is that, because pods are named consul-server-1, consul-server-2, etc., when pods are recreated they come up with the same name as their predecessors (instead of, for example, generating a random suffix). This makes some members of the consensus protocol detect two nodes with the same name (e.g., consul-server-2) running on different IPs: one from the new pod, and one from the old pod that was just replaced. But because this happens very quickly, the other nodes don't have time to "forget" the old pod, causing a naming conflict.
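One workaround I have been considering for the naming conflict (just a sketch, not verified) is to have servers gracefully leave the cluster on shutdown via the chart's `server.extraConfig` value, so the old member entry is removed before the replacement pod re-registers under the same name:

```yaml
# Sketch only -- assumes the official hashicorp/consul Helm chart.
# leave_on_terminate makes a Consul agent issue a graceful Leave on
# SIGTERM, so its member entry is removed before the replacement pod
# comes back up with the same node name.
server:
  extraConfig: |
    {
      "leave_on_terminate": true
    }
```

I'm not sure a graceful leave is actually desirable for servers (it changes the quorum size during the roll), so treat this as an open idea rather than a fix.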
Other more general questions:
how can I automate the server upgrade process without downtime? The official documentation mentions that I should manipulate the server.partitions setting in multiple phases. However, in a "real" scenario where deploys are managed by CI/CD tools, does that mean I need multiple commits and multiple deploys to ensure all servers receive the upgrade? That sounds rather unproductive. Are there any alternatives, while still using the official Helm chart?
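My current understanding of the phased flow, sketched as a values fragment (I'm using the partition setting name from the docs I was reading; the exact key may differ by chart version):

```yaml
# Sketch of the phased server rollout. The partition value gates the
# StatefulSet update: only pods with an ordinal >= the partition are
# recreated, so servers roll one at a time.
server:
  replicas: 5
  # Phase 1: set this to 5 -> no pods update when the new template lands.
  # Each following phase: decrement by one and re-run `helm upgrade`,
  # waiting for the updated server to rejoin Raft before continuing.
  partitions: 4
```

If that reading is right, each phase really is a separate helm upgrade, which is what prompts the CI/CD question above.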
the Helm chart uses deprecated settings, both from Consul and from K8s (e.g., TLS settings and PodSecurityPolicy). Is this a known issue?
I’m probably missing something that could explain these issues, as I am fairly new working with K8s.
Helm Configuration
(I know most of the values there are the defaults, but I wanted a yaml with the full config so I could tweak it incrementally.)
Steps to reproduce this issue
Current understanding and Expected behavior
Environment details
I have tested this setup in both minikube and AWS EKS, with the same outcomes in both.