hashicorp / consul-k8s

First-class support for Consul Service Mesh on Kubernetes
https://www.consul.io/docs/k8s
Mozilla Public License 2.0

Servers go through rolling-deploy for config changes on Consul servers #1516

Closed dnlopes closed 8 months ago

dnlopes commented 2 years ago

Hello,

I have been trying out multiple deployment modes for Consul. I have successfully deployed a multi-DC setup via WAN federation using autoscaling groups in AWS, and now I’m moving on to trying a deployment on top of K8s. The experience was pretty smooth with ASGs; in the end I got a pretty stable setup where I could add/remove nodes at will and the datacenters reacted smoothly (e.g., consensus was impeccable).

However, I keep having a lot of stability issues with Raft consensus on top of K8s:

Question

  1. increasing from 3 replicas to 5 replicas for some reason makes Raft lose consensus. I don’t understand this: why would Raft lose consensus when adding replicas?
  2. with a 5-node deployment, changing a Consul setting (e.g., log rotation) and then running helm upgrade once again causes consensus to be lost.

One of the issues I believe I detected is that, because pods are named consul-server-1, consul-server-2, etc., a replaced pod comes back up with the same name as the one it replaced (instead of getting a random suffix, for example). This causes some members of the consensus protocol to detect two nodes with the same name (e.g., consul-server-2) running on different IPs (one from the new pod, and one from the old pod that was just replaced). Because this happens very quickly, the other nodes don’t have time to “forget” the old pod, which causes a naming conflict.
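For anyone trying to reproduce this, these are the checks I’ve been running to spot the duplicate/stale entries while a pod is being replaced. It’s only a rough sketch: it assumes the pod names from above and kubectl access to the right namespace, and it leaves out the -http-addr/-ca-file/-token flags (or the matching CONSUL_* environment variables) that the CLI needs when tls.httpsOnly and ACLs are enabled.

# Serf view: duplicate names or "failed" entries show up here for the old pod IP.
kubectl exec consul-server-0 -- consul members

# Raft view: shows which servers are voters and who the current leader is.
kubectl exec consul-server-0 -- consul operator raft list-peers

# If a dead server lingers in the peer set, it can be removed by hand.
kubectl exec consul-server-0 -- consul operator raft remove-peer -id=<stale-server-id>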

Other more general questions:

  1. how can I automate the server upgrade process without downtime? The official documentation mentions that I should manipulate the server.updatePartition setting in multiple phases (see the sketch after this list). However, in a “real” scenario, in which deploys are managed by CI/CD tools, does that mean I need to do multiple commits and multiple deploys to ensure all servers receive the upgrade? That sounds a bit unproductive. Are there any alternatives to this while still using the official Helm chart?
  2. the Helm chart is using deprecated settings, both on the Consul side and on the K8s side (e.g., TLS settings and PodSecurityPolicy). Is this a known issue?
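For reference, this is roughly what I understand the multi-phase procedure to look like when driven by hand. It’s a sketch based on my reading of the docs, assuming 5 replicas, a release named consul, and the hashicorp/consul chart; the exact value names should be double-checked against the chart version in use.

# Phase 1: apply the new config but hold back every server except the highest ordinal.
helm upgrade consul hashicorp/consul -f values.yaml --set server.updatePartition=4

# Wait for the upgraded server to rejoin and for the cluster to report a leader,
# then lower the partition by one and upgrade again; repeat until it reaches 0.
helm upgrade consul hashicorp/consul -f values.yaml --set server.updatePartition=3
# ...
helm upgrade consul hashicorp/consul -f values.yaml --set server.updatePartition=0

# Health check between phases.
kubectl exec consul-server-0 -- consul operator raft list-peers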

I’m probably missing something that could explain these issues, as I am fairly new to working with K8s.

Helm Configuration

global:
  enabled: true
  logLevel: "debug"
  logJSON: false
  name: "dlo"
  datacenter: "dlo"
  consulAPITimeout: "5s"
  enablePodSecurityPolicies: true
  recursors: []
  tls:
    enabled: true
    enableAutoEncrypt: true
    serverAdditionalDNSSANs: []
    serverAdditionalIPSANs: []
    verify: true
    httpsOnly: true
    caCert:
      secretName: null
      secretKey: null
    caKey:
      secretName: null
      secretKey: null
  acls:
    manageSystemACLs: true
    bootstrapToken:
      secretName: null
      secretKey: null
    createReplicationToken: true
    replicationToken:
      secretName: null
      secretKey: null
  gossipEncryption:
    autoGenerate: true
  federation:
    enabled: false
    createFederationSecret: false
    primaryDatacenter: null
    primaryGateways: []
    k8sAuthMethodHost: null
  metrics:
    enabled: false
    enableAgentMetrics: false
    agentMetricsRetentionTime: "1m"
    enableGatewayMetrics: true

server:
  replicas: 5
  #affinity: null # for minikube, set null
  connect: true # setup root CA and certificates
  extraConfig: |
    {
      "log_level": "DEBUG",
      "log_file": "/consul/",
      "log_rotate_duration": "24h",
      "log_rotate_max_files": 7
    }

client:
  enabled: false
  affinity: null
  updateStrategy: |
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
  extraConfig: |
    {
      "log_level": "DEBUG"
    }

ui:
  enabled: true
  service:
    enabled: true
    type: LoadBalancer
    port:
      http: 80
      https: 443
  metrics:
    enabled: false
  ingress:
    enabled: false

dns:
  enabled: false

externalServers:
  enabled: false

syncCatalog:
  enabled: false

connectInject:
  enabled: false

controller:
  enabled: false

meshGateway:
  enabled: false

ingressGateways:
  enabled: false

terminatingGateways:
  enabled: false

apiGateway:
  enabled: false

webhookCertManager:
  tolerations: null

prometheus:
  enabled: false

(I know most of the values above are the defaults, but I wanted a YAML file with the full configuration so I could tweak it incrementally)

Steps to reproduce this issue

  1. helm install with 3 replicas and wait for healthy nodes
  2. change the config to 5 replicas and upgrade the helm installation (exact commands sketched below)
  3. consensus is lost and the nodes take a long time (> 5 minutes) to regain it
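Concretely, these are the commands I’m running; the release name and the values.yaml filename are just what I use locally, and the StatefulSet name may differ depending on global.name.

# Assumes the values file above is saved as values.yaml and the HashiCorp repo is added.
helm repo add hashicorp https://helm.releases.hashicorp.com

# Step 1: install with 3 server replicas and wait for the pods to become ready.
helm install consul hashicorp/consul -f values.yaml --set server.replicas=3
kubectl rollout status statefulset/consul-server --timeout=10m

# Step 2: scale the servers to 5 via helm upgrade.
helm upgrade consul hashicorp/consul -f values.yaml --set server.replicas=5

# Step 3: watch the peer set; this is where consensus is lost for several minutes.
kubectl exec consul-server-0 -- consul operator raft list-peers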

Current understanding and Expected behavior

  1. When adding nodes, consensus should not be lost
  2. When changing node configuration, pod replacement should be done carefully in order to keep consensus and avoid re-elections.

Environment details

I have tested this setup both in minikube and in AWS EKS, with the same outcome in both.