hashicorp / consul-k8s

First-class support for Consul Service Mesh on Kubernetes
https://www.consul.io/docs/k8s
Mozilla Public License 2.0
667 stars 316 forks

Consul dataplane container args #3201

Open andriktr opened 10 months ago

andriktr commented 10 months ago

Hi. Since Consul version 1.14.x there is no longer an agent in Consul on Kubernetes. In our case we run the Consul servers outside the k8s cluster. I noticed that the dataplane container (the application sidecar) has the following args:

...
  - args:
    - -addresses
    - 10.162.34.52
    - -grpc-port=8503
    - -proxy-service-id-path=/consul/connect-inject/proxyid
    - -log-level=info
    - -log-json=false
    - -envoy-concurrency=2
    - -credential-type=login
    - -login-auth-method=consul-experimental-k8s-auth-method
    - -login-bearer-token-path=/var/run/secrets/kubernetes.io/serviceaccount/token
    - -ca-certs=/consul/connect-inject/consul-ca.pem
    - -graceful-port=20600
    - -shutdown-drain-listeners
    - -shutdown-grace-period-seconds=30
    - -graceful-shutdown-path=/graceful_shutdown
    - -telemetry-prom-scrape-path=/metrics

As you can see, the addresses arg has only one IP address. I assume that it simply takes the first IP from the Helm values:

...
externalServers:
  # If true, the Helm chart will be configured to talk to the external servers.
  # If setting this to true, you must also set `server.enabled` to false.
  enabled: true

  # An array of external Consul server hosts that are used to make
  # HTTPS connections from the components in this Helm chart.
  # Valid values include an IP, a DNS name, or an [exec=](https://github.com/hashicorp/go-netaddrs) string.
  # The port must be provided separately below.
  # Note: This slice can only contain a single element.
  # Note: If enabling clients, `client.join` must also be set to the hosts that should be
  # used to join the cluster. In most cases, the `client.join` values
  # should be the same, however, they may be different if you
  # wish to use separate hosts for the HTTPS connections.
  # @type: array<string>
  hosts: ["10.162.34.52","10.162.34.53","10.162.34.54"]

My question is: what will happen if the cluster node defined in the dataplane arg fails? How will this be handled, and what impact will it have on the meshed application?

Would it be better to put the cluster nodes behind a load balancer and use its IP in the external servers hosts?

Thank you.

komapa commented 9 months ago

This is kind of "subtle" but the answer is in the comment above

Note: This slice can only contain a single element.

So you should have this be a hostname or some anycast IP, depending on your setup.

andriktr commented 9 months ago

@komapa I didn't quite catch your answer, to be honest... It has type array, which means you can put several IPs or hostnames in it.

komapa commented 9 months ago

@komapa I didn't quite catch your answer, to be honest... It has type array, which means you can put several IPs or hostnames in it.

It is an array (slice) but you can only put one element in it as you saw in practice.

andriktr commented 9 months ago

Actually, you can put an array of hostnames or IPs here, but it will pick up only the first one. In any case, the consul-dataplane container discovers the remaining servers in the Consul cluster and starts to watch them as well:

2023-12-16T20:13:17.824Z [INFO]  consul-dataplane.server-connection-manager: connected to Consul server: address=10.162.34.52:8503
2023-12-16T20:13:17.824Z [INFO]  consul-dataplane: connected to Consul server over gRPC: initial_server_address=10.162.34.52:8503
2023-12-16T20:13:17.824Z [INFO]  consul-dataplane: starting envoy xDS server: address=127.0.0.1:44969
2023-12-16T20:13:17.827Z [INFO]  consul-dataplane: dns proxy disabled: configure the Consul DNS port to enable
2023-12-16T20:13:17.828Z [INFO]  consul-dataplane.server-connection-manager: updated known Consul servers from watch stream: addresses=[10.162.34.53:8503, 10.162.34.54:8503, 10.162.34.52:8503]

The best solution here is to put the Consul cluster servers behind a load balancer and use the LB address. Since an LB usually has a health probe, requests will always be passed to a healthy node.
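
A minimal sketch of the Helm values for that setup, assuming the Consul servers sit behind a load balancer reachable at the hypothetical name `consul-lb.example.internal` and that gRPC is served on 8503, as in the dataplane args above. The hostname is a placeholder, and the `grpcPort` key is an assumption that should be checked against the chart version in use:

```yaml
server:
  # Must be false when externalServers.enabled is true (per the chart comment quoted above).
  enabled: false

externalServers:
  enabled: true
  # Single element only: a load balancer (or DNS name) fronting all Consul servers.
  # "consul-lb.example.internal" is a hypothetical placeholder, not a real address.
  hosts: ["consul-lb.example.internal"]
  # Assumed to correspond to -grpc-port=8503 in the dataplane args; verify for your chart version.
  grpcPort: 8503
```

Because the dataplane is started with `-ca-certs`, the server certificate presumably has to be valid for whatever name or IP is placed in `hosts`, so that is worth checking before switching to a load balancer address.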