hashicorp / consul-helm

Helm chart to install Consul and other associated components.

Run Consul Client on Kubernetes (EKS) only in client mode, Joining an Existing Consul Cluster on AWS #90

Closed: moris1amar closed this issue 4 years ago

moris1amar commented 5 years ago

Hi, I want to deploy a Consul agent at the node level on my Kubernetes (EKS) cluster, so that all pods on a node can use this agent to join an external Consul server running on AWS, including DNS and the service. So I have overridden the values.yaml file to enable only the client, like this:

global:
  enabled: false
  domain: consul
  image: "consul:1.4.0"
  datacenter: dc1

server:
  enabled: "-"
  image: null
  replicas: 3
  bootstrapExpect: 3 # Should <= replicas count
  storage: 10Gi
  connect: true
  resources: {}
  updatePartition: 0
  disruptionBudget:
    enabled: true
    maxUnavailable: null
  extraConfig: |
    {}
  extraVolumes: []

client:
  enabled: "true"
  grpc: true
  join:
    - "provider=aws tag_key=Name tag_value=ConsulMaster region=us-east-1"

  resources: {}
  extraConfig: |
    {}
  extraVolumes: []

dns:
  enabled: "true"

ui:
  enabled: "true"

  service:
    enabled: false
    type: null

connectInject:
  enabled: false # "-" disable this by default for now until the image is public
  image: "TODO"
  default: false # true will inject by default, otherwise requires annotation
  caBundle: "" # empty will auto generate the bundle
  namespaceSelector: null

  certs:
    secretName: null
    caBundle: ""
    certName: tls.crt
    keyName: tls.key

and this is the result:

**helm install -f values.yaml --name consul6 ./consul-helm --namespace default**
NAME:   consul6
LAST DEPLOYED: Thu Dec 27 11:39:34 2018
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME                   DATA  AGE
consul6-client-config  1     2s

==> v1/Service
NAME         TYPE       CLUSTER-IP    EXTERNAL-IP  PORT(S)        AGE
consul6-dns  ClusterIP  10.100.25.33  <none>       53/TCP,53/UDP  2s

==> v1/DaemonSet
NAME     DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
consul6  0        0        0      0           0          <none>         2s

**kubectl -n default get pods -o wide**
NAME                          READY     STATUS      RESTARTS   AGE       IP              NODE                          NOMINATED NODE
consul-example                0/1       Completed   0          21h       172.31.3.88     ip-172-31-0-49.ec2.internal   <none>
hello-node-546b66f89f-2zmfs   1/1       Running     0          1d        172.31.12.218   ip-172-31-0-49.ec2.internal   <none>
hello-node-546b66f89f-5kqxd   1/1       Running     0          1d        172.31.8.62     ip-172-31-0-49.ec2.internal   <none>
hello-node-546b66f89f-8sb64   1/1       Running     0          1d        172.31.2.239    ip-172-31-0-49.ec2.internal   <none>
hello-node-546b66f89f-k2snh   1/1       Running     0          1d        172.31.6.233    ip-172-31-0-49.ec2.internal   <none>

**kubectl -n default describe pod hello-node-546b66f89f-2zmfs**
Name:               hello-node-546b66f89f-2zmfs
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               ip-172-31-0-49.ec2.internal/172.31.0.49
Start Time:         Wed, 26 Dec 2018 11:21:20 +0200
Labels:             app=hello-node
                    pod-template-hash=1026229459
Annotations:        <none>
Status:             Running
IP:                 172.31.12.218
Controlled By:      ReplicaSet/hello-node-546b66f89f
Containers:
  hello-node:
    Container ID:   docker://94190766a9e8ed593324c8fc6c771a2d63ecba5a4653963100960c5eb87e9619
    Image:          gcr.io/hello-minikube-zero-install/hello-node
    Image ID:       docker-pullable://gcr.io/hello-minikube-zero-install/hello-node@sha256:9cf82733f7278ae7ae899d432f8d3b3bb0fcb54e673c67496a9f76bb58f30a1c
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 26 Dec 2018 11:21:21 +0200
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vwrkt (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-vwrkt:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-vwrkt
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

I note that the pods are not recognized by my external Consul master.
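
For reference, this is roughly how I am checking on both sides (I'm assuming the chart labels its client pods with app=consul and release=consul6, and that the consul binary is on the server VM's PATH):

# On the EKS side: is the client DaemonSet scheduling any pods?
kubectl -n default get daemonset consul6
kubectl -n default get pods -l app=consul,release=consul6 -o wide

# On one of the external Consul servers: have any new nodes joined?
consul members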

What could be the reason?

Thanks, Maurice

JoshuaEdwards1991 commented 5 years ago

Did you figure this one out @moris1amar ?

stvdilln commented 4 years ago

I am using the consul-helm chart to apply ONLY the client to a k8s cluster. The servers are running on standalone VMs; I'm just trying to add one Consul client per k8s node. I cannot get bi-directional traffic flowing: I can get the new Consul clients to register and join the cluster, but the cluster cannot talk back to the new clients.

My existing Consul cluster is 3 VMs at 10.0.0.4, 10.0.0.5, and 10.0.0.6. My Kubernetes nodes are 10.1.48.4, 10.1.48.5, and 10.1.48.6. The Kubernetes pod addresses are 10.254.2.*.

The new k8s Consul client pod can connect 10.254.2.* --> 10.0.0.4 (an existing server VM) and register. Shortly afterward the logs are flooded with "Refuting a suspect message" errors and the new client starts flapping between offline and online. The Consul servers are trying to contact the new Consul clients and cannot make a connection.

The Consul server cluster at 10.0.0.4 cannot talk to the pod address range (in Kubernetes) of 10.254.2.*; there would need to be an ingress rule for that to happen, as pod IPs in my situation are not directly addressable on the network.

The Helm chart always sets -advertise to the pod address, which in my case is not routable from outside the cluster. The new Consul clients are advertising a pod IP in 10.254.2.*, not a 'real' IP address. I don't see any support for changing the -advertise parameter, nor do I see signs of the chart being able to create ingress rules for me.

Am I barking up the wrong tree with the consul-helm chart in trying to have agents talk to external Consul servers?

It would appear that adding a hostPort and changing -advertise from the pod address to the host address would fix the problem, and I might try it on my own copy of the chart.
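
Roughly the edit I have in mind for the client container in templates/client-daemonset.yaml (a trimmed sketch, not the chart's actual template; the retry-join addresses are just my server VMs):

# Trimmed sketch of the client container spec: advertise the node IP
# instead of the pod IP so the external servers (10.0.0.4-6 here) have
# a routable address to gossip back to.
containers:
  - name: consul
    env:
      - name: HOST_IP
        valueFrom:
          fieldRef:
            fieldPath: status.hostIP   # node IP; the chart currently advertises the pod IP
    command:
      - "/bin/sh"
      - "-ec"
      - |
        exec /bin/consul agent \
          -advertise="${HOST_IP}" \
          -retry-join="10.0.0.4" -retry-join="10.0.0.5" -retry-join="10.0.0.6" \
          -data-dir=/consul/data \
          -config-dir=/consul/config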

lkysow commented 4 years ago

Hi everyone, using external servers is not well supported in the Helm chart right now. You have to make some manual patches to the chart, and I haven't had a chance to test this out, so there may be other issues:

  1. You'll need to ensure that the Consul clients in the DaemonSet register with the node IP rather than the pod IP, and each Kubernetes node needs to be routable from your Consul servers. You have to edit the chart yourself right now because this is not configurable:

    https://github.com/hashicorp/consul-helm/blob/223e90412bc716e712909069d0228b2cd33222d3/templates/client-daemonset.yaml#L105

  2. You'll also need to ensure the client's port 8301 is exposed as a hostPort, because that's the port the servers use to communicate with the clients (see https://www.consul.io/docs/internals/architecture.html#10-000-foot-view).
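
For point 2, the Serf LAN port on the client container would look roughly like this (a sketch, not what the chart currently templates):

# Sketch: expose Serf LAN (8301) on the node itself so the servers
# can gossip back to the client agents over the node network.
ports:
  - name: serflan-tcp
    containerPort: 8301
    hostPort: 8301
    protocol: TCP
  - name: serflan-udp
    containerPort: 8301
    hostPort: 8301
    protocol: UDP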

stvdilln commented 4 years ago

I have it working (for me at least): --advertise set to HOST_IP, and 8301 and 8302 set to hostPort. Nodes are joining and staying connected. I can run the tests and do a PR if you would like.

lkysow commented 4 years ago

Nice! Yeah that would be great.

lkysow commented 4 years ago

I'm going to track this in https://github.com/hashicorp/consul-helm/issues/253 so please use that issue, thanks!