Consul fails to start - Githubissues

mmisztal1980 commented 6 years ago

In my cluster I have rook running. When provisioning the consul cluster, my helm values file looks like this:

global:
  enabled: true
  domain: consul
  image: "consul:1.2.3"
  datacenter: dc1

server:
  enabled: "-"
  image: null
  replicas: 3
  bootstrapExpect: 3
  storage: 10Gi
  storageClass: rook-ceph-block
  connect: true
  resources: {}
  updatePartition: 0
  disruptionBudget:
    enabled: true
    maxUnavailable: null
  extraConfig: |
    {}
  extraVolumes: []
client:
  enabled: "-"
  image: null
  join: null
  resources: {}
  extraConfig: |
    {}
  extraVolumes: []
dns:
  enabled: "-"

ui:
  enabled: "-"
  service:
    enabled: true
    type: null

connectInject:
  enabled: false # "-" disable this by default for now until the image is public
  image: "TODO"
  default: false # true will inject by default, otherwise requires annotation
  caBundle: "" # empty will auto generate the bundle
  namespaceSelector: null

  certs:
    secretName: null
    caBundle: ""
    certName: tls.crt
    keyName: tls.key

In order to start the chart, I use the following cmdline:

helm install -f ./helm/values.digitalocean.yaml --name consul --namespace service-discovery ./helm
NAME:   consul
LAST DEPLOYED: Wed Sep 26 07:46:32 2018
NAMESPACE: service-discovery
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/PodDisruptionBudget
NAME           MIN AVAILABLE  MAX UNAVAILABLE  ALLOWED DISRUPTIONS  AGE
consul-server  N/A            0                0                    1s

==> v1/Pod(related)
NAME             READY  STATUS             RESTARTS  AGE
consul-56lvs     0/1    ContainerCreating  0         1s
consul-jttwp     0/1    ContainerCreating  0         1s
consul-qpgdn     0/1    ContainerCreating  0         1s
consul-server-0  0/1    Pending            0         1s
consul-server-1  0/1    Pending            0         1s
consul-server-2  0/1    Pending            0         1s

==> v1/ConfigMap
NAME                  DATA  AGE
consul-client-config  1     1s
consul-server-config  1     1s

==> v1/Service
NAME           TYPE       CLUSTER-IP    EXTERNAL-IP  PORT(S)                                                                  AGE
consul-dns     ClusterIP  10.3.169.187  <none>       53/TCP,53/UDP                                                            1s
consul-server  ClusterIP  None          <none>       8500/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP,8600/UDP  1s
consul-ui      ClusterIP  10.3.20.200   <none>       80/TCP                                                                   1s

==> v1/DaemonSet
NAME    DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
consul  3        3        0      3           0          <none>         1s

==> v1/StatefulSet
NAME           DESIRED  CURRENT  AGE
consul-server  3        3        1s

A quick verification of pvc indicate that they have bound successfully, please note that I've observed that it takes up to 20s to bind the pvc(s) running under rook

consul-qpgdn     0/1    ContainerCreating  0         1s
consul-server-0  0/1    Pending            0         1s
consul-server-1  0/1    Pending            0         1s
consul-server-2  0/1    Pending            0         1s

==> v1/ConfigMap
NAME                  DATA  AGE
consul-client-config  1     1s
consul-server-config  1     1s

==> v1/Service
NAME           TYPE       CLUSTER-IP    EXTERNAL-IP  PORT(S)                                                                  AGE
consul-dns     ClusterIP  10.3.169.187  <none>       53/TCP,53/UDP                                                            1s
consul-server  ClusterIP  None          <none>       8500/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP,8600/UDP  1s
consul-ui      ClusterIP  10.3.20.200   <none>       80/TCP                                                                   1s

==> v1/DaemonSet
NAME    DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
consul  3        3        0      3           0          <none>         1s

==> v1/StatefulSet
NAME           DESIRED  CURRENT  AGE
consul-server  3        3        1s

 mmisztal@tsunami  ~/Projects/@cloud-technologies/ops-k8s-services-inf/src/consul   master ●  kubectl get pvc --all-namespaces
NAMESPACE           NAME                   STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
service-discovery   data-consul-server-0   Bound     pvc-85de967e-c14f-11e8-8c17-c6926976ea12   10Gi       RWO            rook-ceph-block   13s
service-discovery   data-consul-server-1   Bound     pvc-85e4d3d0-c14f-11e8-8c17-c6926976ea12   10Gi       RWO            rook-ceph-block   13s
service-discovery   data-consul-server-2   Bound     pvc-85ea1341-c14f-11e8-8c17-c6926976ea12   10Gi       RWO            rook-ceph-block   13s

However the consul-server pods have failed to start:

 kubectl -n service-discovery get pods
NAME              READY     STATUS              RESTARTS   AGE
consul-56lvs      0/1       Running             0          36s
consul-jttwp      0/1       Running             0          36s
consul-qpgdn      0/1       Running             0          36s
consul-server-0   0/1       ContainerCreating   0          36s
consul-server-1   0/1       ContainerCreating   0          36s
consul-server-2   0/1       Running             0          36s

An examination of the pod indicates that the readiness probe has failed:

kubectl -n service-discovery describe pod consul-server-2
Name:           consul-server-2
Namespace:      service-discovery
Node:           k8s-node-1.cloud-technologies.net/142.93.131.205
Start Time:     Wed, 26 Sep 2018 07:47:42 +0200
Labels:         app=consul
                chart=consul-0.1.0
                component=server
                controller-revision-hash=consul-server-66479c5df5
                hasDNS=true
                release=consul
                statefulset.kubernetes.io/pod-name=consul-server-2
Annotations:    consul.hashicorp.com/connect-inject=false
Status:         Running
IP:             10.2.1.13
Controlled By:  StatefulSet/consul-server
Containers:
  consul:
    Container ID:  docker://09641e9f57faf0304b9e74818ee99f2a7dd23a4f1bc44fa6e426ad4be2d72578
    Image:         consul:1.2.3
    Image ID:      docker-pullable://consul@sha256:ea66d17d8c8c1f1afb2138528d62a917093fcd2e3b3a7b216a52c253189ea980
    Ports:         8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
    Command:
      /bin/sh
      -ec
      CONSUL_FULLNAME="consul"

exec /bin/consul agent \
  -advertise="${POD_IP}" \
  -bind=0.0.0.0 \
  -bootstrap-expect=3 \
  -client=0.0.0.0 \
  -config-dir=/consul/config \
  -datacenter=dc1 \
  -data-dir=/consul/data \
  -domain=consul \
  -hcl="connect { enabled = true }" \
  -ui \
  -retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
  -retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
  -retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
  -server

    State:          Running
      Started:      Wed, 26 Sep 2018 07:48:17 +0200
    Ready:          True
    Restart Count:  0
    Readiness:      exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
grep -E '".+"'
] delay=5s timeout=5s period=3s #success=1 #failure=2
    Environment:
      POD_IP:      (v1:status.podIP)
      NAMESPACE:  service-discovery (v1:metadata.namespace)
    Mounts:
      /consul/config from config (rw)
      /consul/data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-js48j (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          True 
  PodScheduled   True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-consul-server-2
    ReadOnly:   false
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      consul-server-config
    Optional:  false
  default-token-js48j:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-js48j
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age               From                                        Message
  ----     ------                 ----              ----                                        -------
  Normal   Scheduled              10m               default-scheduler                           Successfully assigned consul-server-2 to k8s-node-1.cloud-technologies.net
  Normal   SuccessfulMountVolume  10m               kubelet, k8s-node-1.cloud-technologies.net  MountVolume.SetUp succeeded for volume "config"
  Normal   SuccessfulMountVolume  10m               kubelet, k8s-node-1.cloud-technologies.net  MountVolume.SetUp succeeded for volume "default-token-js48j"
  Warning  FailedMount            10m               kubelet, k8s-node-1.cloud-technologies.net  MountVolume.SetUp failed for volume "pvc-85ea1341-c14f-11e8-8c17-c6926976ea12" : invalid character '-' after top-level value
  Normal   SuccessfulMountVolume  10m               kubelet, k8s-node-1.cloud-technologies.net  MountVolume.SetUp succeeded for volume "pvc-85ea1341-c14f-11e8-8c17-c6926976ea12"
  Normal   Pulled                 10m               kubelet, k8s-node-1.cloud-technologies.net  Container image "consul:1.2.3" already present on machine
  Normal   Created                10m               kubelet, k8s-node-1.cloud-technologies.net  Created container
  Normal   Started                9m                kubelet, k8s-node-1.cloud-technologies.net  Started container
  Warning  Unhealthy              6m (x12 over 9m)  kubelet, k8s-node-1.cloud-technologies.net  Readiness probe failed:
 mmisztal@tsunami  ~/Projects/@cloud-technologies/ops-k8s-services-inf/src/consul   master ● 

An examination of the server logs indicates that it has failed to form the cluster:

 kubectl -n service-discovery logs consul-server-2
==> Starting Consul agent...
bootstrap_expect > 0: expecting 3 servers
==> Consul agent running!
           Version: 'v1.2.3'
           Node ID: '0a5797a9-fee3-d61c-61d8-01b500a9e3c8'
         Node name: 'consul-server-2'
        Datacenter: 'dc1' (Segment: '<all>')
            Server: true (Bootstrap: false)
       Client Addr: [0.0.0.0] (HTTP: 8500, HTTPS: -1, DNS: 8600)
      Cluster Addr: 10.2.1.13 (LAN: 8301, WAN: 8302)
           Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false

==> Log data will now stream in as it occurs:

    2018/09/26 05:48:17 [INFO] raft: Initial configuration (index=0): []
    2018/09/26 05:48:17 [INFO] raft: Node at 10.2.1.13:8300 [Follower] entering Follower state (Leader: "")
    2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-server-2.dc1 10.2.1.13
    2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-server-2 10.2.1.13
    2018/09/26 05:48:17 [INFO] agent: Started DNS server 0.0.0.0:8600 (udp)
    2018/09/26 05:48:17 [INFO] consul: Adding LAN server consul-server-2 (Addr: tcp/10.2.1.13:8300) (DC: dc1)
    2018/09/26 05:48:17 [INFO] consul: Handled member-join event for server "consul-server-2.dc1" in area "wan"
    2018/09/26 05:48:17 [WARN] agent/proxy: running as root, will not start managed proxies
    2018/09/26 05:48:17 [INFO] agent: Started DNS server 0.0.0.0:8600 (tcp)
    2018/09/26 05:48:17 [INFO] agent: Started HTTP server on [::]:8500 (tcp)
    2018/09/26 05:48:17 [INFO] agent: started state syncer
    2018/09/26 05:48:17 [INFO] agent: Retry join LAN is supported for: aliyun aws azure digitalocean gce k8s os packet scaleway softlayer triton vsphere
    2018/09/26 05:48:17 [INFO] agent: Joining LAN cluster...
    2018/09/26 05:48:17 [INFO] agent: (LAN) joining: [consul-server-0.consul-server.service-discovery.svc consul-server-1.consul-server.service-discovery.svc consul-server-2.consul-server.service-discovery.svc]
    2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-56lvs 10.2.3.10
    2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-jttwp 10.2.1.11
    2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-server-1 10.2.3.11
    2018/09/26 05:48:17 [INFO] consul: Adding LAN server consul-server-1 (Addr: tcp/10.2.3.11:8300) (DC: dc1)
    2018/09/26 05:48:17 [WARN] memberlist: Refuting a suspect message (from: consul-server-2.dc1)
    2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-server-1.dc1 10.2.3.11
    2018/09/26 05:48:17 [INFO] consul: Handled member-join event for server "consul-server-1.dc1" in area "wan"
    2018/09/26 05:48:17 [WARN] memberlist: Failed to resolve consul-server-2.consul-server.service-discovery.svc: lookup consul-server-2.consul-server.service-discovery.svc on 10.3.0.10:53: no such host
    2018/09/26 05:48:17 [INFO] agent: (LAN) joined: 1 Err: <nil>
    2018/09/26 05:48:17 [INFO] agent: Join LAN completed. Synced with 1 initial agents
    2018/09/26 05:48:24 [WARN] raft: no known peers, aborting election
    2018/09/26 05:48:25 [ERR] agent: failed to sync remote state: No cluster leader

Any hints what may be wrong? I've noticed that the probe's initialDelaySeconds default value is 5, so I'm guessing it may have failed before the pvcs have been bound? Perhaps it'd make sense to have this value configurable?

mitchellh commented 6 years ago

Does it never stabilize? It looks like its starting to work, just hasn't joined all the servers yet.

Also, PVCs might be a real issue for sure. I didn't realize actually that StatefulSets with PVCs can be started before the PVC is available (spoiled in the environment we run in I guess). Do you know what that looks like? Is the directory just not available yet? That might be something we have to build into an init container or something (to wait for it to be ready).

mmisztal1980 commented 6 years ago

@mitchellh no, it never does, funnily it used to work with the previous release of rook (v0.7).

In my understanding the StatefulSet starts and attempts to bind the PVCs, until that is done, the pod should report an unbound pvc issue - should be easily accessbile via kubectl describe or kubectl logs

mitchellh commented 6 years ago

But during that time, the containers are started?

Sorry, easiest way to figure this out would be if you did more digging or I can get a reproduction. For the latter, is there an easy way for me to get a similar environment up and running?

mmisztal1980 commented 6 years ago

They appear to be - the logs are there.

I'm happy to do more digging, however in order for you to get a repro, I'd have to provide you with terraform files, helm value files & k8s manifests to get a copy of my env going OR I'll simply share a kubeconfig file so that you can poke around and leave it running for the night

mmisztal1980 commented 6 years ago

@mitchellh I've cloned the repo and made some alterations: values.yaml server.storageclass: rook-ceph-block

server-statefulset.yaml readinessProbe.initialDelaySeconds: 60

Results

All consul-* pods were up and running v. fast, however they have not been ready for about 50s
All consul-server-* pods were not up and running immediately, they were stuck in ContainerCreating state for approx 50s when the 1st pod reported Running. None of the pods were Ready
Once all 3 consul-server pods were up and running the initialDelaySeconds period has passed and they started to pass the readiness test

helm status consul
LAST DEPLOYED: Wed Sep 26 23:34:25 2018
NAMESPACE: service-discovery
STATUS: DEPLOYED

RESOURCES:
==> v1/Pod(related)
NAME             READY  STATUS   RESTARTS  AGE
consul-kktl5     1/1    Running  0         2m
consul-tbjmt     1/1    Running  0         2m
consul-x5cqq     1/1    Running  0         2m
consul-server-0  1/1    Running  0         2m
consul-server-1  1/1    Running  0         2m
consul-server-2  1/1    Running  0         2m

==> v1/ConfigMap
NAME                  DATA  AGE
consul-client-config  1     2m
consul-server-config  1     2m

==> v1/Service
NAME           TYPE       CLUSTER-IP   EXTERNAL-IP  PORT(S)                                                                  AGE
consul-dns     ClusterIP  10.3.228.30  <none>       53/TCP,53/UDP                                                            2m
consul-server  ClusterIP  None         <none>       8500/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP,8600/UDP  2m
consul-ui      ClusterIP  10.3.98.102  <none>       80/TCP                                                                   2m

==> v1/DaemonSet
NAME    DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
consul  3        3        3      3           3          <none>         2m

==> v1/StatefulSet
NAME           DESIRED  CURRENT  AGE
consul-server  3        3        2m

==> v1beta1/PodDisruptionBudget
NAME           MIN AVAILABLE  MAX UNAVAILABLE  ALLOWED DISRUPTIONS  AGE
consul-server  N/A            0                0                    2m

Compare to previous results

If you compare the previous results, the 0/1(s) mean that the pod is running, however the readiness test is failed.

NAME              READY     STATUS              RESTARTS   AGE
consul-56lvs      0/1       Running             0          36s
consul-jttwp      0/1       Running             0          36s
consul-qpgdn      0/1       Running             0          36s
consul-server-0   0/1       ContainerCreating   0          36s
consul-server-1   0/1       ContainerCreating   0          36s
consul-server-2   0/1       Running             0          36s

Hypothesis (FailureTreshold?)

This is what the docs say:

failureThreshold: When a Pod starts and the probe fails, Kubernetes will try failureThreshold times 
before giving up. Giving up in case of liveness probe means restarting the Pod. In case of readiness
probe the Pod will be marked Unready. Defaults to 3. Minimum value is 1.

The server-statefulset.yaml failure treshold is: failureThreshold: 2 and the initial delay is: initialDelaySeconds: 5

Meaning that kubernetes started to fail the checks and gave up before the pvc(s) were bound to the pods?

jmreicha commented 6 years ago

@mmisztal1980 I made the adjustments that you mentioned, but am still seeing pod failing health checks and servers sitting in a pending state.

consul             consul-dzjnx                              0/1     Running     0          4m
consul             consul-g8lmf                              0/1     Running     0          4m
consul             consul-kx8l6                              0/1     Running     0          4m
consul             consul-server-0                           0/1     Pending     0          4m
consul             consul-server-1                           0/1     Pending     0          4m
consul             consul-server-2                           0/1     Pending     0          4m

Consul logs seem to indicate that the stateful set isn't binding the PVC. I am also using Rook/Ceph for storage.

pod description

``` Name: consul-server-0 Namespace: consul Priority: 0 PriorityClassName: Node: Labels: app=consul chart=consul-0.1.0 component=server controller-revision-hash=consul-server-5cf54754b hasDNS=true release=consul statefulset.kubernetes.io/pod-name=consul-server-0 Annotations: consul.hashicorp.com/connect-inject: false Status: Pending IP: Controlled By: StatefulSet/consul-server Containers: consul: Image: consul:1.2.3 Ports: 8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/UDP Command: /bin/sh -ec CONSUL_FULLNAME="consul" exec /bin/consul agent \ -advertise="${POD_IP}" \ -bind=0.0.0.0 \ -bootstrap-expect=3 \ -client=0.0.0.0 \ -config-dir=/consul/config \ -datacenter=dc1 \ -data-dir=/consul/data \ -domain=consul \ -hcl="connect { enabled = true }" \ -ui \ -retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \ -retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \ -retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \ -server Readiness: exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \ grep -E '".+"' ] delay=60s timeout=5s period=3s #success=1 #failure=2 Environment: POD_IP: (v1:status.podIP) NAMESPACE: consul (v1:metadata.namespace) Mounts: /consul/config from config (rw) /consul/data from data (rw) /var/run/secrets/kubernetes.io/serviceaccount from default-token-lv4v8 (ro) Conditions: Type Status PodScheduled False Volumes: data: Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace) ClaimName: data-consul-server-0 ReadOnly: false config: Type: ConfigMap (a volume populated by a ConfigMap) Name: consul-server-config Optional: false default-token-lv4v8: Type: Secret (a volume populated by a Secret) SecretName: default-token-lv4v8 Optional: false QoS Class: BestEffort Node-Selectors: Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s node.kubernetes.io/unreachable:NoExecute for 300s Events: Type Reason Age From Message ---- ------ ---- ---- ------- Warning FailedScheduling 11s (x245 over 10m) default-scheduler pod has unbound PersistentVolumeClaims (repeated 3 times) ```

I checked the claims (kubectl get pvc) but don't see any claims made my Consul showing up in there.

jmreicha commented 6 years ago

Ugh it looks like I was on an outdated version of the repo. After updating the repo and reinstalling it is working now.

mmisztal1980 commented 6 years ago

Added a PR where the probe settings are configurable via the chart values. That should help in our case.

mmisztal1980 commented 6 years ago

@mitchellh would the above PR be satisfactory? Tweaking the probe settings seems to have fixed the issue for myself and @jmreicha

adilyse commented 5 years ago

As mentioned in a comment on the PR, these configuration options won't be added to the values.yaml at this time. If you need further customization on the Helm chart, read more about options here.

hashicorp / consul-helm

Consul fails to start #10

Results

Compare to previous results

Hypothesis (FailureTreshold?)