deis / router

Edge router for Deis Workflow
https://deis.com
MIT License

Deis-Router is Dead and not recovering #242

Closed felipejfc closed 8 years ago

felipejfc commented 8 years ago

This is the pod state:

kubectl describe pod deis-router-1536869610-u3hmp --namespace deis
Name:       deis-router-1536869610-u3hmp
Namespace:      deis
Node:       ip-172-21-110-243.ec2.internal/172.21.110.243
Start Time:     Fri, 19 Aug 2016 23:45:46 -0300
Labels:     app=deis-router
            pod-template-hash=1536869610
Status:     Running
IP:
Controllers:    ReplicaSet/deis-router-1536869610
Containers:
  deis-router:
    Container ID:       docker://7acf711c39f38f7b6224ba89f7061947e8e6198e9ebf931e39831e9975a47548
    Image:          quay.io/deis/router:v2.4.0
    Image ID:       docker://sha256:41b16a8a4f6875309e8c58665b87254dd53942e31cda29713aa2be90e32db7a7
    Ports:          8080/TCP, 6443/TCP, 2222/TCP, 9090/TCP
    State:          Terminated
      Reason:       Completed
      Exit Code:        0
      Started:      Wed, 24 Aug 2016 11:02:09 -0300
      Finished:     Thu, 25 Aug 2016 03:34:26 -0300
    Last State:     Terminated
      Reason:       Completed
      Exit Code:        0
      Started:      Mon, 22 Aug 2016 23:11:39 -0300
      Finished:     Wed, 24 Aug 2016 11:01:54 -0300
    Ready:          False
    Restart Count:      3
    Liveness:       http-get http://:9090/healthz delay=1s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:9090/healthz delay=1s timeout=1s period=10s #success=1 #failure=3
    Environment Variables:
      POD_NAMESPACE:    deis (v1:metadata.namespace)
Conditions:
  Type      Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  deis-router-token-mqpzl:
    Type:       Secret (a volume populated by a Secret)
    SecretName: deis-router-token-mqpzl
QoS Tier:       BestEffort
Events:
  FirstSeen     LastSeen        Count   From                        SubobjectPath       Type        Reason      Message
  ---------     --------        -----   ----                        -------------       --------        ------      -------
  2h        51s         86      {kubelet ip-172-21-110-243.ec2.internal}        Warning         FailedMount     Unable to mount volumes for pod "deis-router-1536869610-u3hmp_deis(3239d886-6680-11e6-8255-0a5b676bb4e9)": timeout expired waiting for volumes to attach/mount for pod "deis-router-1536869610-u3hmp"/"deis". list of unattached/unmounted volumes=[deis-router-token-mqpzl]
  2h        51s         86      {kubelet ip-172-21-110-243.ec2.internal}        Warning         FailedSync      Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "deis-router-1536869610-u3hmp"/"deis". list of unattached/unmounted volumes=[deis-router-token-mqpzl]

It was working fine, then it suddenly entered this state, looping on this error, which makes the router unavailable.
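For anyone checking whether a pod is in the same state, a minimal sketch that pulls the unmounted volume names out of the FailedMount events in `kubectl describe pod` output. The function name `unmounted_volumes` is hypothetical, and the pattern assumes the event message format shown in the output above; other Kubernetes versions may word it differently.

```shell
# Hypothetical helper: reads `kubectl describe pod` output on stdin and prints
# the volume list reported by FailedMount events, one entry per distinct list.
# Assumes the message ends with "list of unattached/unmounted volumes=[...]".
unmounted_volumes() {
  sed -n '/FailedMount/s|.*list of unattached/unmounted volumes=\[\(.*\)].*|\1|p' | sort -u
}
```

It might be used as `kubectl describe pod deis-router-1536869610-u3hmp --namespace deis | unmounted_volumes`, which for the events above would print the name of the service account token secret volume.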

Deleting the POD and letting the RC create another one solved the problem.
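The workaround can be sketched as a small shell function. The function name `recreate_router_pod` is hypothetical; the namespace default and the `app=deis-router` label are taken from the describe output above, where the controlling object is listed as a ReplicaSet.

```shell
# Hypothetical helper: delete a stuck router pod and list its replacements.
# $1 = pod name, $2 = namespace (defaults to "deis", as in this report).
recreate_router_pod() {
  pod="$1"
  ns="${2:-deis}"
  # Delete the stuck pod; its controller should schedule a replacement.
  kubectl delete pod "$pod" --namespace "$ns" || return 1
  # List router pods by label to see whether a new one has come up.
  kubectl get pods --namespace "$ns" -l app=deis-router
}
```

For the pod in this report, that would be `recreate_router_pod deis-router-1536869610-u3hmp`. This only clears the symptom; it does not address whatever caused the secret volume mount to time out.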

Deis Workflow version 2.4.1, Kubernetes version 1.3.5, on AWS.

mboersma commented 8 years ago

Odd. The general issue with timeouts mounting volumes that we were aware of was fixed in an earlier k8s version (1.3.4, I think).

"Deleting the POD and letting the RC create another one solved the problem."

This seems like a transient k8s error that hasn't been reproduced in our testing. I'm going to close this, but please re-open it if this recurs and there is something Deis Workflow can fix.