orchestracities / ngsi-timeseries-api

QuantumLeap: a FIWARE Generic Enabler to support the usage of NGSIv2 (and NGSI-LD experimentally) data in time-series databases
https://quantumleap.rtfd.io/
MIT License

Too many restarts in K8s cluster deployment #179

Closed c0c0n3 closed 5 years ago

c0c0n3 commented 5 years ago

We've been experiencing an unusually high number of restarts in our K8s cluster. For example, in the last 3 days K8s restarted QL 103 times in one of the two pods and 99 times in the other.

chicco785 commented 5 years ago

I think it happens when QL becomes unresponsive and is therefore killed by K8s:

  Warning  Unhealthy  54m (x1061 over 4d21h)    kubelet, ip-172-20-60-68.eu-central-1.compute.internal  Liveness probe failed: Get http://172.20.44.1:8668/v2/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
  Normal   Pulling    54m (x101 over 4d21h)     kubelet, ip-172-20-60-68.eu-central-1.compute.internal  pulling image "smartsdk/quantumleap:rc"
  Normal   Killing    54m (x100 over 4d21h)     kubelet, ip-172-20-60-68.eu-central-1.compute.internal  Killing container with id docker://quantumleap:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Pulled     53m (x101 over 4d21h)     kubelet, ip-172-20-60-68.eu-central-1.compute.internal  Successfully pulled image "smartsdk/quantumleap:rc"
  Normal   Created    53m (x101 over 4d21h)     kubelet, ip-172-20-60-68.eu-central-1.compute.internal  Created container
  Normal   Started    53m (x101 over 4d21h)     kubelet, ip-172-20-60-68.eu-central-1.compute.internal  Started container
  Warning  Unhealthy  53m (x3 over 4d21h)       kubelet, ip-172-20-60-68.eu-central-1.compute.internal  Liveness probe failed: Get http://172.20.44.1:8668/v2/health: dial tcp 172.20.44.1:8668: connect: connection refused
  Warning  Unhealthy  8m50s (x1127 over 4d21h)  kubelet, ip-172-20-60-68.eu-central-1.compute.internal  Readiness probe failed: Get http://172.20.44.1:8668/v2/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
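
The events above show two different failure modes: mostly probe timeouts (`Client.Timeout exceeded`), plus a few `connection refused` errors right after a restart. As a rough illustration only, here is a minimal Python sketch that classifies a single HTTP health check the same way. The function name `probe_health`, its parameters, and the returned labels are hypothetical and are not part of kubelet or QuantumLeap; the 200-399 success range mirrors how Kubernetes HTTP probes treat status codes.

```python
import socket
import urllib.error
import urllib.request


def probe_health(url, timeout_s=1.0, opener=urllib.request.urlopen):
    """Classify one probe attempt as 'ok', 'timeout', 'refused' or 'unhealthy'.

    Hypothetical helper for illustration; `opener` is injectable so the
    function can be exercised without a live endpoint.
    """
    try:
        with opener(url, timeout=timeout_s) as resp:
            # Kubernetes counts any status in [200, 400) as a success.
            return "ok" if 200 <= resp.status < 400 else "unhealthy"
    except urllib.error.HTTPError:
        # urlopen raises HTTPError for 4xx/5xx responses.
        return "unhealthy"
    except urllib.error.URLError as err:
        # Connection-level errors are usually wrapped in URLError.reason.
        if isinstance(err.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        if isinstance(err.reason, ConnectionRefusedError):
            return "refused"
        return "unhealthy"
    except (socket.timeout, TimeoutError):
        # A timeout while reading the response is raised unwrapped.
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
```

Calling `probe_health("http://172.20.44.1:8668/v2/health")` from inside the cluster would be one way to check whether the endpoint answers within the probe timeout; a steady stream of `"timeout"` results would match the behaviour reported in the events.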

chicco785 commented 5 years ago

I believe this was solved by allowing for the yellow state of the Crate cluster.
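
To illustrate what "allowing for yellow state" could mean for a health endpoint, here is a minimal sketch. It assumes the database reports a cluster state of green, yellow, or red, and that the endpoint answers with an HTTP status plus a small status body. The function `health_response`, the `allow_yellow` flag, and the `pass`/`warn`/`fail` labels are hypothetical and do not reproduce QuantumLeap's actual code.

```python
def health_response(cluster_state, allow_yellow=True):
    """Map a cluster state to an (HTTP status, body) pair.

    Hypothetical sketch: with allow_yellow=True a degraded but still
    usable cluster keeps the probe passing, so the container is not
    restarted for a condition that a restart cannot fix.
    """
    state = cluster_state.strip().lower()
    if state == "green":
        return 200, {"status": "pass"}
    if state == "yellow" and allow_yellow:
        # Degraded, but reported as healthy with a warning.
        return 200, {"status": "warn"}
    # Red, unknown, or yellow when yellow is not tolerated.
    return 503, {"status": "fail"}
```

With `allow_yellow=False`, a yellow cluster yields a 503, which a liveness probe would count as a failure and eventually answer with a restart, as in the events quoted earlier in the thread.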