Open jails opened 7 years ago
This kind of "silent" error is a problem because domain certificates won't be renewed if it happens in the middle of the certificate validity period (and 90 days is a long enough period). @munnerz, would it be difficult to make /healthz also check the Kubernetes API connection? That way kube-lego would be restarted automatically by Kubernetes when this kind of error occurs.
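For illustration, here is a rough sketch of what a /healthz that also checks the API connection could look like. This is not kube-lego's actual code: `apiProbe`, `healthStatus`, `healthzHandler` and `errAPIUnreachable` are hypothetical names, and the probe is left as a plain function so that whatever Kubernetes client the project uses can be plugged in. It also assumes the deployment's liveness probe points at /healthz, so that a non-2xx response eventually makes Kubernetes restart the container.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// apiProbe is a hypothetical hook: it should perform one cheap request
// against the Kubernetes API and return an error if that request fails.
// The probe is expected to honor cancellation of ctx.
type apiProbe func(ctx context.Context) error

// errAPIUnreachable stands in for a real client error in this sketch.
var errAPIUnreachable = errors.New("kubernetes API unreachable")

// healthStatus maps a probe result to the HTTP status reported by /healthz.
func healthStatus(probeErr error) int {
	if probeErr != nil {
		return http.StatusServiceUnavailable
	}
	return http.StatusOK
}

// healthzHandler runs the probe on every request, bounded by timeout,
// and answers 200 when the API is reachable and 503 otherwise.
func healthzHandler(probe apiProbe, timeout time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), timeout)
		defer cancel()

		err := probe(ctx)
		status := healthStatus(err)
		if err != nil {
			http.Error(w, err.Error(), status)
			return
		}
		w.WriteHeader(status)
		fmt.Fprintln(w, "ok")
	})
}

func main() {
	// Simulate a lost API connection; a real probe would call the API client.
	broken := func(ctx context.Context) error { return errAPIUnreachable }

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/healthz", nil)
	healthzHandler(broken, 2*time.Second).ServeHTTP(rec, req)

	fmt.Println(rec.Code) // prints 503
}
```

Running the probe on every request keeps the sketch simple. If the API call turned out to be too expensive for frequent liveness checks, caching the last result for a few seconds would be one option.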
We just ran into this issue as well
Hi, today I ran kube-lego for the first time and everything worked fine, but after a couple of hours or so I got the following logs from the kube-lego pod (see the error at the end):
So it looks like kube-lego is losing its connection to the Kubernetes API. However, the connection URL was OK:
Then I restarted the kube-lego deployment (i.e. kubectl delete & kubectl apply) and everything got back to normal again.
Before I saw this error, I noticed that the Kubernetes cluster autoscaled up and down and became unavailable for a minute or so (I saw the spinner in front of the cluster name in the Google Cloud admin UI). However, I noticed no downtime. Maybe the TLS certificates of the Kubernetes apiserver were updated at some point (cluster update) and kube-lego was trying to connect to the Kubernetes API using outdated certificates?