jetstack / kube-lego

DEPRECATED: Automatically request certificates for Kubernetes Ingress resources from Let's Encrypt
Apache License 2.0

Recovering from Error 409: The resource '...' already exists, alreadyExists #95

Open devth opened 7 years ago

devth commented 7 years ago

I was running kube-lego:0.1.1 for several months using the GCE Loadbalancers solution. It had been working well, automatically renewing certs for 3 of my domains as needed, until recently, when the cert for one of my domains stopped working because it had expired.

kube-lego is still updating the secret, but something is wrong with the Ingress. It has events on it:

Events:
  FirstSeen     LastSeen        Count   From                            SubObjectPath   Type            Reason  Message
  ---------     --------        -----   ----                            -------------   --------        ------  -------
  20d           7m              16175   {loadbalancer-controller }                      Warning         GCE     googleapi: Error 409: The resource 'projects/foo/global/sslCertificates/k8s-ssl-1-foo-bar--c2cd235f2196d4d5' already exists, alreadyExists
  10d           13s             14385   {loadbalancer-controller }                      Warning         GCE     googleapi: Error 409: The resource 'projects/foo/global/sslCertificates/k8s-ssl-1-default-qux--c2cd235f2196d4d5' already exists, alreadyExists

It looks like 0.1.2 might have addressed this issue, so I upgraded my kube-lego deployment to 0.1.3. It started up fine, checked for certs, but it didn't need to update the one that wasn't working since the cert in the stored secret is recent.

What's the best way to recover? Can I force kube-lego to refresh a cert?

devth commented 7 years ago

I increased LEGO_MINIMUM_VALIDITY to 80 days to force it to refresh. It successfully got a new certificate and stored it in the correct secret, but the alreadyExists issue remains.
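For anyone wondering why raising the threshold forces a refresh, here is a minimal sketch of the renewal check. It assumes kube-lego renews a cert once its remaining validity drops below LEGO_MINIMUM_VALIDITY and that Let's Encrypt certs are valid for 90 days. The function name and dates are hypothetical and are not taken from kube-lego's code.

```python
from datetime import datetime, timedelta

def needs_renewal(not_after, minimum_validity, now):
    """Return True when the cert's remaining lifetime is below the threshold.

    Hypothetical helper: an assumption about the check, not kube-lego's code.
    """
    return (not_after - now) < minimum_validity

# Hypothetical dates: a cert issued 15 days ago with a 90-day lifetime,
# which leaves 75 days of validity.
now = datetime(2017, 3, 1)
not_after = (now - timedelta(days=15)) + timedelta(days=90)

# With a 30-day threshold the cert is left alone (75 days is not below 30).
renew_at_30_days = needs_renewal(not_after, timedelta(days=30), now)

# With an 80-day threshold the same cert is renewed (75 days is below 80).
renew_at_80_days = needs_renewal(not_after, timedelta(days=80), now)
```

Under these assumptions, an 80-day threshold triggers a renewal for any cert more than 10 days old, which fits the new certificate being issued and stored here.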

simonswine commented 7 years ago

This is a GCE ingress controller bug. Please file it here: https://github.com/kubernetes/ingress

devth commented 7 years ago

Filed https://github.com/kubernetes/ingress/issues/330. I guess I could just delete the SSL cert in gcloud, but I'm trying to figure out a non-destructive way to recover without downtime.

devth commented 7 years ago

@simonswine any thoughts on how to get momentum on the issue filed on kubernetes/ingress, or how to work around it? I can easily recover by deleting the ingress, but if this were production that would incur downtime.

gianrubio commented 7 years ago

@devth this is related to https://github.com/kubernetes/ingress/issues/609