Closed laurieodgers closed 5 years ago
+1 Same issue
I don't think your issues will be resolved by that PR. This is actually a known issue, albeit only as a TODO here: https://github.com/jetstack/cert-manager/blob/master/pkg/controller/acmeorders/sync.go#L136-L137
We've erred on the side of caution here, so we don't over-query the ACME server for this information. We'll need to implement some kind of periodic resync of all pending challenge & order resources.
/remove-kind bug
/kind feature
/priority important-longterm
/area acme
A couple of my HTTPS certs have also just expired, 90 days after creation:
$ kubectl get secret
NAME TYPE DATA AGE
...
something-tls kubernetes.io/tls 3 89d
otherthing-tls kubernetes.io/tls 2 109d
...
niftyapp-tls kubernetes.io/tls 2 110d
soupmachine-tls kubernetes.io/tls 2 90d
...
I solved the issue by deleting the stale secret, which triggered generation of a new certificate:
$ kubectl delete secret soupmachine-tls
secret "soupmachine-tls" deleted
$ # wait a bit... maybe check progress with `kubectl logs -n cert-manager cert-manager-*** -f`
$ kubectl get secret
NAME TYPE DATA AGE
...
something-tls kubernetes.io/tls 3 89d
otherthing-tls kubernetes.io/tls 2 109d
...
niftyapp-tls kubernetes.io/tls 2 110d
soupmachine-tls kubernetes.io/tls 2 3m
...
This is how I managed to fix the "Expired authorization" issue.
I installed cert-manager around 4 months ago.
Then, I upgraded cert-manager around 1 month ago. That didn't go well, because I had not read the upgrade guide. So I reinstalled cert-manager, but it seems I kept the old certs (judging by the age of the secrets shown above).
As I understand @munnerz's reply, what happened here is that my local cert-manager's view of when the cert needs to be renewed is out of sync with Let's Encrypt's view. The "timer" might have been reset when I reinstalled cert-manager.
Please correct me if my theory is wrong, I'd just like to understand the issue.
I also noticed I have almost no orders.certmanager.k8s.io resources, far fewer than certificates.certmanager.k8s.io. That could be relevant as well.
$ kubectl get orders.certmanager.k8s.io --no-headers --all-namespaces | wc -l
3
$ kubectl get certificates.certmanager.k8s.io --no-headers --all-namespaces | wc -l
7
In your particular case, it would be better not to start the Order flow at all until you know the self check will pass (i.e. perform the self check with some dummy values ahead of time).
cert-manager is not currently well geared for this use case and would require some extra work to make it possible.
If you wanted to achieve this today, you might be able to create/inject fake 'Challenge' resources in order to trigger at least the self check stage of the authorization flow, but you may run into some weird issues.
Understood. I've gone ahead and added checks to ensure that DNS has been cut over before creating certificate/ingress objects within k8s. This will also help us save on resource usage.
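For reference, a minimal sketch of such a pre-check, assuming a plain DNS resolution test is sufficient. The helper name, domain, and expected IP below are hypothetical placeholders, not anything cert-manager provides:

```python
import socket

def dns_points_to(domain: str, expected_ip: str) -> bool:
    """Return True if `domain` currently resolves to `expected_ip`.

    Hypothetical pre-check: only create the Certificate/Ingress
    objects once the customer's DNS has been cut over, so the ACME
    HTTP01 self check can succeed and the pending authorization
    doesn't sit around long enough to expire.
    """
    try:
        # Collect every address the domain currently resolves to.
        addrs = {info[4][0] for info in socket.getaddrinfo(domain, None)}
    except socket.gaierror:
        return False  # domain doesn't resolve yet
    return expected_ip in addrs

# Example: gate certificate creation on the cutover (placeholder values).
if dns_points_to("customer1.example.com", "203.0.113.10"):
    print("DNS cut over; safe to create Certificate/Ingress")
else:
    print("DNS not cut over yet; skipping")
```

In practice you might also want to check from multiple resolvers, since a freshly changed record can still be cached upstream.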
I'm satisfied with this outcome, so it's up to you whether to close or keep this ticket open.
Thanks for the help and for the great piece of software!
I got into almost the same situation as described above, but I did read the upgrade guide... I jumped from 0.7 to 0.7.2 hoping it would resolve this error... The one thing I don't clearly understand: if cert-manager was redeployed into a new namespace (cert-manager) and the old certs with secrets were created earlier in the "default" namespace, can this cause the situation described above? Should I "re-order" new certs by deleting the old ones, or can cert-manager handle that automatically? Looks like I need to update to 0.8.0 due to https://github.com/jetstack/cert-manager/pull/1603
PS: I solved this by deleting the old certs and secrets.
➜ ~ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
cert-manager cert-manager-6f5c7f9bff-w94ql 1/1 Running 0 2h
cert-manager cert-manager-cainjector-64c799c8f9-59p69 1/1 Running 0 2h
cert-manager cert-manager-webhook-7646699c48-6kbhv 1/1 Running 0 2h
default cm-acme-http-solver-25h87 1/1 Running 0 2h
default cm-acme-http-solver-9wqqs 1/1 Running 0 2h
default cm-acme-http-solver-hhq86 1/1 Running 0 2h
default cm-acme-http-solver-tpjt7 1/1 Running 0 2h
$ kubectl describe challenge
Status:
Presented: true
Processing: true
Reason: Error accepting challenge: acme: urn:ietf:params:acme:error:malformed: Expired authorization
State: pending
$ kubectl get orders
NAME STATE AGE
XXX pending 10d
XXX pending 22d
XXX pending 22d
XXX pending 22d
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to jetstack.
/close
@retest-bot: Closing this issue.
I see this issue was closed by a bot. Has anyone found a resolution or root cause?
We had to delete the expired challenges.
Background: We have a unique use case in that we use cert-manager to retrieve SSL certs for customer domains, as we offer wholesale services to our customers. We have had at least 50 ACME HTTP01 challenge pods running for ~26 days while we wait for our wholesale customers to change their DNS over from our old portal to the new one.
Describe the bug: While testing the DNS cutover internally, we ran into an authorization issue: ACME HTTP01 challenge pods don't seem to be updated with new authorizations from Let's Encrypt.
Expected behaviour: ACME challenge pods should pick up the correct authorization when Let's Encrypt updates their side, allowing the certificate to be issued.
Steps to reproduce the bug: Run an HTTP01 ACME challenge pod for long enough for the authorization to expire.
Anything else we need to know?: This may well be solved by the fix to my previous bug report #1311 and subsequent PR #1388 but this hasn't been released in an official version yet so I've been unable to test.
The workaround is to delete and recreate the relevant Certificate/Ingress Kubernetes API objects.
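A sketch of how one might spot the Orders that need this treatment. This is a hypothetical helper, not part of cert-manager: it parses the NAME/STATE/AGE columns of `kubectl get orders` output (as shown earlier in this thread) and prints delete commands for Orders stuck in 'pending' past a chosen age. It only handles day-denominated AGE values like `10d`:

```python
def stale_pending_orders(kubectl_output: str, max_age_days: int = 7):
    """Pick Order names stuck in 'pending' longer than max_age_days.

    Hypothetical helper: assumes whitespace-separated NAME, STATE,
    and AGE columns, and skips ages not expressed in days.
    """
    stale = []
    for line in kubectl_output.strip().splitlines()[1:]:  # skip header row
        name, state, age = line.split()
        if state != "pending" or not age.endswith("d"):
            continue
        if int(age[:-1]) > max_age_days:
            stale.append(name)
    return stale

# Sample output shaped like `kubectl get orders` above (placeholder names).
sample = """\
NAME STATE AGE
order-a pending 10d
order-b pending 2d
order-c valid 22d
"""
for name in stale_pending_orders(sample):
    # Deleting a stuck Order lets cert-manager recreate it with a
    # fresh ACME authorization.
    print(f"kubectl delete order {name}")
```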
Logs:
I challenges controller: syncing item 'web/whitelabel-customer1-0'
I whitelabel-customer1-0: Error accepting challenge: acme: urn:ietf:params:acme:error:malformed: Expired authorization
E challenges controller: Re-queuing item "web/whitelabel-customer1-0" due to error processing: acme: urn:ietf:params:acme:error:malformed: Expired authorization
I orders controller: syncing item 'web/whitelabel-customer1'
I challenges controller: syncing item 'web/whitelabel-customer1-0'
I Waiting for all challenges for order "whitelabel-customer1" to enter 'valid' state
I orders controller: Finished processing work item "web/whitelabel-customer1"
(the syncing / Error accepting challenge / Re-queuing cycle above repeats several more times)
I ingress-shim controller: syncing item 'web/whitelabel-a5eqwj3xcxhtrhus5deoxlla36zqszqq'
I Not syncing ingress web/whitelabel-customer1 as it does not contain necessary annotations
I ingress-shim controller: Finished processing work item "web/whitelabel-customer1"
Environment details:
/kind bug