gregtwallace / certwarden

Cert Warden is a centralized ACME Client. It provides an API for certificate consumers to fetch their individual keys and certs with API keys.
https://www.certwarden.com/

random badNonce error #49

Closed vilpalu1 closed 7 months ago

vilpalu1 commented 7 months ago

Hello, I am using Cert Warden with HashiCorp Vault. When I try to issue certs using the DNS challenge, I get: error, orders/fulfilling_do.go:101, order fulfilling worker 0: fulfill auths error: %!w(*acme.Error=&{0 urn:ietf:params:acme:error:badNonce invalid or reused nonce: The client sent an unacceptable anti-replay nonce

Sometimes the request also goes through without errors. I tried re-registering the ACME account. With the Let's Encrypt staging environment, everything works without errors. Can you suggest how to debug this, or how to find the root cause and solve the problem?

gregtwallace commented 7 months ago

What ACME service are you using? Can you provide the directory URL? Cert Warden is designed to retry when the badNonce error is received (https://github.com/gregtwallace/certwarden-backend/blob/cd86c52c38a8d7523540ec21ac7c01e1c8602b9a/pkg/acme/post_signed.go#L160). Let's Encrypt will actually return this error sometimes with a valid nonce and the retry deals with it.

I suspect this is an issue with the provider.

Also, can you provide the debug log surrounding the issue (for the entire order, preferably)? If you don't want to post it you can email it to me.

vilpalu1 commented 7 months ago

The ACME provider is a locally hosted HashiCorp Vault; the directory URL looks like https://LOCAL_DOMAIN:8200/v1/pki_vilpalu/acme/directory. Certbot works without problems; I tested various subdomain and domain names.

Full log from creating a new certificate with a new private key in Cert Warden (newest entries first):

4/24/2024, 11:37:39 AM, info, orders/fulfilling_do.go:102, order fulfilling worker 1: order 85 done
4/24/2024, 11:37:39 AM, error, orders/fulfilling_do.go:101, order fulfilling worker 1: fulfill auths error: %!w(*acme.Error=&{0 urn:ietf:params:acme:error:badNonce invalid or reused nonce: The client sent an unacceptable anti-replay nonce})
4/24/2024, 11:37:37 AM, error, acme/post_signed.go:177, failed to save response replay nonce (cannot save empty nonce)
4/24/2024, 11:37:29 AM, info, auth/handlers.go:146, client x.x.x.x:38383: access token refresh for user 'admin' succeeded
4/24/2024, 11:37:29 AM, info, auth/handlers.go:108, client x.x.x.x:38383: attempting access token refresh
4/24/2024, 11:35:36 AM, info, orders/fulfilling_do.go:24, order fulfilling worker 1: ordering order id 85 (certificate name: test54.kar88.REMOVED.com, subject: test54.kar88.REMOVED.com)

gregtwallace commented 7 months ago

After looking at the code for HashiCorp Vault, I'm pretty confident that they did not implement the spec correctly. A badNonce error response is supposed to include the nonce to use for the next call, but based on the code and your log it doesn't appear to be doing that.

I can look at putting a band-aid on to fix this (for this and other non-compliant servers). It would be more proper to fix the server, though.

If you can enable debug logging in the config and post the next incident in its entirety, I can probably confirm the issue with certainty. I've got a small patch that will work around the issue on the client side. I haven't reviewed the Certbot code, but it might have a band-aid. Alternatively, since clients aren't required to use the nonce returned with the error, Certbot might just drop that nonce and use a different one (which is also compliant with the RFC and would not cause the issue seen here).

vilpalu1 commented 7 months ago

Built a Docker container with your patch and it works like a charm. Thanks. Looks like you were right about the non-compliant Vault ACME implementation, because I am getting these "errors", but all certs are valid:

4/25/2024, 10:43:24 AM, info, orders/fulfilling_do.go:225, order fulfilling worker 2: order 108 done
4/25/2024, 10:43:24 AM, info, orders/fulfilling_do.go:223, order fulfilling worker 2: order id 108 completed with status valid (certificate name: kar123.REMOVED.com, subject: kar123.REMOVED.com)
4/25/2024, 10:43:07 AM, warn, acme/post_signed.go:171, acme signed post: err badNonce but acme server did not provide new nonce in error response (server violates the spec; report it to the server dev)
4/25/2024, 10:41:05 AM, warn, acme/post_signed.go:171, acme signed post: err badNonce but acme server did not provide new nonce in error response (server violates the spec; report it to the server dev)
4/25/2024, 10:41:05 AM, info, orders/fulfilling_do.go:24, order fulfilling worker 2: ordering order id 108 (certificate name: kar123.REMOVED.com, subject: kar123.REMOVED.com)

gregtwallace commented 7 months ago

Great. It’ll be in the next official build.