unixcharles / acme-client

A Ruby client for Let's Encrypt's ACME protocol.
MIT License

Nonces expiration and Acme::Client::Error::BadNonce #148

Closed by cema-sp 6 years ago

cema-sp commented 6 years ago

Greetings! We are using acme-client to update the certificates of managed domains in an automated way. We have created a daemon that stays in memory and periodically orders certificates. It memoizes the client instance and reuses it on every scheduled run.

The client stores anti-replay nonces in memory and uses them on demand.

What we have noticed is that Let's Encrypt nonces seem to expire (most probably) after approximately 24 hours, which results in the following error:

Acme::Client::Error::BadNonce: JWS has an invalid anti-replay nonce

For that reason we had to stop memoizing the client.

Do you think it would be useful to allow acme-client users to provide something like a "nonce store instance"? It could be optional and configurable, and would allow one to store nonces in a way that takes their expiration into account.

# lib/acme/client.rb
...
  def initialize(jwk: nil, kid: nil, private_key: nil, directory: DEFAULT_DIRECTORY, connection_options: {}, nonce_store: [])
    ...
    @nonces ||= nonce_store
  end
...

# my_client.rb

my_store = StoreWithExpiration.new
client = Acme::Client.new(nonce_store: my_store)
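
For illustration only, a store of the kind proposed above might look like the sketch below. The class name comes from the proposal; the `ttl` and `clock` parameters, and the assumption that the client would only call `<<`, `pop`, and `empty?` on the store, are hypothetical. The `nonce_store:` option is itself only a proposal and not an existing acme-client parameter.

```ruby
# Hypothetical nonce store: the TTL value and the method set are
# assumptions made for this sketch, not part of acme-client.
class StoreWithExpiration
  def initialize(ttl: 3600, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl         # maximum age of a nonce, in seconds
    @clock = clock     # callable returning the current time in seconds
    @entries = []      # pairs of [nonce, stored_at]
  end

  # Store a nonce together with the time it was received.
  def <<(nonce)
    @entries << [nonce, @clock.call]
    self
  end
  alias_method :push, :<<

  # Return the most recently stored nonce that has not expired, or nil.
  def pop
    prune
    entry = @entries.pop
    entry && entry.first
  end

  # True when no unexpired nonce is left.
  def empty?
    prune
    @entries.empty?
  end

  private

  # Drop every nonce older than the TTL.
  def prune
    now = @clock.call
    @entries.reject! { |(_, stored_at)| now - stored_at > @ttl }
  end
end
```

Even with such a store, a nonce can still be rejected by the server before the chosen TTL elapses, so the TTL can only be a guess.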

unixcharles commented 6 years ago

What about just retrying on Acme::Client::Error::BadNonce?

Nonce expiration is not really part of the spec, so I prefer to make no assumptions about it and to consider it an implementation detail. I would generally consider BadNonce retry-able.
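
A minimal sketch of that retry approach, written as a generic helper so it does not depend on the gem being loaded. The helper name `retry_on` and its `attempts:` parameter are made up for illustration and are not part of acme-client.

```ruby
# Illustrative helper (not an acme-client API): run the block and retry
# it when the given error class is raised, up to `attempts` tries in total.
def retry_on(error_class, attempts: 3)
  tries = 0
  begin
    tries += 1
    yield
  rescue error_class
    # Re-run the block while tries remain; otherwise re-raise the error.
    retry if tries < attempts
    raise
  end
end
```

With the gem loaded, a call could be wrapped as `retry_on(Acme::Client::Error::BadNonce) { client.new_order(identifiers: ['example.com']) }`. This assumes the client has picked up a fresh nonce before the second attempt.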

cema-sp commented 6 years ago

@unixcharles Thank you for answering, I'll try this approach.

cpu commented 6 years ago

Nonce expiration is not really part of the spec, so I prefer to make no assumptions about it and to consider it an implementation detail. I would generally consider BadNonce retry-able.

Indeed, and retrying on badNonce is in the spec!

An error response with the "badNonce" error type MUST include a Replay-Nonce header with a fresh nonce. On receiving such a response, a client SHOULD retry the request using the new nonce.

:+1:

What we have noticed is that Let's Encrypt nonces seem to expire (most probably) after approximately 24 hours, which results in the following error

Speaking specifically about Let's Encrypt, the time it takes for a nonce to fall out of the active pool is a byproduct of traffic volume, so it's very difficult to firmly establish a lifetime. Today you might see ~24 hours and tomorrow it could be 10 minutes. The other thing is that nonces are per-datacenter. If we swap active datacenters during maintenance, or if load balancing changes, a previously fetched nonce may suddenly be invalid (and retrying is the best option).

Hope that extra detail helps!

cema-sp commented 6 years ago

@cpu Thank you, that's super helpful 👍 I believe the issue could be closed.