aarongable / draft-acme-ari

Internet Draft for the Automated Certificate Management Environment (ACME) Renewal Information (ARI) Extension

A couple alternatives to consider instead of a window #44

Closed mholt closed 2 months ago

mholt commented 1 year ago

For reasons explained in this letter to the IETF WG:

https://mailarchive.ietf.org/arch/msg/acme/AeJ3zJKcBF-ZUhQXJajC0bb7orI/

I would like to explore some alternatives to the current draft. I can think of two approaches that might address those concerns:

A) Instead of a totally separate flow to obtain ARI, simply utilize a Retry-After header in the flow of existing ACME responses. Upon finalizing an order, the ACME server can respond with a Retry-After header which acts as the current-draft Retry-After header for ARI responses. The client then attempts renewal at/after the Retry-After time, but with the OCSP CertID added to the NewOrder object; this indicates to the ACME server that the client is asking if now is a good time to renew the certificate indicated by the CertID. If it's not a good time, the ACME server can reply as such, with another Retry-After, and the client then waits and repeats, until the server actually issues the certificate. If the client needs the certificate immediately, simply omit the CertID from the NewOrder and the normal, "non-ARI" flow is assumed. This is backwards-compatible and requires no additional infrastructure or endpoints.
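To make the flow in (A) concrete, here is a minimal client-side sketch. It is only an illustration under stated assumptions: the `certID` field name, the `OrderResponse` type, and the `new_order` callable are all hypothetical stand-ins, not anything defined by ACME or the ARI draft, and a real client would send a signed request to the server's newOrder URL and read the Retry-After header from the HTTP response.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class OrderResponse:
    """Hypothetical summary of a server's reply to a NewOrder request."""
    issued: bool          # True once the server proceeds with issuance
    retry_after: int = 0  # seconds from Retry-After when the server defers


def renew(new_order: Callable[[dict], OrderResponse],
          domains: List[str],
          cert_id: Optional[str],
          sleep: Callable[[float], None],
          max_attempts: int = 10) -> int:
    """Repeat NewOrder until the server issues; return the attempt count.

    Passing cert_id=None omits the field, i.e. the normal non-ARI flow
    where the client needs the certificate immediately.
    """
    payload = {"identifiers": [{"type": "dns", "value": d} for d in domains]}
    if cert_id is not None:
        # Assumed field name: asks "is now a good time to renew this cert?"
        payload["certID"] = cert_id

    for attempt in range(1, max_attempts + 1):
        resp = new_order(payload)
        if resp.issued:
            return attempt
        # Not a good time yet: wait as instructed, then ask again.
        sleep(resp.retry_after)
    raise RuntimeError("server kept deferring renewal")
```

In practice `sleep` would be `time.sleep` (or a scheduler hook), and the client would persist the next retry time rather than block; it is passed in here only so the loop can be exercised without waiting.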

B) If we do need a separate flow for some reason, I would like to see a single endpoint containing a static JSON resource that describes all the active certificates that need early renewal, rather than one tediously-crafted URL per certificate. Certificates can be described by their NotBefore or NotAfter dates, serial numbers, or other relevant attributes. For example, if just a few certs with certain serials were misissued, those serials could be enumerated at this endpoint. Or if a mass revocation is happening, the timeframe of NotBefore dates could be listed, and ACME clients can simply check against the certs they manage with those dates, and replace them. You can represent millions of certificates in, like, 85 bytes this way. And it's way less work for clients and servers. And lastly, drop the "window" idea -- certificates described by this endpoint should be renewed ASAP: try to renew immediately, then back off and retry, for reasons described above (once we know the future is uncertain and/or revocation is imminent, current certs can't be trusted and/or clients must try to preserve their sites' uptime).
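As a rough illustration of (B), the sketch below shows how a client might check its managed certificates against such a static resource. The JSON shape (`serials`, `notBeforeRanges`, `start`, `end`) is invented for this example; the proposal does not specify a format, so treat every field name here as an assumption.

```python
import json
from datetime import datetime, timezone

# Hypothetical static resource: a few enumerated serials plus one
# NotBefore window covering a mass-revocation event.
RESOURCE = json.loads("""
{
  "serials": ["0A1B2C"],
  "notBeforeRanges": [
    {"start": "2023-01-01T00:00:00Z", "end": "2023-01-02T00:00:00Z"}
  ]
}
""")


def parse_ts(value: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp ending in 'Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def needs_early_renewal(serial: str, not_before: datetime, resource: dict) -> bool:
    """Return True if the certificate is described by the resource."""
    listed = {s.lower() for s in resource.get("serials", [])}
    if serial.lower() in listed:
        return True
    for window in resource.get("notBeforeRanges", []):
        if parse_ts(window["start"]) <= not_before <= parse_ts(window["end"]):
            return True
    return False


# Usage: one fetch of the resource, then a local check per managed cert.
managed = [
    ("0a1b2c", datetime(2022, 6, 1, tzinfo=timezone.utc)),
    ("ffff01", datetime(2023, 1, 1, 12, tzinfo=timezone.utc)),
    ("ffff02", datetime(2023, 3, 1, tzinfo=timezone.utc)),
]
to_renew = [s for s, nb in managed if needs_early_renewal(s, nb, RESOURCE)]
```

A client would renew everything in `to_renew` right away and back off on failure, as described above. Note that a single range entry covers every certificate issued in that window, which is the point about representing many certificates in very few bytes.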


Since you've probably heard most of the stuff in that letter already, I wanted to at least bring forth those two ideas for discussion here.

Thanks for your diligent work on ARI!

mholt commented 1 year ago

On the LE forums, it was mentioned that it would be useful if the CA could know that a certificate has specifically been replaced by another, and to know which ones those are.

I think proposal (A) above should be able to solve that.

Francis Lavoie, on our Slack, made a good point to me:

"Something we see is sometimes Caddy replaces an LE cert with ZeroSSL"

So if we end up using a different CA to replace the certificate, what do you think should happen?

mholt commented 2 months ago

It looks like (A) has been partially implemented by adding the Replaces value into NewOrder flows, which I am pleased to see. I think that's an improvement. I still wouldn't mind consolidating the ARI flow into Order flows, but I understand this has been debated elsewhere already.

I am also still partial to (B), or at least most of it, instead of a per-certificate endpoint. I understand that a per-certificate endpoint makes things easier for servers, but it makes things more complicated for clients and doesn't scale as well.

aarongable commented 2 months ago

(B) describes what I would consider three separate changes:

I think that all three of these are not improvements to ARI. The first has already been extensively debated on mailing lists, and I won't reiterate the arguments here. The second means specifying (and implementing!) a whole collection of methods by which certificates could be grouped and those groups described, which I believe is not worthwhile and would make client implementation significantly more complex. The third does away with significant swathes of the utility of ARI, removing any ability of clients to plan ahead on short timescales.

Absent a suggested specification of exactly what this would look like and how it would be incorporated into the current ARI draft, I don't think it would be an improvement to accept these suggestions.