Open jvanasco opened 1 year ago
If the end of the suggestedWindow is in the past, clients SHOULD attempt renewal immediately.
Isn't that already implied in the spec? (Maybe it wasn't a year ago when this issue was posted.) The recommended algorithm is to renew if the selected date within the window is in the past, and if end is in the past, then necessarily any date in the window is in the past too.
My reading of the current draft suggests no additional text is needed.
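To make that reading concrete, here is a minimal sketch of the selection step as described above: pick a uniformly random time inside the suggested window and renew if that time is already in the past. The function names and the `rng` parameter are hypothetical, chosen for illustration; this is not taken from any particular client.

```python
import random
from datetime import datetime, timedelta, timezone


def pick_renewal_time(start, end, rng=random):
    """Pick a uniformly random instant in [start, end] (hypothetical helper)."""
    span_seconds = (end - start).total_seconds()
    return start + timedelta(seconds=rng.uniform(0, span_seconds))


def should_renew_now(start, end, now, rng=random):
    """Renew immediately if the selected instant is not in the future."""
    return pick_renewal_time(start, end, rng) <= now


# A window whose end is already in the past: every possible selection
# is in the past too, so the result is always True.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
window_start = datetime(2024, 1, 1, tzinfo=timezone.utc)
window_end = datetime(2024, 1, 5, tzinfo=timezone.utc)
renew = should_renew_now(window_start, window_end, now)
```

Because the selected instant can never be later than end, a past end forces immediate renewal without any extra rule.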
No change to the algorithm is suggested by this. That line could be updated to whatever the current "Immediate Renewal" (IR) language is.
The purpose of this Issue is to recommend the logging and investigation of all detected IRs – and explain that in most situations, an IR (as indicated by a past expiration time in the payload) is almost certainly because of a server misconfiguration or CA revocation.
AFAIK, the only times a Subscriber should expect an IR are:
Unless you're expecting an IR payload for those reasons, detecting one generally means that something, somewhere, has broken – so Clients should log and alert Subscribers to investigate.
IMHO, aside from those two situations, the most likely causes of an IR are going to be (in descending order):
I don't know if you might just call this "Server misconfiguration", but it might be good to consider cases where the server isn't up 24/7. For home appliance-type servers (like a network-attached storage device or whatnot), if I have it turned off for a week when I go on vacation, and then when I turn it back on it turns out that it missed its ARI window and needs to renew immediately, that really isn't a "critical failure that needs to be investigated now" scenario.
I think the real indicator of something wonky requiring investigation might just be whether that explanationURL is present. But there isn't much guidance to CAs on when to populate it, so I could imagine one CA always populating it with a link to their regular documentation about preferred renewal timelines, and another CA only populating it when initiated by a compliance incident. It might be worth having it be a "SHOULD" or "RECOMMENDED" or the like for that explanationURL field to be populated if the certificate was revoked (or is about to be revoked), and maybe guidance that it shouldn't be populated for "normal" time windows.
A server with periodic connectivity is a specific case with a minority of users, and obviously/inherently not a misconfiguration. We can easily generate a long list of other specific usages and edge cases that will affect 2%-20% of users - that should not prevent advising the majority of Subscribers that a missed ARI window is likely something that should be quickly looked at.
Sure; not really objecting. I'm just not sure how much of this "implementation guidance" should be in the RFC, vs. some other place. (And I mean that "not sure" sincerely, this may be the best place for it.) And all I was trying to do is to ensure that the server with intermittent connectivity was thought of somewhere in the process, I know it's not the common use case.
I think this is largely why I'm not in favor of adding language to this effect. There are many reasons for "immediate renewal" scenarios:
Also:
I could imagine one CA always populating [the explanationURL] with a link to their regular documentation about preferred renewal timelines
Let's Encrypt plans to do exactly this, so I'm not a fan of saying CAs should avoid populating it.
Yeah, I think the thing that needs to be logged and investigated is if the certificate wasn't renewed before some percentage of its lifetime. Even if it renews really early because of some CA incident, there really isn't anything for the server owner to do in that case; everything went exactly as it should.
With regard to explanationURL, I guess I'm just not sure exactly what the client or administrator should really do with that information. I guess it could be helpful to be logged just in case the administrator is curious about why a certificate was renewed early, but again if the renewal is successful then I don't think there's any action they should really be taking. The scenario that might be more meaningful is when the suggested window is in the past and renewal fails, and in that case the administrator needs to be alerted because it may indicate a future problem (CA planned downtime or incident requiring revoking or whatnot) that they need to figure out how to work around (by ensuring that their system answering challenges is up, switching CAs, or whatever), and the explanationURL might help them understand the impact.
Yeah, I think the thing that needs to be logged and investigated is if the certificate wasn't renewed before some percentage of its lifetime.
We do this in Caddy/CertMagic... if it's the last 1/20th or 1/50th of its lifetime (there's two code paths - one for ARI, one without) we emit a slightly louder log saying that we're renewing now (in the ARI code path, we specifically mention that we're ignoring ARI at that point).
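For anyone who wants to try the same idea elsewhere, a rough sketch of the "final fraction of lifetime" check might look like the following. This is not CertMagic's actual code; the function name and parameters are made up for illustration, and the fraction is left as an argument since which threshold goes with which code path is an implementation choice.

```python
from datetime import datetime, timedelta, timezone


def in_final_fraction(not_before, not_after, now, fraction):
    """Return True when the remaining validity is at most `fraction`
    of the certificate's total lifetime (hypothetical helper)."""
    lifetime = not_after - not_before
    remaining = not_after - now
    return remaining <= lifetime * fraction


# A 90-day certificate with a 1/20 threshold: the louder log would
# start once 4.5 days or less remain.
not_before = datetime(2024, 1, 1, tzinfo=timezone.utc)
not_after = not_before + timedelta(days=90)
late = in_final_fraction(
    not_before, not_after, not_after - timedelta(days=4), 1 / 20
)
```

A check like this fires regardless of what ARI says, which matches the idea of flagging certificates that were not renewed before some percentage of their lifetime.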
With regard to explanationURL, I guess I'm just not sure exactly what the client or administrator should really do with that information. I guess it could be helpful to be logged just in case the administrator is curious about why a certificate was renewed early, but again if the renewal is successful then I don't think there's any action they should really be taking. The scenario that might be more meaningful is when the suggested window is in the past and renewal fails, and in that case the administrator needs to be alerted because it may indicate a future problem (CA planned downtime or incident requiring revoking or whatnot) that they need to figure out how to work around (by ensuring that their system answering challenges is up, switching CAs, or whatever), and the explanationURL might help them understand the impact.
Yeah, we are just logging the explanationURL, and if there's a failure, you can go into the same logs that report the error, and find the explanationURL. :man_shrugging:
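As a sketch of the policy discussed in the last few comments (always log the explanationURL, but only escalate when the suggested window has passed and renewal failed), something like this could work. The function name, the level strings, and the "warning" level for other failures are assumptions for illustration, not behavior of any existing client.

```python
from datetime import datetime, timezone


def classify_renewal(window_end, now, renewal_succeeded, explanation_url=None):
    """Return a (level, message) pair for a renewal attempt (hypothetical)."""
    window_passed = window_end <= now
    if renewal_succeeded:
        level = "info"      # nothing for the administrator to do
    elif window_passed:
        level = "alert"     # past window AND failed: needs a human
    else:
        level = "warning"   # assumed level for failures inside the window
    outcome = "succeeded" if renewal_succeeded else "failed"
    message = f"renewal {outcome}"
    if explanation_url:
        # Keep the URL next to the outcome so it shows up in the same logs.
        message += f" (explanationURL: {explanation_url})"
    return level, message
```

The point of returning the URL in the same message is that whoever investigates a failure finds the CA's explanation right next to the error.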
Consider a subsection under the "Getting Renewal Information" section titled "Immediate Renewal Scenarios".
It should explain some situations in which a "renew now" payload is sent, and a security audit or server configuration audit may be necessary.
Possible text: