openbao / openbao

OpenBao exists to provide a software solution to manage, store, and distribute sensitive data including secrets, certificates, and keys.
https://openbao.org/
Mozilla Public License 2.0

PKI - Allow revocation of expired certificates (Vault #27609) #459

Closed cipherboy closed 6 days ago

cipherboy commented 2 months ago

As reported by @tchernobog on https://github.com/hashicorp/vault/issues/27609:

This is a follow-up to #19452.

Is your feature request related to a problem? Please describe.

This is for embedded software development.

We use the Vault PKI to sign software releases. Each release is signed with a certificate that has a very short lifetime (hours), since it should not be used for anything other than signing the current release; the private key is then thrown away.

If however we have a software release which is deemed insecure, we use the CRL to block software updates to that version. Since a CVE / security finding can happen at any point in the years (!) to come, we need to be able to revoke an already expired certificate.

Describe the solution you'd like

We would like an optional parameter to the /pki/revoke endpoint which skips the expiration check in crl_util.go.

Alternatively, it might be better and even easier to extend /pki/config/crl with a boolean option, allow_expired_cert_revocation.
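To make the requested behavior concrete, here is a minimal sketch of the gate such an option would control. It is a toy model in Python, not the actual check in crl_util.go; the function name `may_revoke` and its parameters are hypothetical, and only the option name allow_expired_cert_revocation comes from the proposal above.

```python
from datetime import datetime, timezone


def may_revoke(not_after: datetime, now: datetime, allow_expired: bool = False) -> bool:
    """Hypothetical model of the revocation gate (not the real crl_util.go logic).

    allow_expired stands in for the proposed allow_expired_cert_revocation option.
    """
    if now <= not_after:
        # Certificates that are still valid can always be revoked.
        return True
    # Expired certificates are only revocable when the option is enabled.
    return allow_expired


expired_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
now = datetime(2026, 1, 1, tzinfo=timezone.utc)

print(may_revoke(expired_at, now))                      # False
print(may_revoke(expired_at, now, allow_expired=True))  # True
```

With the option left at its default, the behavior stays as it is today; enabling it only widens the set of certificates that can be placed on the CRL.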

Describe alternatives you've considered

Extending the lifetime of the code signing certificate to match the intermediate. However, since intermediates are also rotated every 6 to 12 months, we still have the same issue: any leaf certificate's expiration is bounded by the intermediate's, and a revocation can happen far later.

We can obviously use /pki/issuer/:issuer_ref/sign-revocation-list and maintain that ourselves.

Else we would need to sign the "CRL" with a detached signature (and handle that in code).

Explain any additional use-cases

See #19452.

Additional context

N/A


19452's use case was less compelling than this one.

Cosign uses short-lived certificates (10 minutes) but currently lacks support for BYO PKI revocation. However, a similar use case to OP could definitely be implemented in the future: point-valid certificates whose (much) later revocation signals that the original certificate's signature should no longer be trusted or an upgrade skipped. This could be supported today with Cosign's verify-by-public-key, assuming the relevant cert/CRL validation infrastructure was handled OOB.

While strictly the CRL isn't an ideal mechanism (the certificate itself was not compromised and the corresponding key likely isn't recoverable), it could still be a useful mechanism for this since it is rather ubiquitous -- and often has different failure modes than cert expiry.

As proposed, allowing non-RFC 5280 behavior conditionally via a flag could be viable. This requires careful attention to tidy, as it removes certificates once they have been expired for longer than the safety_buffer.

@tchernobog -- Is this an implementation of a standard by any chance?

tchernobog commented 2 months ago

> @tchernobog -- Is this an implementation of a standard by any chance?

No, it is not strictly speaking a standard. It is however how RAUC (a common update tool in the embedded industry) also supports forbidding updates out-of-the-box: https://rauc.readthedocs.io/en/v1.7/advanced.html#sec-security

Similar tools in the automotive domain follow similar practices.

cipherboy commented 2 months ago

@tchernobog I think this is fairly easy to add. I'd suggest adding an allow_expired_cert_revocation field to config/crl that changes the revocation behavior, plus a new revoked_safety_buffer field that defaults to safety_buffer but applies only to tidy_revoked_certs, so that expired-but-not-revoked certs could be removed earlier than revoked certificates (e.g., 2 years for non-revoked, 10 years for revoked).
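A rough sketch of how the two buffers could interact during tidy, assuming revoked_safety_buffer falls back to safety_buffer when unset. This is an illustrative Python model only; the function `tidy_eligible` and its signature are hypothetical and do not reflect the PKI engine's actual tidy code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def tidy_eligible(
    not_after: datetime,
    revoked: bool,
    now: datetime,
    safety_buffer: timedelta,
    revoked_safety_buffer: Optional[timedelta] = None,
) -> bool:
    """Hypothetical model: may tidy remove this certificate yet?"""
    if revoked and revoked_safety_buffer is not None:
        # Revoked certificates use their own (typically longer) buffer.
        buffer = revoked_safety_buffer
    else:
        # Non-revoked certificates, or no override configured.
        buffer = safety_buffer
    return now > not_after + buffer


not_after = datetime(2021, 1, 1, tzinfo=timezone.utc)
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
two_years = timedelta(days=2 * 365)
ten_years = timedelta(days=10 * 365)

print(tidy_eligible(not_after, False, now, two_years, ten_years))  # True
print(tidy_eligible(not_after, True, now, two_years, ten_years))   # False
print(tidy_eligible(not_after, True, now, two_years))              # True
```

In this model, a revoked certificate that expired in 2021 stays in storage (and therefore on the CRL) until the longer buffer elapses, while unrevoked certificates of the same age are cleaned up.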

Note that I'd generally recommend using no_store=true and doing BYO cert revocation, as storing lots of certs for a long period of time might make the PKI engine slower, though with #271 much of this slowness can be avoided. :-)

Let me know if you're interested in working on this or if you (or anyone else) has any questions!

fatima2003 commented 3 weeks ago

Hi @cipherboy, I want to give this issue a try. It's my first non-dependency-update feature so I think it'll be a fun learning process :)

cipherboy commented 3 weeks ago

Cool! Let me know if you have questions.

fatima2003 commented 2 weeks ago

Hi @cipherboy, while playing around with the parameters to understand what I need to change, I tried to revoke an already-revoked certificate and got a success message. It did not update the revocation time (which I think is ideal), but I think the success message is misleading. Should I open an issue for this, or is this intended behaviour?


I revoked multiple times after the first revocation time (Oct 15 2024 04:58:06 PM) and got the success message each time. Maybe a message stating "Certificate already revoked" would be better.

cipherboy commented 2 weeks ago

@fatima2003 Yes, this is the expected behavior. All operations are meant to be repeatable; e.g., if you delete a secret in KV and then redo that, it'll also report success. I think this stems from the API-first design approach: it doesn't really matter to a remote system whether this operation revoked the cert or whether another did, just that it was revoked by the time the API returned.
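As an illustration of that repeatable behavior, here is a small in-memory sketch in Python. The `RevocationStore` class is a hypothetical stand-in, not the PKI engine's storage; it only shows how a repeated revoke can report success while leaving the original revocation time untouched.

```python
from datetime import datetime, timezone


class RevocationStore:
    """Hypothetical in-memory model of idempotent revocation."""

    def __init__(self) -> None:
        self._revoked: dict[str, datetime] = {}

    def revoke(self, serial: str, now: datetime) -> dict:
        # setdefault records the time only on the first call; later calls
        # return the existing entry and still count as success.
        revoked_at = self._revoked.setdefault(serial, now)
        return {"serial": serial, "revocation_time": revoked_at}


store = RevocationStore()
first = store.revoke("01:02:03", datetime(2024, 10, 15, 16, 58, 6, tzinfo=timezone.utc))
second = store.revoke("01:02:03", datetime(2024, 10, 15, 17, 59, 41, tzinfo=timezone.utc))

print(first["revocation_time"] == second["revocation_time"])  # True
```

A caller that retries after a timeout therefore sees the same result as the caller whose request actually performed the revocation.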

fatima2003 commented 2 weeks ago

@cipherboy Oh okay, that makes sense. I'll look more into the API-first design approach.