cert-manager / cert-manager

Automatically provision and manage TLS certificates in Kubernetes
https://cert-manager.io
Apache License 2.0

Option to store certificate history in individual secrets #6224

Open Jamstah opened 1 year ago

Jamstah commented 1 year ago

Is your feature request related to a problem? Please describe.

A common pattern for building a locally trusted CA with cert-manager is to create a self-signed issuer, use it to issue a CA certificate, then use that certificate with a CA issuer to issue leaf certificates. A common request with this configuration is to be able to supply clients with a trust bundle containing at least the latest and previous certificates, so they can trust the CA and so that the CA can be rotated. In cases of compromise and early rotation, more than one previous certificate may be required.

Describe the solution you'd like

This feature request is for a field in the Certificate CR that enables certificate history to be stored on the cluster as individual Secret resources, one for each certificate generated, which the user can then bundle up into a trust store using something like trust-manager.

For example, this in the spec:

spec:
  certificateHistory:
    secretPrefix: my-certificate-history

Then, for each revision of the certificate generated, a Secret named <prefix>-<revision> would be created. We would need to label these using a secretTemplate, either the same one as for the certificate Secret or a similar scheme in the certificateHistory block.
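
As a rough sketch of the idea, revision 3 of a Certificate using the prefix above might produce a Secret like the one below. This is hypothetical: neither certificateHistory nor these Secrets exist in cert-manager today, and the label, the Opaque type, and the choice to store only the public certificate are assumptions made for illustration.

```yaml
# Hypothetical history Secret for revision 3; not an existing cert-manager output.
apiVersion: v1
kind: Secret
metadata:
  name: my-certificate-history-3   # <prefix>-<revision>
  labels:
    mylabel: myvalue               # assumed to come from a secretTemplate-style block
type: Opaque
data:
  tls.crt: <base64-encoded PEM certificate for revision 3>
```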

Describe alternatives you've considered

My initial thought was to use the CertificateRequest resource to build the trust store, but there are concerns about overloading the purpose of those resources. See https://github.com/cert-manager/trust-manager/issues/144.

Additional context

By adding support for certificate history, an end user could use the resources to build a trust bundle including every certificate issued by the self-signed issuer, automatically extending it as the root certificate is rotated. If a specific issued root certificate is considered compromised, the user could delete the specific Secret and it would be removed from the trust bundle with minimal overhead.

Assuming this proposal is accepted, I will make a proposal for trust-manager to enable selecting secrets by label using standard matchLabels/matchExpressions syntax.

spec:
  sources:
  - configmaps:
      matchLabels:
        mylabel: myvalue
      matchExpressions:
      - { key: mylabel, operator: In, values: [myvalue] }

/kind feature

Jamstah commented 1 year ago

I've had a look at the code and have a good idea about how to adjust the flow to ensure that the history is always complete. Happy to write code for this but thought it was sensible to get consensus on the idea first.

munnerz commented 1 year ago

I would love to be able to do something like this - it would enable the use of things like immutable secrets too.

A few questions:

1) when/how do these get cleaned up?
2) how do we expect end-users to actually make use of these?
3) if someone mounts these into a Pod, this will require some automation controller to update their Deployment spec (or delete and recreate the Pod with the new spec)
4) if someone uses these with an ingress, how do we expect these to be updated?
5) how would this work with ingress-shim?

I think this is a great feature request, however there are a few practical questions about how these would be used that it'd be great to get clarity on, so we can understand the expected scope of the request :)

munnerz commented 1 year ago

If this is specifically to address the trust-manager use case when building a CA store, I am inclined to actually go towards the approach of having issuer.status.caBundle which would contain the set of CAs needed to verify all known valid certificates issued by that CA, aka this could contain a list of many CAs which would be removed automatically as and when we can be sure they are no longer in use.

Hence asking for some expansion around the use-case outside of trust-manager and building CA stores :)
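
To make the issuer.status.caBundle idea above more concrete, the status might look roughly like the sketch below. It is only an illustration: status.caBundle is not an existing field on cert-manager issuers, and representing it as a single PEM string holding several CA certificates is an assumption, as are the resource names.

```yaml
# Hypothetical: status.caBundle does not exist on Issuer today.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: my-ca-issuer             # illustrative name
spec:
  ca:
    secretName: my-ca-key-pair   # illustrative name
status:
  caBundle: |
    -----BEGIN CERTIFICATE-----
    (current CA certificate)
    -----END CERTIFICATE-----
    -----BEGIN CERTIFICATE-----
    (previous CA certificate, kept while certificates it issued may still be in use)
    -----END CERTIFICATE-----
```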

Jamstah commented 1 year ago

> when/how do these get cleaned up?

I would say in the same way that CertificateRequest resources get cleaned up - by the end user.

> how do we expect end-users to actually make use of these?

However they want to. I would suggest trust-manager but I was trying to propose a generic scheme that anyone could take advantage of.

> if someone mounts these into a Pod, this will require some automation controller to update their Deployment spec (or delete and recreate the Pod with the new spec)

Yes, which is why my suggestion is to put trust-manager into the mix to bundle them together, to avoid having to do that. We could alternatively put them all into one Secret with entries like cert-<revision>, but that feels more brittle for cases like removing compromised certificates and other management.
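
For comparison, the single-Secret layout mentioned above might look like this sketch (hypothetical names and keys throughout). Removing a compromised revision would then mean editing a shared object rather than deleting one Secret.

```yaml
# Hypothetical single-Secret alternative: one key per revision.
apiVersion: v1
kind: Secret
metadata:
  name: my-certificate-history
type: Opaque
data:
  cert-1: <base64-encoded PEM certificate for revision 1>
  cert-2: <base64-encoded PEM certificate for revision 2>
  cert-3: <base64-encoded PEM certificate for revision 3>
```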

> if someone uses these with an ingress, how do we expect these to be updated?

In what way? Do you mean to trust the backend service for an ingress? I expect the ingress would need to point to a bundle, created by a separate process, of all the certificates.

> how would this work with ingress-shim?

I'm not sure it does. ingress-shim seems more about providing an active certificate with its private key to secure an endpoint. This issue describes part of the solution to trusting that certificate (and by extension, that endpoint) based on a trust store with the CA history.

> If this is specifically to address the trust-manager use case when building a CA store, I am inclined to actually go towards the approach of having issuer.status.caBundle which would contain the set of CAs needed to verify all known valid certificates issued by that CA, aka this could contain a list of many CAs which would be removed automatically as and when we can be sure they are no longer in use.

I thought that, in general, cert-manager didn't want to be involved in building trust bundles, which is why trust-manager exists, hence looking for a generic mechanism that could be used by other tools as well. It also only makes sense for the CA issuer: with other issuers we have no reliable way of determining the correct root certificates to provide to clients.

Is there an issue/discussion around this? I'm not sure how we could identify when a certificate is no longer in use.

Jamstah commented 1 year ago

Another use case of this mechanism could be for certificate audit purposes, to maintain a history of certificates that can be traced back reliably.

munnerz commented 1 year ago

> I thought in general cert-manager didn't want to be involved in building trust bundles

I think that's (sort of) correct, but I do think it is in scope for cert-manager to handle/enable some form of graceful CA rollover. It's a long-asked-for and missed feature, I think.

> with other issuers we have no reliable way of determining the correct root certificates to provide to clients.

This would be an optional field that issuer implementations could choose to populate. For core issuers, the CA issuer, Vault issuer and potentially Venafi (depending on how that is set up) could all make use of it. For ACME, I think if someone runs their own ACME server and does have a way to access the CA data, they could still have a controller that populates/manages the status.caBundle field to enable this sort of thing, though I am not sure that is in scope here.

> Another use case of this mechanism could be for certificate audit purposes, to maintain a history of certificates that can be traced back reliably.

I don't think this should be pitched as an audit feature. It'd be a "poor man's CT log" and I don't personally trust the integrity of the data store either (anyone can come along and delete the object, and the only evidence of that would be the apiserver audit log, which itself also already contains the 'create' events, effectively defeating the value of using the apiserver for this audit purpose).

> Is there an issue/discussion around this? I'm not sure how we could identify when a certificate is no longer in use.

"no longer in use" IMO here means "has passed its expiry". We can't know if it is still in use, as the certificate may be copied to some external system etc. Typically, CAs wait until the last issued certificate has expired to stop including that older CA in its bundle.


As an alternative to this for trust-manager specifically right now, without having support for the issuer.status.caBundle in core cert-manager, trust-manager could just look at the issued certificate/Secret, and if it finds a CA in there that it hasn't already persisted into the 'target' trust bundle, it could append it there. Just trying to think of ways we can do this more easily and side-step some of the harder parts above.. :)

Jamstah commented 1 year ago

> I think that's (sort of) correct, but I do think it is in scope for cert-manager to handle/enable some form of graceful CA rollover. It's a long-asked-for and missed feature, I think.

That's the line I was trying to walk: suggest a feature that isn't just "Provide a trust bundle for issuers" but would enable that use case to be developed in a declarative, cloud native way.

> This would be an optional field that issuer implementations could choose to populate. For core issuers, the CA issuer, Vault issuer and potentially Venafi (depending on how that is set up) could all make use of it. For ACME, I think if someone runs their own ACME server and does have a way to access the CA data, they could still have a controller that populates/manages the status.caBundle field to enable this sort of thing, though I am not sure that is in scope here.

I would love to see this implemented. Is there an existing proposal anywhere? I do wonder if it falls on the wrong side of the trust bundle line. It invites cases like this one, where the user expects the generated certs to be updated with new CA data too: https://github.com/cert-manager/cert-manager/issues/5851. It's debatable whether a generated certificate should even have a ca.crt field in it.

> I don't think this should be pitched as an audit feature.

Good points, I agree.

> "no longer in use" IMO here means "has passed its expiry". We can't know if it is still in use, as the certificate may be copied to some external system etc. Typically, CAs wait until the last issued certificate has expired to stop including that older CA in its bundle.

Exactly, and we have no way to know when the last issued certificate has expired. Having old CA certificates in the trust bundle isn't incredibly dangerous, because they're expired anyway and are public knowledge. Either way, I think that's not a cert-manager job.

> As an alternative to this for trust-manager specifically right now, without having support for the issuer.status.caBundle in core cert-manager, trust-manager could just look at the issued certificate/Secret, and if it finds a CA in there that it hasn't already persisted into the 'target' trust bundle, it could append it there. Just trying to think of ways we can do this more easily and side-step some of the harder parts above.. :)

I think that opens up race conditions where a generated cert could be missed if an operator was out of action during its creation. I suggested something similar, using the CertificateRequest objects that cert-manager already creates and which form a decent history, here: https://github.com/cert-manager/trust-manager/issues/144

Jamstah commented 1 year ago

I see you've been debating this since 2020 :)

jetstack-bot commented 10 months ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Send feedback to jetstack. /lifecycle stale

jetstack-bot commented 9 months ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to jetstack. /lifecycle rotten /remove-lifecycle stale

jetstack-bot commented 8 months ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten. Send feedback to jetstack. /close

jetstack-bot commented 8 months ago

@jetstack-bot: Closing this issue.

wallrj commented 1 week ago

/reopen

cert-manager-prow[bot] commented 1 week ago

@wallrj: Reopened this issue.
