canonical / manual-tls-certificates-operator

A charmed operator for managing TLS certificates manually.
https://charmhub.io/manual-tls-certificates
Apache License 2.0
2 stars 11 forks source link

CSR secrets are never deleted on relation-broken #264

Open lucabello opened 1 month ago

lucabello commented 1 month ago

Bug Description

Issue

Removing and recreating a certificates relation multiple times leaves behind all the Juju Secrets for the previous CSRs.

I believed the old ones should be cleaned up by the requirer charm.

A mechanism for deleting the secret is already in place here.

To Reproduce

juju deploy self-signed-certificates --channel=latest/edge ca
juju deploy prometheus-k8s --channel=latest/edge prometheus
# Run the following a few times
juju relate prometheus ca
juju remove-relation prometheus ca
juju secrets  # notice the amount of secrets grow indefinitely

Environment

latest/edge for both charms:

Relevant log output

-

Additional context

No response

lucabello commented 1 month ago

Side question: why is the CSR+Certificate saved to a Juju secret? It's only touched on the requirer side: could we remove that entirely and just use relation data? (the information is already in relation data)

gruyaume commented 1 month ago

Right now secrets are going to be removed when they expire assuming the relation was removed. So I wouldn't say that they are "never deleted".

Removing certificate secrets on relation broken makes sense in principle. Though in practice I'm not sure how we could retrieve the list of secrets created by the TLS library. As far as I know, Ops does not have a way to get multiple secrets. Secrets can be retrieved one at the time provided we know their label which we won't know because they are based on the CSR (which won't be available on relation broken).

As for the reason why we use secrets, it's specifically to leverage the expire mechanisms that come with them. We set secret expiry time based on certificate expiry time in order to inform the charm that their cert is about to expire just in time.

lucabello commented 1 month ago

I see, thanks for the explanation :) The two possibilities I see are:

gruyaume commented 1 month ago

Relating to proposal 1, we are really trying to not keep state in this library as much as possible so I'd try to avoid it.

Relating to proposal 2, there can be more than 1 certificate (or secret) per unit so we'd need to add a suffix at the end of the label that the TLS library would need to keep a track of, making the proposal equivalent to proposal 1.

I would prefer a third option which would be the ability to get the list of secrets using ops. @benhoyt is this a reasonable request?

benhoyt commented 1 month ago

I think proposal 2 might still be an option, if you can add an incrementing suffix suffix to the end of the label. Then to enumerate them you could iterate through suffixes 1, 2, 3 ... until Model.get_secret failed with a SecretNotFoundError and you'd know you'd reached the end.

Alternatively, there is a secret-ids hook tool, but it's not currently modelling by Ops. I think this would be a reasonable addition: perhaps Model.list_secrets() -> List[Secret] or something. However, I'd love to see that it's actually going to be useful for you first. Maybe you can try to implement this using subprocess.run(['secret-ids']) directly and report back?