hyperledger / firefly-helm-charts

Firefly Helm Charts
https://github.com/hyperledger/firefly-helm-charts
Apache License 2.0
16 stars 24 forks source link

Configurable Certificate Expiry and Cert Renewal #37

Open onelapahead opened 2 years ago

onelapahead commented 2 years ago

With an old enough environment, DX's certs can expire when using cert-manager managed certificates. cert-manager automatically renews the cert but there's then a few challenges:

  1. Having DX pod be aware of the renewed cert and restarting
  2. Using the appropriate FF APIs to have the new DX cert be broadcasted out to everyone in the network

First step, is we could let the expiry time of the cert be configurable in case folks want their certs to last longer than the default 3 months. And then, we'll need to figure out what combination of sidecars, annotations, and Jobs need to be put in place to solve the above.

onelapahead commented 2 years ago

Worth mentioning a few additional caveats related to this issue:

  1. For better or worse, if the Secret containing the TLS certs is updated by cert-manager or some other entity, Kubernetes does not automatically update the certs in the DX pod's volume mount due to our intentional use of subPath: https://github.com/kubernetes/kubernetes/issues/50345. This is "good" because even if the cert is renewed / updated, DX won't start using it until the operator of the chart somehow restarts the pod and then can focused on re-registering the DX certs.
  2. If a Certificate is deleted due to a helm uninstall or other reasons, cert-manager keeps the underlying TLS Secret around to prevent accidental deletion of certificates issued by issuers which might rate-limit cert requests i.e. LetsEncrypt. While that is nice in the accidental deletion case, for us its important to then treat TLS Secrets like DX's PVCs - they are stateful and if you are intentionally deleting your Helm release to reset a FireFly network it is important it is deleted as well.
  3. Another factor to consider is the expiry of the CA issuer's Certificate. If using the CA issuer setup from this repo, the Certificate is currently not setting duration and therefore results in expiring every 90 days. So even if your certs are being renewed or your FireFly networks are being frequently reset, theres the case to consider where the DX certs themselves are still "valid" but the CA expires and that itself has to be renewed, then the DX certs, and then that has to be broadcasted to members in the network.
  4. DX currently has no functionality for reloading its own certificates from disk. Currently it will refresh the CA certs for the other peers periodically and with certain API calls: https://github.com/hyperledger/firefly-dataexchange-https/blob/10b23db379753b83c40da210fb86ceea4226cb42/src/app.ts#L45-L48. And although the TLS context is re-made, the init function for reading the certificates from disk is never called. An API endpoint (similar to Prometheus) for refreshing its certs (and possibly telling FireFly to broadcast the updated certs) would be ideal in this case.
onelapahead commented 2 years ago

As a follow up to 2) from https://github.com/hyperledger/firefly-helm-charts/issues/37#issuecomment-1097262256, it appears cert-manager is configurable around how it handles Secret deletion: https://cert-manager.io/docs/usage/certificate/#cleaning-up-secrets-when-certificates-are-deleted

One can set --enable-certificate-owner-ref if they wish to ensure deleted Certificates delete Secrets as well.