cilium / cilium-cli

CLI to install, manage & troubleshoot Kubernetes clusters running Cilium
https://cilium.io
Apache License 2.0
417 stars 210 forks source link

Cilium uninstall in Helm mode needs to clean up `certgen`-created resources #1648

Open asauber opened 1 year ago

asauber commented 1 year ago

When using cilium clustermesh enable, cilium install, or cilium upgrade in Helm mode, it's possible to specify cronJob as the TLS mode (a.k.a. certgen). This indirectly causes secrets to be created which are not owned by Helm.

If later cilium uninstall is used against the same cluster, those secrets are not removed.

Later still, if cilium upgrade is used with a different TLS mode, then Helm will bail entirely with:

Error: Unable to enable ClusterMesh: rendered manifests contain a resource that already exists.
Unable to continue with update: Secret "clustermesh-apiserver-admin-cert" in namespace "kube-system" exists and cannot be imported into the current release: 
invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; 
annotation validation error: missing key "meta.helm.sh/release-name": must be set to "cilium"; 
annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "kube-system"

cc @michi-covalent

michi-covalent commented 1 year ago

that's interesting. cc @gandro to get some clarification in terms of who owns these resources 👀

gandro commented 1 year ago

that's interesting. cc @gandro to get some clarification in terms of who owns these resources eyes

Hm. Very good question. I would say that the resources are owned by certgen. The problem is that it does not have a cleanup functionality.

That means that if we want helm uninstall removes those resources, we probably would want to have to have helm uninstall (maybe via some hook magic?) invoke a (to be written) certgen uninstall job?

I'd also be fine with helm uninstall just somehow deleting those resources directly (if that's even possible with Helm). But that does have the downside that we would have to synchronize the logic between what certgen creates and what helm deletes.

github-actions[bot] commented 2 days ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.