clastix / kamaji

Kamaji is the Hosted Control Plane Manager for Kubernetes.
https://kamaji.clastix.io
Apache License 2.0
1.12k stars 102 forks source link

fix: don't delete coredns unless previously managed by kamaji #527

Closed Marlinc closed 1 month ago

Marlinc commented 2 months ago

This is still a draft

netlify[bot] commented 2 months ago

Deploy Preview for kamaji-documentation canceled.

Name Link
Latest commit 249280183b9a06ef95ac1fd88f96284baea2a164
Latest deploy log https://app.netlify.com/sites/kamaji-documentation/deploys/66c727341cc94d00089fbef7
prometherion commented 2 months ago

Several adopters reported this, and it's a neat feature/bug.

I'll be happy to have a different approach, tho: the Resource Handler is an interface with the following functions.

https://github.com/clastix/kamaji/blob/d8dfc62794acca202d6279e46a09268fb5d9fef8/internal/resources/resource.go#L23-L31

The function ShouldCleanup is the one responsible for deleting the CoreDNS, as well as other add-ons.

https://github.com/clastix/kamaji/blob/d8dfc62794acca202d6279e46a09268fb5d9fef8/internal/resources/addons/coredns.go#L77-L79

For the given CoreDNS, we're just checking if the Specification addons are set to nil, but this is not efficient and not idempotent.

Since the TenantControlPlane API is storing the status, we could rely on a different condition here, such as:

  return tcp.Spec.Addons.CoreDNS == nil && tcp.Status.Addons.CoreDNS.Enabled

The said condition should be true when CoreDNS is disabled in the spec, but the status still reports an enabled CoreDNS deployment.

Upon completion of the deletion here with an Updated resource, a status update is performed, meaning we're ending up with the correct reporting

https://github.com/clastix/kamaji/blob/d8dfc62794acca202d6279e46a09268fb5d9fef8/internal/resources/addons/coredns.go#L186-L191

The same logic could be used for the other addons, maybe we could have some trouble with the konnectivity one since it is much more complicated.