govinda-attal opened 1 month ago
This issue is currently awaiting triage.
If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance. The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Hm, not sure what is going on here. We use the regular patch in a lot of our controllers, and if I remember correctly they never try to unset labels/annotations set by other users/controllers.
(So I'm wondering what we do differently here.)
looping in @g-gaston who implemented the reconcile mode for CRS.
Personally, I stick to the idea that implementing a sophisticated addon management solution is out of scope for Cluster API. CRS was always intended as a simple, stopgap solution, and IMO the way forward is not to make CRS more complex but to bridge towards what we describe in https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20220712-cluster-api-addon-orchestration.md
For context: when we first implemented this we decided to not use server side apply given we still supported clusters < 1.22. Now that the min version is 1.24, we could revisit this decision.
That said, I'm with @fabriziopandini on this. Given the complexity of the use case (capi manages creation and update of resources, but with other entities partially managing those resources as well), I'm not sure CRS is the best tool for the job. It was never designed to deal with the complexity of multiple owners, and especially not with dynamic admission controllers.
We already have an approved addon proposal (which we didn't have when we implemented updates for CRSs), and if I'm not mistaken we also have a first implementation using helm. I would suggest looking into that as an alternative solution.
Another suggestion: if you already have "secondary" pipelines that maintain the state of the resources in the workload cluster post-creation, why don't you set the CRS to the create-only ("ApplyOnce") strategy and let your pipelines manage any future updates of those resources?
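For reference, a minimal sketch of what that create-only setup could look like; the names, selector, and ConfigMap reference here are hypothetical, but ApplyOnce is the default ClusterResourceSet strategy:

```yaml
apiVersion: addons.cluster.x-k8s.io/v1beta1
kind: ClusterResourceSet
metadata:
  name: minimal-resources       # hypothetical name
  namespace: default
spec:
  strategy: ApplyOnce           # create-only: resources are applied once and not re-reconciled
  clusterSelector:
    matchLabels:
      env: dev                  # hypothetical cluster label
  resources:
    - name: base-namespaces     # hypothetical ConfigMap holding the namespace/secret manifests
      kind: ConfigMap
```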
What steps did you take and what happened?
We use capi to provision workload clusters with minimal resources. Once the clusters are ready, we use a different approach/pipeline to manage other applications on them.
We are using ClusterResourceSets to create minimal resources on workload clusters: a few namespaces with limited resources (such as config maps and secrets).
On the first pass, during the early stages of building workload clusters, this works as expected and the ClusterResourceSets are reconciled.
Once workload clusters are ready, other pipelines trigger and continuously maintain the state of the workload cluster. In these pipelines we install controllers such as the Hierarchical Namespace Controller (hnc), which applies labels like
<namespace-name>.tree.hnc.x-k8s.io/depth=0
on namespaces and governs other resources on workload clusters as per policies (captured using hnc-validating-webhook-configuration, etc.). With hnc in place, the validating webhook will validate operations being applied to resources.
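For illustration, assuming a namespace named team-a (a hypothetical name), the resulting object on the workload cluster carries a label the CRS definition knows nothing about:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a                            # hypothetical namespace created via the CRS
  labels:
    team-a.tree.hnc.x-k8s.io/depth: "0"   # added by hnc, not present in the CRS definition
```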
On the next run, capi tries to reconcile the ClusterResourceSet by applying resources as per the definition it holds (which doesn't have the hnc-specific labels). Here hnc's
validatingwebhookconfiguration
kicks in and rejects this apply operation from capi. This results in ClusterResourceSet reconciliation failure with status as
Referring to the apply functions relevant to ClusterResourceSet, it seems it is doing a basic patch here. It would be helpful to have server-side apply.
server-side-apply/#merge-strategy
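To illustrate why the merge strategy matters here, below is a stdlib-only sketch (not the actual controller code) comparing a full replace of a namespace's labels against an SSA-style merge where fields set by other managers survive; the label keys and values are hypothetical:

```go
package main

import "fmt"

// replaceLabels models a full-object update (PUT) or a patch computed only
// from the desired spec: the live label set is discarded, so labels added
// by other controllers (such as hnc) are lost, which trips hnc's webhook.
func replaceLabels(desired map[string]string) map[string]string {
	out := make(map[string]string, len(desired))
	for k, v := range desired {
		out[k] = v
	}
	return out
}

// applyLabels models server-side apply at a high level: only the fields
// this field manager declares are (re)set, while labels owned by other
// field managers (hnc's depth label) are left in place.
func applyLabels(live, desired map[string]string) map[string]string {
	out := make(map[string]string, len(live))
	for k, v := range live {
		out[k] = v
	}
	for k, v := range desired {
		out[k] = v
	}
	return out
}

func main() {
	// What the CRS definition carries (hypothetical label).
	desired := map[string]string{"team": "platform"}
	// What is live on the workload cluster after hnc ran.
	live := map[string]string{
		"team":                               "platform",
		"example-ns.tree.hnc.x-k8s.io/depth": "0", // set by hnc
	}

	fmt.Println(len(replaceLabels(desired)))     // 1: hnc label dropped
	fmt.Println(len(applyLabels(live, desired))) // 2: hnc label preserved
}
```

This is only a conceptual model; real server-side apply tracks field ownership via managedFields on the API server, but the visible effect on co-owned objects is the same.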
What did you expect to happen?
The capi ClusterResourceSet controller and other controllers such as hnc are able to live in perfect harmony.
Cluster API version
main branch
Kubernetes version
1.28
Anything else you would like to add?
No response
Label(s) to be applied
/kind bug