Open a13x5 opened 1 day ago
The repro steps suggest that an unused provider has been deleted. According to the mentioned #584 and the unmentioned #574 PR, where the validation occurs, this is the expected behavior. If the removed provider does not have any `clustertemplates`, or has some that are not used by any `managedclusters` objects, then it is allowed to remove the provider and drop the helm releases whose names do not match the names of the remaining components (providers), excluding the core's releases. There are integration tests for this particular feature of the management-ctrl.
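To make the described check easier to follow, here is a minimal sketch of the selection logic in Python. It is not the actual management-ctrl code: the `HelmRelease` type, the `releases_to_remove` helper, and the core release names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelmRelease:
    name: str
    namespace: str


# Assumed values for illustration only; the real core release names may differ.
SYSTEM_NAMESPACE = "hmc-system"
CORE_RELEASES = {"hmc", "capi"}


def releases_to_remove(releases, provider_names):
    """Select releases in the system namespace whose names match neither
    a remaining provider nor a core release."""
    keep = set(provider_names) | CORE_RELEASES
    return [
        r for r in releases
        if r.namespace == SYSTEM_NAMESPACE and r.name not in keep
    ]


releases = [
    HelmRelease("hmc", "hmc-system"),
    HelmRelease("cluster-api-provider-aws", "hmc-system"),
    HelmRelease("vsphere-dev", "hmc-system"),  # release of a managed cluster
    HelmRelease("vsphere-dev", "other-ns"),    # same name, different namespace
]

removed = releases_to_remove(releases, ["cluster-api-provider-aws"])
print([(r.name, r.namespace) for r in removed])
# prints [('vsphere-dev', 'hmc-system')]
```

Under these assumptions, a managed cluster's release in the system namespace is selected simply because its name is not in the providers list, while a release with the same name in another namespace is left alone.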
UPD. The assumption regarding the `hmc-system` namespace is correct: only the releases in the system namespace are selected during the removal check, so there is no need to test it.
In this particular case: I suppose `vsphere-dev` has not been included in the `management.providers` list, hence it is being removed. There are no labels on the helm releases installed by the template-ctrl that could narrow the selector.
At the time of implementation, there was no airgap feature, JIC.
As a suggestion: add the new component to the list of providers (I believe that is why the section exists in the first place), or the template-ctrl could add extra specific labels during create/update of `helmreleases` objects, which could then be used during that removal check in the mgmt-ctrl.
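A rough sketch of how the label-based variant might look, in Python for brevity. The label key `example.com/management-component`, the `LabeledRelease` type, and the `removable_components` helper are all hypothetical; nothing here reflects an existing label or API in the controllers.

```python
from dataclasses import dataclass, field

# Hypothetical label key; an actual implementation would choose its own.
COMPONENT_LABEL = "example.com/management-component"


@dataclass
class LabeledRelease:
    name: str
    namespace: str
    labels: dict = field(default_factory=dict)


def removable_components(releases, provider_names, system_namespace="hmc-system"):
    """Consider only releases explicitly labeled as management components,
    then drop those that no longer match a configured provider."""
    keep = set(provider_names)
    return [
        r for r in releases
        if r.namespace == system_namespace
        and r.labels.get(COMPONENT_LABEL) == "true"
        and r.name not in keep
    ]


labeled = [
    LabeledRelease("cluster-api-provider-aws", "hmc-system", {COMPONENT_LABEL: "true"}),
    LabeledRelease("vsphere-dev", "hmc-system"),  # managed cluster release, no label
]

# The AWS provider has been removed from the providers list.
dropped = removable_components(labeled, provider_names=[])
print([r.name for r in dropped])
# prints ['cluster-api-provider-aws']
```

With such a label in place, the unlabeled managed cluster release would never become a removal candidate, regardless of what the providers list contains.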
I will disagree.
I'm removing the AWS provider, and my cluster, created with the vSphere provider, gets deleted. And then recreated. How could this be expected?
The removal of a provider shouldn't result in the removal of all clusters in the `hmc-system` namespace (for all providers). This behavior is confusing (to say the least).
And adding managedclusters as providers in the Management spec is even more confusing.
If we think that this UX is ok, we should explicitly document that behavior, since it's not standard and it's not something that could be expected.
I do not quite understand what you disagree with. I've checked it myself, and only one of the already mentioned suggestions is left: add extra labels.
@zerospiel Sorry, I probably misread the original message. I had the impression that you were saying this is normal, expected behavior.
Issue
When changing providers (adding or removing), the helm releases connected to managed clusters are deleted by the `management` controller and then immediately recreated by the `managedcluster` controller.
Logs:
Repro steps
`Cluster` object is created properly
`cluster-api-provider-aws`)
`helmrelease` related to managed cluster. Note that the creation time changed and the release was recreated.
Additional note: When installed in an airgap environment, a number of public cloud controllers were not initialized properly (on purpose) by not passing the airgap flag. This caused deletion and recreation to happen every 10 seconds for all ManagedClusters. Unfortunately, I couldn't reproduce it properly.
Conclusions
Most probably caused by #584. It looks like the selector is too broad, and we should avoid deleting helm releases related to managed clusters. Also, most probably this will not affect managed clusters created in namespaces other than `hmc-system` (not tested).