rancher / rancher

Complete container management platform
http://rancher.com
Apache License 2.0

[BUG] Datastore path is lost on a RKE2 downstream cluster on VMware if the datastore is moved #43339

Open JeanGau-ops opened 1 year ago

JeanGau-ops commented 1 year ago

Global information
Rancher version: 2.7.5
Kubernetes version: 1.24.14+rke2r1
Cluster type: Downstream, VMware vSphere RKE2

Describe the bug
We ran into a problem on our RKE2 clusters deployed on VMware vSphere. Our infrastructure team moved some datastores (where the nodes were) into subfolders, and our clusters lost track of them. It had no effect until we tried to upgrade Kubernetes. While editing the cluster, the datastore selected was the first in the list and not the real one. This caused the entire cluster to be redeployed, even though we reselected the real datastore before saving.

To Reproduce

Expected Result
The cluster configuration should be able to keep track of the datastore. Maybe the datastore should be identified by URL rather than by name, which is how the CSI is configured, for example.
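
To illustrate the suggestion above, here is a minimal sketch (not Rancher or vSphere code) of why a reference stored as a name/inventory path stops resolving once a datastore is moved into a subfolder, while a reference stored as a URL would still resolve. The `Datastore` class, the two lookup helpers, and the sample paths and URL are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Datastore:
    # Hypothetical model: the path changes when the datastore is moved
    # to another folder, the URL is assumed to stay the same.
    inventory_path: str
    url: str


def find_by_path(inventory: List[Datastore], path: str) -> Optional[Datastore]:
    """Look up a datastore by its full inventory path."""
    return next((d for d in inventory if d.inventory_path == path), None)


def find_by_url(inventory: List[Datastore], url: str) -> Optional[Datastore]:
    """Look up a datastore by its URL."""
    return next((d for d in inventory if d.url == url), None)


# Made-up sample values, for illustration only.
SAMPLE_URL = "ds:///vmfs/volumes/example-uuid/"
before_move = [Datastore("/DC1/datastore/ds-k8s", SAMPLE_URL)]
after_move = [Datastore("/DC1/datastore/subfolder/ds-k8s", SAMPLE_URL)]

# What a cluster configuration might have saved before the move.
saved_path = "/DC1/datastore/ds-k8s"
saved_url = SAMPLE_URL

path_hit = find_by_path(after_move, saved_path)  # stale path, no match
url_hit = find_by_url(after_move, saved_url)     # still matches

print("lookup by path after move:", path_hit)
print("lookup by URL after move:", url_hit)
```

In this sketch the path lookup returns nothing after the move, which is the situation in which a UI could fall back to the first entry in its list, as described in the bug report.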

Workaround
I haven't tried one yet. The cluster creates a VmwarevsphereConfig, which contains the path of the datastore. In my case it's the old path with the subfolder, which has now been deleted. Can I just update this YAML with the correct path without risking a full redeployment of the cluster when clicking the save button? And what about when upgrading the Kubernetes version afterwards?

JeanGau-ops commented 12 months ago

OK, so I tried changing the value in VmwarevsphereConfig in a test environment, and it also redeploys the entire cluster. The provisioning then got stuck, as mentioned in https://github.com/rancher/rancher/issues/42709. So my "workaround" doesn't work.

Oats87 commented 11 months ago

This one is pretty difficult, as it essentially requires reconciling the rke-machine-state with the new datastore location. You can find it as a secret in the local cluster, but you would need to do some magic to manipulate it manually.
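
As a rough, read-only sketch of the "find it as a secret" step: the helper below filters the JSON printed by `kubectl get secrets -A -o json` (run against the local cluster) for candidate machine-state secrets. The `-machine-state` name suffix is an assumption made for this example, not something confirmed in this thread, and the namespace and secret names in the sample data are made up. It only lists candidates and does not attempt the manual manipulation itself.

```python
import json
from typing import List, Tuple


def machine_state_secrets(
    secret_list_json: str, name_suffix: str = "-machine-state"
) -> List[Tuple[str, str]]:
    """Return (namespace, name) pairs for secrets whose name ends with
    name_suffix. The default suffix is an assumption, adjust as needed."""
    items = json.loads(secret_list_json).get("items", [])
    return [
        (s["metadata"]["namespace"], s["metadata"]["name"])
        for s in items
        if s["metadata"]["name"].endswith(name_suffix)
    ]


# Made-up sample shaped like a Kubernetes Secret list, for illustration only.
sample = json.dumps(
    {
        "kind": "List",
        "items": [
            {"metadata": {"namespace": "fleet-default", "name": "node-1-machine-state"}},
            {"metadata": {"namespace": "fleet-default", "name": "some-other-secret"}},
        ],
    }
)

found = machine_state_secrets(sample)
print(found)
```

In practice the input would come from something like `kubectl get secrets -A -o json > secrets.json`, with the file contents passed to `machine_state_secrets`.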

@bfbachmann implemented a feature that allows updating credentials: https://github.com/rancher/rancher/issues/40608. I wonder if it could be expanded to cover updates like the datastore, although I would say it's generally desirable that "rancher" itself command the move of the VM from one datastore to another, rather than trying to reconcile that move after the fact, as there are a TON of edge cases around this.