kubernetes-sigs / vsphere-csi-driver

vSphere storage Container Storage Interface (CSI) plugin
https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/index.html
Apache License 2.0

vSAN witness and topology causes errors in csi provisioner #2901

Closed jcpowermac closed 1 month ago

jcpowermac commented 1 month ago

/kind bug

What happened:

Implementing a new vSphere CI environment that includes a vSAN witness for a two-node vSAN cluster causes csi-provisioner to fail with:

E0520 15:21:58.764707       1 controller.go:957] error syncing claim "07e00f4e-07d9-4013-bf0d-8349f1640440": failed to provision volume with StorageClass "thin-csi": rpc error: code = Internal desc = failed to get shared datastores for topology segments [map[topology.csi.vmware.com/openshift-region:us-east topology.csi.vmware.com/openshift-zone:us-east-1a]] in vCenter "vcenter.ci.ibmc.devcluster.openshift.com". Error: failed to fetch hosts belonging to topology segment map[topology.csi.vmware.com/openshift-region:us-east topology.csi.vmware.com/openshift-zone:us-east-1a]. Error: failed to fetch hosts from entity Datacenter:datacenter-3. Error: failed to fetch hosts from entity {ManagedEntity:{ExtensibleManagedObject:{Self:Datacenter:datacenter-3 Value:[] AvailableField:[]} Parent:<nil> CustomValue:[] OverallStatus: ConfigStatus: ConfigIssue:[] EffectiveRole:[] Permission:[] Name: DisabledMethod:[] RecentTask:[] DeclaredAlarmState:[] TriggeredAlarmState:[] AlarmActionsEnabled:<nil> Tag:[]} VmFolder:: HostFolder:Folder:group-h5 DatastoreFolder:: NetworkFolder:: Datastore:[] Network:[] Configuration:{DynamicData:{} DefaultHardwareVersionKey: MaximumHardwareVersionKey:}}. Error: failed to fetch hosts from entity ComputeResource:domain-s2168. Error: unrecognised entity type found ComputeResource:domain-s2168.

What you expected to happen:

The CSI driver should ignore witness ESXi appliances.

How to reproduce it (as minimally and precisely as possible):

Always

Anything else we need to know?:

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_release/51894/rehearse-51894-periodic-ci-openshift-release-master-nightly-4.16-e2e-vsphere-zones/1792568254573580288/artifacts/e2e-vsphere-zones/gather-extra/artifacts/pods/openshift-cluster-csi-drivers_vmware-vsphere-csi-driver-controller-77c6f57c6c-hs6bg_csi-provisioner.log

Environment:

jcpowermac commented 1 month ago

image

image

jcpowermac commented 1 month ago

cc: @gnufied @jsafrane

jcpowermac commented 1 month ago

The current workaround is to set the CSI user's permission to No Access on those vCenter objects. It still feels like a bug, though.


shalini-b commented 1 month ago

This issue has been resolved by PR https://github.com/kubernetes-sigs/vsphere-csi-driver/pull/2685. Is it possible for you to upgrade the CSI driver to v3.2.0?

jcpowermac commented 1 month ago

> This issue has been resolved by #2685 PR. Is it possible for you to upgrade the CSI driver to v3.2.0?

Thanks @shalini-b!