openshift / openshift-docs

OpenShift 3 and 4 product and community documentation
https://docs.openshift.com
Apache License 2.0

Persistent Volumes created by a Storage Class on OpenStack ignore Availability Zones #70191

Closed alexzose closed 1 month ago

alexzose commented 8 months ago

We've set up a new OKD 4.13 cluster on OpenStack and noticed that when Persistent Volumes are created via a Storage Class that allows volume creation in multiple availability zones, the Persistent Volumes ignore the availability zone. Their nodeAffinity section is populated with all of the OpenStack availability zones instead of only the one where the volume was created.

As a consequence, when creating Stateful Sets, pods get scheduled in different availability zones than their volumes, and the pods never manage to mount their volumes.

Note that cross-attach is not enabled on our OpenStack deployment.

The documentation of the Cinder CSI driver for OpenStack states that the optional ignore-volume-az setting is disabled by default, but we have found that this is not the case on OKD.

In the ConfigMap "cloud-conf" in the "openshift-cluster-csi-drivers" namespace, the option "ignore-volume-az = yes" is set, and this causes the issue.

After setting the mentioned option to "ignore-volume-az = no" in the ConfigMap "cloud-provider-config" in the "openshift-config" namespace, this setting is propagated to the ConfigMap "cloud-conf" mentioned above, and the issue with the volumes is resolved.
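For anyone scripting this change, here is a minimal Python sketch of the text edit itself. It only rewrites the `ignore-volume-az` line in a cloud config string and does not talk to the cluster. The function name `set_ignore_volume_az` and the sample config are hypothetical, and the `[BlockStorage]` section name is an assumption taken from the upstream Cinder CSI plugin documentation, so compare it with the actual contents of your "cloud-provider-config" ConfigMap first.

```python
import re

# Hypothetical sample; a real cloud config will contain more sections and keys.
# Assumption: the option lives under [BlockStorage], as in the upstream docs.
SAMPLE_CONF = "[BlockStorage]\nignore-volume-az = yes\n"

# Matches a line of the form "ignore-volume-az = <anything>" and captures
# everything up to and including the "=" so the original spacing is preserved.
_OPTION_RE = re.compile(r"^([ \t]*ignore-volume-az[ \t]*=[ \t]*).*$", re.MULTILINE)


def set_ignore_volume_az(conf_text: str, value: str = "no") -> str:
    """Return conf_text with every ignore-volume-az line set to `value`.

    Raises ValueError if the option is not present, rather than guessing
    where it should be added.
    """
    new_text, count = _OPTION_RE.subn(lambda m: m.group(1) + value, conf_text)
    if count == 0:
        raise ValueError("ignore-volume-az not found in config text")
    return new_text


if __name__ == "__main__":
    print(set_ignore_volume_az(SAMPLE_CONF))
```

The resulting text would then be written back to the ConfigMap in "openshift-config" by whatever means you normally use to edit it.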

The volume's nodeAffinity is then set correctly to the availability zone in which it was created, and the pods are scheduled in the correct availability zone.
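To check whether an existing volume is affected, the following Python sketch inspects a Persistent Volume manifest (for example, the parsed output of `oc get pv <name> -o json`) and collects the zones listed in its nodeAffinity. The helper names and the example manifest are hypothetical, and the topology key `topology.cinder.csi.openstack.org/zone` is an assumption about what the Cinder CSI driver writes, so verify it against one of your own volumes.

```python
# Assumption: topology key used by the Cinder CSI driver for zones.
ZONE_KEY = "topology.cinder.csi.openstack.org/zone"


def pv_zones(pv: dict, zone_key: str = ZONE_KEY) -> set:
    """Collect every zone value listed in a PV's required nodeAffinity."""
    terms = (
        pv.get("spec", {})
        .get("nodeAffinity", {})
        .get("required", {})
        .get("nodeSelectorTerms", [])
    )
    zones = set()
    for term in terms:
        for expr in term.get("matchExpressions", []):
            if expr.get("key") == zone_key and expr.get("operator") == "In":
                zones.update(expr.get("values", []))
    return zones


def is_pinned_to_one_zone(pv: dict, zone_key: str = ZONE_KEY) -> bool:
    """True when the PV's nodeAffinity names exactly one zone."""
    return len(pv_zones(pv, zone_key)) == 1


def make_example_pv(zones: list) -> dict:
    """Build a hypothetical, stripped-down PV manifest for illustration."""
    return {
        "spec": {
            "nodeAffinity": {
                "required": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {"key": ZONE_KEY, "operator": "In", "values": zones}
                            ]
                        }
                    ]
                }
            }
        }
    }


if __name__ == "__main__":
    affected = make_example_pv(["az1", "az2"])  # zone names are made up
    print(sorted(pv_zones(affected)), is_pinned_to_one_zone(affected))
```

A volume that lists more than one zone matches the behavior described above with "ignore-volume-az = yes"; a volume created after the change should list a single zone.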

The point is that this behavior is not mentioned in the OKD documentation. We think there should be a note or a warning that this option is enabled on OKD and that, if cross-attach is not enabled on OpenStack, this pod/volume availability zone mismatch will occur.

Link to the cinder CSI plugin config: https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/cinder-csi-plugin/using-cinder-csi-plugin.md#block-storage

openshift-bot commented 3 months ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot commented 2 months ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten /remove-lifecycle stale

openshift-bot commented 1 month ago

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci[bot] commented 1 month ago

@openshift-bot: Closing this issue.

In response to [this](https://github.com/openshift/openshift-docs/issues/70191#issuecomment-2261681255):

> Rotten issues close after 30d of inactivity.
>
> Reopen the issue by commenting `/reopen`.
> Mark the issue as fresh by commenting `/remove-lifecycle rotten`.
> Exclude this issue from closing again by commenting `/lifecycle frozen`.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.