Closed: dan-nawrocki closed this issue 2 months ago
Could you elaborate on how this VM was created? Have you considered these steps when setting up your OS images?
Yes, I have completed the steps you mentioned. That being said, my VM disk was NOT created from the openshift-virtualization-os-images. It's an export of a VM we had set up in Red Hat Virtualization. My steps to get here are a bit convoluted:
This won't work as we don't support RWO to RWX transformation. We're sort of treating this as a bug at the moment and it will hopefully be fixed in the next version a few months out.
You can however fix this if you're handy with REST APIs. You need to set `multi_initiator: true` on the backend volume. I believe there's a Nimble CLI way to do this as well, but I'm not sure.
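As a rough sketch (the exact call the reporter used is further down), this is roughly what the two REST requests look like, assuming the NimbleOS API on port 5392; the token endpoint, payload shapes, and volume ID lookup are from memory, so verify them against your array's API documentation:

```shell
# Obtain a session token from the array (assumed NimbleOS REST token endpoint)
curl -k -X POST https://my-nimble-host:5392/v1/tokens \
  -d '{"data": {"username": "admin", "password": "REDACTED"}}'

# Use the returned session_token to set multi_initiator on the backend volume
curl -k -X PUT https://my-nimble-host:5392/v1/volumes/<volume-id> \
  -H 'X-Auth-Token: <session_token>' \
  -d '{"data": {"multi_initiator": true}}'
```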
Could I export my RWO volume and re-import it as RWX?
Once a volume is RWO, RWX is a no-go, unfortunately. The same bug applies.
I just updated to CSI driver 2.4.2 and have the same problem.
I've figured out how to set `multi_initiator`; however, it appears that this parameter only applies to iSCSI volumes. I'm using FC and getting this error when setting the flag:
```shell
curl -k -H 'X-Auth-Token: REDACTED' https://my-nimble-host:5392/v1/volumes/my-vol-id -d '{"data": {"multi_initiator": true}}' -X PUT
{"messages":[{"code":"SM_http_bad_request","severity":"error","text":"The request could not be understood by the server."},{"code":"SM_unexpected_arg","severity":"error","arguments":{"arg":"multi_initiator"},"text":"Unexpected argument 'multi_initiator'."}]}
```
I did notice that volumes I manually create on the Nimble have `multi_initiator` set to `true`.
I tried to import the disk to a new PVC with RWX mode, but `multi_initiator` was still false on the newly created Nimble volume too.

```shell
virtctl image-upload pvc rwx-test-2 --size=64Gi --image-path=rhel8-template.img.gz --block-volume --storage-class=nimble-san-sc --access-mode ReadWriteMany
```
It looks like https://github.com/hpe-storage/csi-driver/pull/40/files should create new volumes with `multi_initiator` set to `true`.
This is very strange. I copied and pasted your `curl` command into my environment and I can flick `multi_initiator` back and forth between true and false with no problem on an FC array with a blank, unattached volume. Does your volume have any connections to it? I added a dummy initiator to my volume and that didn't matter.
There are no connections on my volume. I get the same error whether or not the volume is online. I can toggle the online state using curl, so it's not an obvious dumb error on my part.
What do you mean by "blank unattached volume"? I have a PVC (RWX) bound to the PV; however, the VM is turned off, so the PVC isn't in active use.
I've got a Nimble with NimbleOS 6.1.2.400-1048557-opt in case that matters.
Your `curl` is spot on; I copied and pasted it with no problem. I'm using NimbleOS 6.0.0 in my case, and I can't see why this would've changed. I'll upgrade my array and see where it goes.
I updated my array. No change. Can we dig out some logs to see what's actually happening on your cluster?
The CSP logs should reveal some clues: `oc logs -nhpe-storage deploy/nimble-csp`.
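If it helps while collecting them, a couple of common variations on that command, assuming the default namespace and deployment name used by the CSI Operator:

```shell
# Last few hundred lines of the Nimble CSP
oc logs -n hpe-storage deploy/nimble-csp --tail=300

# Logs from the previous container instance, in case the CSP restarted
oc logs -n hpe-storage deploy/nimble-csp --previous
```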
Also now confirmed on OCP 4.14 that the following combination indeed set `multi_initiator: true` on the volume. Using CSI Operator v2.4.2.
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
  name: hpe-standard-fc
parameters:
  csi.storage.k8s.io/controller-expand-secret-name: hpe-backend-fc
  csi.storage.k8s.io/controller-expand-secret-namespace: hpe-storage
  csi.storage.k8s.io/controller-publish-secret-name: hpe-backend-fc
  csi.storage.k8s.io/controller-publish-secret-namespace: hpe-storage
  csi.storage.k8s.io/fstype: xfs
  csi.storage.k8s.io/node-publish-secret-name: hpe-backend-fc
  csi.storage.k8s.io/node-publish-secret-namespace: hpe-storage
  csi.storage.k8s.io/node-stage-secret-name: hpe-backend-fc
  csi.storage.k8s.io/node-stage-secret-namespace: hpe-storage
  csi.storage.k8s.io/provisioner-secret-name: hpe-backend-fc
  csi.storage.k8s.io/provisioner-secret-namespace: hpe-storage
  description: Volume created by the HPE CSI Driver for Kubernetes
  destroyOnDelete: "true"
  accessProtocol: fc
provisioner: csi.hpe.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-first-fc-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 32Gi
  storageClassName: hpe-standard-fc
  volumeMode: Block
```
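For anyone reproducing this, a quick way to apply and check the result (the file name is illustrative; the resource names match the manifests above):

```shell
# Create the StorageClass and PVC from the manifests above
oc apply -f hpe-standard-fc.yaml

# The claim should report Bound with RWX access mode and Block volume mode
oc get pvc my-first-fc-pvc
oc get pv "$(oc get pvc my-first-fc-pvc -o jsonpath='{.spec.volumeName}')"
```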
I updated to OpenShift 4.14; same behavior.
I'm starting to think this is a Nimble problem. I created a brand-new volume using the Nimble UI, and I can't seem to get `multi_initiator` set to `true`. I've tried various data protection options and access options, but nothing seems to flip `multi_initiator` to `true`.
Yeah, I think so too. Nimble support should be able to add some clarity here.
I created a ticket, but I think I beat them to the punch :)
Turns out iSCSI has to be enabled, even for FC-only configurations. Once I ran `group --edit --iscsi_enabled yes` on the Nimble, I can confirm that new PVCs have `multi_initiator` set correctly. I'm going to assume that a software update at some point turned iSCSI off, since my very old volumes are multi-initiator.
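For reference, a minimal way to double-check the fix: the group command is the one above, run on the array CLI; the REST call mirrors the earlier curl and assumes the per-volume GET returns the detailed attribute set, so treat it as a sketch:

```shell
# On the Nimble array CLI: enable iSCSI on the group, even for FC-only use
group --edit --iscsi_enabled yes

# Then, for a freshly provisioned PVC, confirm the flag on the backend volume
curl -k -H 'X-Auth-Token: REDACTED' \
  https://my-nimble-host:5392/v1/volumes/<new-volume-id> | grep multi_initiator
```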
A quick test showed that live migration is working now. Thanks for the help!
Thanks for confirming. What an obscure finding. We need to document this.
I am using CSI driver 2.4.1 with OpenShift 4.12.37, OpenShift Virtualization 4.12.10, and NimbleOS 6.1.2.400. I have configured the live migration settings and am using the ReadWriteMany access mode for Block volumes. The setup appears to work correctly; however, after a live migration, the CSI driver appears to take my volume Offline and the VM moves to the Paused state. When I manually set the volume Online, the VM resumes successfully.
Here are my steps:
Any ideas on how to prevent the volume from being set offline?