The IBM Spectrum Scale Container Storage Interface (CSI) project enables container orchestrators, such as Kubernetes and OpenShift, to manage the life-cycle of persistent storage.
Apache License 2.0
Race condition when shallow copy PVC and snapshot are deleted together #1105
When a shallow copy volume and its snapshot are deleted at the same time, a race condition occurs, and in some cases the PV of the shallow copy volume is not deleted.
As far as I understand, this happens for the following reason:
Deleting the shallow copy volume removes the shallow copy directory that was created under the snapshot directory. Once that directory is gone, the snapshot deletion succeeds, because no shallow copy directory is left.
However, the shallow copy deletion checks the snapshot directory path afterwards. By then the path no longer exists, so the deletion fails with the following error:
Warning VolumeFailedDelete 14s (x2 over 34s) spectrumscale.csi.ibm.com_ibm-spectrum-scale-csi-provisioner-5864cc55bb-bmblx_c279e8db-bccd-4a21-b10e-b92f588bd2e8 rpc error: code = Unknown desc = unable to stat dir 36c17a30-3ab4-4625-9816-3af4b6a92b58-ibm-spectrum-scale-csi-driver/snapshot-564125d8-6d91-42fc-8ee5-dd869cf8eedb:[EFSSG0264C The path /ibm/fs1/36c17a30-3ab4-4625-9816-3af4b6a92b58-ibm-spectrum-scale-csi-driver/snapshot-564125d8-6d91-42fc-8ee5-dd869cf8eedb does not exist.]
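The event above carries the message ID `EFSSG0264C` ("The path ... does not exist."). As a minimal sketch, and not the driver's actual code, a delete path could recognize that ID and treat the missing directory as already deleted. The helper names `isPathMissingMessage` and `ignorePathMissing` are hypothetical, and matching on the message text is an assumption made for illustration only.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// pathMissingCode is the message ID seen in the events above when the
// storage backend reports that a path does not exist.
const pathMissingCode = "EFSSG0264C"

// isPathMissingMessage reports whether an error message carries that ID.
func isPathMissingMessage(msg string) bool {
	return strings.Contains(msg, pathMissingCode)
}

// ignorePathMissing (hypothetical helper) turns a "path does not exist"
// failure into success, because the goal of a delete is already met.
// Any other error is returned unchanged.
func ignorePathMissing(err error) error {
	if err != nil && isPathMissingMessage(err.Error()) {
		return nil
	}
	return err
}

func main() {
	missing := errors.New("unable to stat dir fileset/snapshot-x:[EFSSG0264C The path /ibm/fs1/fileset/snapshot-x does not exist.]")
	other := errors.New("unable to stat dir fileset/snapshot-x: connection refused")

	fmt.Println(ignorePathMissing(missing) == nil) // true
	fmt.Println(ignorePathMissing(other) == nil)   // false
}
```

Matching on a typed error or a status code would be more robust than matching on text, if the backend client exposes one.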
How to Reproduce?
Please list the steps to help development teams reproduce the behavior
[root@saurabh29-master Upgradetesting]# oc get vs -w
NAME READYTOUSE SOURCEPVC SOURCESNAPSHOTCONTENT RESTORESIZE SNAPSHOTCLASS SNAPSHOTCONTENT CREATIONTIME AGE
ibm-spectrum-scale-snapshot true scale-advance-pvc-1 1Gi ibm-spectrum-scale-snapshotclass-advance snapcontent-564125d8-6d91-42fc-8ee5-dd869cf8eedb 26s 50s
^C[root@saurabh29-master Upgradetesting]# oc get pvc -w
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
ibm-spectrum-scale-pvc-from-snapshot-2 Pending ibm-spectrum-scale-csi-advance 50s
scale-advance-pvc-1 Bound pvc-280bbb6d-f725-4691-adbb-5a768a66705f 1Gi RWX ibm-spectrum-scale-csi-advance 9m14s
ibm-spectrum-scale-pvc-from-snapshot-2 Pending pvc-bc4e2535-517e-44c9-a2ee-f8bdf36755d9 0 ibm-spectrum-scale-csi-advance 90s
ibm-spectrum-scale-pvc-from-snapshot-2 Bound pvc-bc4e2535-517e-44c9-a2ee-f8bdf36755d9 1Gi ROX ibm-spectrum-scale-csi-advance 90s
5. Delete the shallow copy volume and snapshot together:
[root@saurabh29-master Upgradetesting]# cat del.sh
oc delete pvc ibm-spectrum-scale-pvc-from-snapshot-2
oc delete vs ibm-spectrum-scale-snapshot --force
[root@saurabh29-master Upgradetesting]# bash del.sh
persistentvolumeclaim "ibm-spectrum-scale-pvc-from-snapshot-2" deleted
Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
volumesnapshot.snapshot.storage.k8s.io "ibm-spectrum-scale-snapshot" force deleted
6. Check the PV description:
[root@saurabh29-master ~]# oc get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE
pvc-280bbb6d-f725-4691-adbb-5a768a66705f 1Gi RWX Delete Bound ibm-spectrum-scale-csi-driver/scale-advance-pvc-1 ibm-spectrum-scale-csi-advance 23m
pvc-bc4e2535-517e-44c9-a2ee-f8bdf36755d9 1Gi ROX Delete Released ibm-spectrum-scale-csi-driver/ibm-spectrum-scale-pvc-from-snapshot-2 ibm-spectrum-scale-csi-advance 14m
[root@saurabh29-master ~]# oc describe pv pvc-bc4e2535-517e-44c9-a2ee-f8bdf36755d9
Name: pvc-bc4e2535-517e-44c9-a2ee-f8bdf36755d9
Labels:
Annotations: pv.kubernetes.io/provisioned-by: spectrumscale.csi.ibm.com
volume.kubernetes.io/provisioner-deletion-secret-name:
volume.kubernetes.io/provisioner-deletion-secret-namespace:
Finalizers: [kubernetes.io/pv-protection]
StorageClass: ibm-spectrum-scale-csi-advance
Status: Released
Claim: ibm-spectrum-scale-csi-driver/ibm-spectrum-scale-pvc-from-snapshot-2
Reclaim Policy: Delete
Access Modes: ROX
VolumeMode: Filesystem
Capacity: 1Gi
Node Affinity:
Message:
Source:
Type: CSI (a Container Storage Interface (CSI) volume source)
Driver: spectrumscale.csi.ibm.com
FSType: gpfs
VolumeHandle: 1;3;14016324136648177722;BB4A0B0A:65A5F92B;36c17a30-3ab4-4625-9816-3af4b6a92b58-ibm-spectrum-scale-csi-driver;pvc-bc4e2535-517e-44c9-a2ee-f8bdf36755d9;/ibm/fs1/36c17a30-3ab4-4625-9816-3af4b6a92b58-ibm-spectrum-scale-csi-driver/.snapshots/snapshot-564125d8-6d91-42fc-8ee5-dd869cf8eedb/pvc-280bbb6d-f725-4691-adbb-5a768a66705f
ReadOnly: false
VolumeAttributes: csi.storage.k8s.io/pv/name=pvc-bc4e2535-517e-44c9-a2ee-f8bdf36755d9
csi.storage.k8s.io/pvc/name=ibm-spectrum-scale-pvc-from-snapshot-2
csi.storage.k8s.io/pvc/namespace=ibm-spectrum-scale-csi-driver
storage.kubernetes.io/csiProvisionerIdentity=1709092347497-7554-spectrumscale.csi.ibm.com
version=2
volBackendFs=fs1
Events:
Type Reason Age From Message
Warning VolumeFailedDelete 52s spectrumscale.csi.ibm.com_ibm-spectrum-scale-csi-provisioner-5864cc55bb-bmblx_c279e8db-bccd-4a21-b10e-b92f588bd2e8 rpc error: code = Internal desc = unable to Delete shallow copy reference parent dir using FS [fs1] Error [unable to delete dir 36c17a30-3ab4-4625-9816-3af4b6a92b58-ibm-spectrum-scale-csi-driver/snapshot-564125d8-6d91-42fc-8ee5-dd869cf8eedb:[EFSSG0264C The path /ibm/fs1/36c17a30-3ab4-4625-9816-3af4b6a92b58-ibm-spectrum-scale-csi-driver/snapshot-564125d8-6d91-42fc-8ee5-dd869cf8eedb does not exist.]]
Warning VolumeFailedDelete 14s (x2 over 34s) spectrumscale.csi.ibm.com_ibm-spectrum-scale-csi-provisioner-5864cc55bb-bmblx_c279e8db-bccd-4a21-b10e-b92f588bd2e8 rpc error: code = Unknown desc = unable to stat dir 36c17a30-3ab4-4625-9816-3af4b6a92b58-ibm-spectrum-scale-csi-driver/snapshot-564125d8-6d91-42fc-8ee5-dd869cf8eedb:[EFSSG0264C The path /ibm/fs1/36c17a30-3ab4-4625-9816-3af4b6a92b58-ibm-spectrum-scale-csi-driver/snapshot-564125d8-6d91-42fc-8ee5-dd869cf8eedb does not exist.]
## Expected behavior
The shallow copy PV should be deleted even when the snapshot is deleted first.
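The CSI specification requires `DeleteVolume` to be idempotent, which matches this expectation. The sketch below shows one way the shallow copy cleanup could tolerate a snapshot directory that is already gone. It uses the local filesystem as a stand-in for the storage backend calls, assumes a single shallow copy reference per snapshot directory, and all function and directory names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// deleteShallowCopyRef removes the reference directory of a shallow copy and
// then its parent directory. A directory that is already gone counts as
// deleted, so the call succeeds whichever deletion ran first.
func deleteShallowCopyRef(parentDir, refName string) error {
	refDir := filepath.Join(parentDir, refName)
	if err := os.Remove(refDir); err != nil && !errors.Is(err, fs.ErrNotExist) {
		return fmt.Errorf("unable to delete dir %s: %w", refDir, err)
	}
	// The snapshot deletion may have removed the parent in the meantime.
	if err := os.Remove(parentDir); err != nil && !errors.Is(err, fs.ErrNotExist) {
		return fmt.Errorf("unable to delete dir %s: %w", parentDir, err)
	}
	return nil
}

// runScenario builds a throwaway directory layout and runs the cleanup,
// optionally after the snapshot side has already removed everything.
func runScenario(snapshotDeletedFirst bool) error {
	root, err := os.MkdirTemp("", "shallow-copy-race")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)

	parent := filepath.Join(root, "snapshot-demo")
	if err := os.MkdirAll(filepath.Join(parent, "pvc-demo"), 0o755); err != nil {
		return err
	}
	if snapshotDeletedFirst {
		if err := os.RemoveAll(parent); err != nil {
			return err
		}
	}
	return deleteShallowCopyRef(parent, "pvc-demo")
}

func main() {
	fmt.Println("volume deleted first:", runScenario(false))
	fmt.Println("snapshot deleted first:", runScenario(true))
}
```

Both orderings return no error in this sketch, so the provisioner could go on to remove the PV in either case.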
### Data Collection and Debugging
Setup for the reproduction (the steps that precede step 5):

1. Install CSI 2.11.0 with DCUT images.
2. Create a PVC as follows:

        apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          name: scale-advance-pvc-1
        spec:
          accessModes:

   using this StorageClass:

        apiVersion: storage.k8s.io/v1
        kind: StorageClass
        metadata:
          name: ibm-spectrum-scale-csi-advance
        provisioner: spectrumscale.csi.ibm.com
        parameters:
          volBackendFs: "fs1"
          version: "2"
        reclaimPolicy: Delete

3. Create a snapshot of the PVC:

        [root@saurabh29-master Upgradetesting]# cat snapshot.yaml
        apiVersion: snapshot.storage.k8s.io/v1
        kind: VolumeSnapshot
        metadata:
          name: ibm-spectrum-scale-snapshot
        spec:
          volumeSnapshotClassName: ibm-spectrum-scale-snapshotclass-advance
          source:
            persistentVolumeClaimName: scale-advance-pvc-1

   using this VolumeSnapshotClass:

        apiVersion: snapshot.storage.k8s.io/v1
        kind: VolumeSnapshotClass
        metadata:
          name: ibm-spectrum-scale-snapshotclass-advance
        driver: spectrumscale.csi.ibm.com
        parameters:
          snapWindow: "30" # Optional: time in minutes (default=30)
        deletionPolicy: Delete

4. Create a PVC from the snapshot:

        [root@saurabh29-master Upgradetesting]# cat restore.yaml
        apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          name: ibm-spectrum-scale-pvc-from-snapshot-2
        spec:
          accessModes:
/scale-csi/D.1105 csisnap.tar.gz