IBM / ibm-spectrum-scale-csi

The IBM Spectrum Scale Container Storage Interface (CSI) project enables container orchestrators, such as Kubernetes and OpenShift, to manage the life-cycle of persistent storage.
Apache License 2.0

Different types of shallow copies are not supported for version 2 PVC as source volume #1075

Closed saurabhwani5 closed 9 months ago

saurabhwani5 commented 10 months ago

Describe the bug

When we create a lightweight shallow copy from a version 2 PVC, the shallow copy volume gets bound, but the pod that uses it remains in the ContainerCreating state with the following error: MountVolume.SetUp failed for volume "pvc-1b0edc74-e3dd-43ef-9dd3-e32ff02d1c85" : rpc error: code = Unknown desc = NodePublishVolume - lstat [/host/ibm/fs1/pvc-75138518-1040-45e2-bc44-292c3630d80d/.snapshots/snapshot-111e3120-a2a4-4c10-bde7-b1a00ac6b8e7//] failed with error [lstat /host/ibm/fs1/pvc-75138518-1040-45e2-bc44-292c3630d80d/.snapshots/snapshot-111e3120-a2a4-4c10-bde7-b1a00ac6b8e7//: no such file or directory]
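The failing path in this message is made up of the source PVC directory, a `.snapshots` subdirectory, and the snapshot name. The following is a minimal Python sketch, assuming the error text has the shape quoted above, that pulls those pieces out so they can be checked on the node. The function name `parse_lstat_path` and the regular expression are illustrative only and are not part of the driver.

```python
import re

# Assumption: the path layout is <mount>/<source-pvc>/.snapshots/<snapshot>/,
# as seen in the error quoted in this issue.
_PATH_RE = re.compile(r"lstat \[(?P<path>[^\]]+)\]")


def parse_lstat_path(message: str) -> dict:
    """Extract the failing path and split it around the '.snapshots' component."""
    match = _PATH_RE.search(message)
    if match is None:
        raise ValueError("no lstat path found in message")
    path = match.group("path")
    # Drop empty components produced by leading, trailing, or doubled slashes.
    parts = [p for p in path.split("/") if p]
    idx = parts.index(".snapshots")  # ValueError if there is no snapshot directory
    return {
        "path": path,
        "source_volume": parts[idx - 1],
        "snapshot": parts[idx + 1],
    }
```

With the message from this issue, `source_volume` is the name of the version 2 source PV and `snapshot` is the snapshot directory that the node plugin could not find.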

How to Reproduce?


  1. Install CSI with the images from #1067, as follows:

    [root@saurabhrhel8-master Upgradetesting]# oc get pods
    NAME                                                  READY   STATUS    RESTARTS        AGE
    ibm-spectrum-scale-csi-attacher-869bd7ff6d-df4dz      1/1     Running   2 (70m ago)     5h14m
    ibm-spectrum-scale-csi-attacher-869bd7ff6d-qxrpp      1/1     Running   0               5h14m
    ibm-spectrum-scale-csi-kbtnj                          3/3     Running   0               5h14m
    ibm-spectrum-scale-csi-operator-865c5885b-922dz       1/1     Running   0               5h14m
    ibm-spectrum-scale-csi-provisioner-c48d8df47-bm6xl    1/1     Running   1 (74m ago)     5h14m
    ibm-spectrum-scale-csi-resizer-54c67667c4-dsqhm       1/1     Running   0               5h14m
    ibm-spectrum-scale-csi-snapshotter-6f4964bd9b-xpkng   1/1     Running   1 (3h15m ago)   5h14m
    ibm-spectrum-scale-csi-xmszv                          3/3     Running   0               5h14m
    [root@saurabhrhel8-master Upgradetesting]# oc describe pod | grep quay
    Image:         quay.io/hemalatha_gajendran/driver_shallowcopy_cgfix
    Image ID:      quay.io/hemalatha_gajendran/driver_shallowcopy_cgfix@sha256:804188c7e27839a4150d2d976d0115a58be877c281c3f877568dca00c831688b
    Image:         quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-operator@sha256:18264e0c9c112856bc2744f7f971a4b60ecb24de57e46af4d35456dcdf8e3cbf
    Image ID:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-operator@sha256:18264e0c9c112856bc2744f7f971a4b60ecb24de57e46af4d35456dcdf8e3cbf
      CSI_DRIVER_IMAGE:      quay.io/hemalatha_gajendran/driver_shallowcopy_cgfix
    Image:         quay.io/hemalatha_gajendran/driver_shallowcopy_cgfix
    Image ID:      quay.io/hemalatha_gajendran/driver_shallowcopy_cgfix@sha256:804188c7e27839a4150d2d976d0115a58be877c281c3f877568dca00c831688b
    [root@saurabhrhel8-master Upgradetesting]# oc get cso
    NAME                     VERSION   SUCCESS
    ibm-spectrum-scale-csi   2.11.0    True
  2. Create a version 2 StorageClass, PVC, and pod from the following YAMLs:


    apiVersion: v1
    kind: Pod
    metadata:
      name: csi-scale-fsetdemo-pod-1
      labels:
        app: nginx
    spec:
      containers:
        - name: web-server
          image: nginx
          volumeMounts:
            - name: mypvc
              mountPath: /usr/share/nginx/html/scale
          ports:
            - containerPort: 80
      volumes:
        - name: mypvc
          persistentVolumeClaim:
            claimName: scale-advance-pvc-1
            readOnly: false

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: scale-advance-pvc-1
    spec:
      accessModes:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: ibm-spectrum-scale-csi-advance
    provisioner: spectrumscale.csi.ibm.com
    parameters:
      volBackendFs: "fs1"
      version: "2"
    reclaimPolicy: Delete

    [root@saurabhrhel8-master Upgradetesting]# oc get pods
    NAME                       READY   STATUS    RESTARTS   AGE
    csi-scale-fsetdemo-pod-1   1/1     Running   0          9m49s
    [root@saurabhrhel8-master Upgradetesting]# oc get pvc
    NAME                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                     AGE
    scale-advance-pvc-1   Bound    pvc-75138518-1040-45e2-bc44-292c3630d80d   1Gi        RWX            ibm-spectrum-scale-csi-advance   9m51s

  3. Write data and take a snapshot of the same PVC:

    root@csi-scale-fsetdemo-pod-1:/usr/share/nginx/html/scale# touch test{1..100}
    root@csi-scale-fsetdemo-pod-1:/usr/share/nginx/html/scale# ls
    test1 test12 test16 test2 test23 test27 test30 test34 test38 test41 test45 test49 test52 test56 test6 test63 test67 test70 test74 test78 test81 test85 test89 test92 test96
    test10 test13 test17 test20 test24 test28 test31 test35 test39 test42 test46 test5 test53 test57 test60 test64 test68 test71 test75 test79 test82 test86 test9 test93 test97
    test100 test14 test18 test21 test25 test29 test32 test36 test4 test43 test47 test50 test54 test58 test61 test65 test69 test72 test76 test8 test83 test87 test90 test94 test98
    test11 test15 test19 test22 test26 test3 test33 test37 test40 test44 test48 test51 test55 test59 test62 test66 test7 test73 test77 test80 test84 test88 test91 test95 test99
    root@csi-scale-fsetdemo-pod-1:/usr/share/nginx/html/scale#

Snapshot YAMLs:

    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshot
    metadata:
      name: ibm-spectrum-scale-snapshot-1
    spec:
      volumeSnapshotClassName: ibm-spectrum-scale-snapshotclass-advance-1
      source:
        persistentVolumeClaimName: scale-advance-pvc-1

    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshotClass
    metadata:
      name: ibm-spectrum-scale-snapshotclass-advance-1
    driver: spectrumscale.csi.ibm.com
    parameters:
      snapWindow: "30" #Optional : Time in minutes (default=30)
    deletionPolicy: Delete

    [root@saurabhrhel8-master Upgradetesting]# oc apply -f snapapply.yaml
    volumesnapshot.snapshot.storage.k8s.io/ibm-spectrum-scale-snapshot-1 created
    volumesnapshotclass.snapshot.storage.k8s.io/ibm-spectrum-scale-snapshotclass-advance-1 created


  4. Try to create a shallow copy PVC and a pod from the snapshot, using a lightweight storage class:

    [root@saurabhrhel8-master Upgradetesting]# cat restoresnap.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: csi-scale-fsetdemo-pod-snapshot-1
      labels:
        app: nginx
    spec:
      containers:

    [root@saurabhrhel8-master Upgradetesting]# oc apply -f restoresnap.yaml
    pod/csi-scale-fsetdemo-pod-snapshot-1 created
    persistentvolumeclaim/ibm-spectrum-scale-pvc-from-snapshot-1 created
    storageclass.storage.k8s.io/ibm-spectrum-scale-csi-lw created


  5. Check the PVC status and the pod events:

    [root@saurabhrhel8-master Upgradetesting]# oc get pvc -w
    NAME                                     STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                     AGE
    ibm-spectrum-scale-pvc-from-snapshot-1   Pending                                                                        ibm-spectrum-scale-csi-lw        17s
    scale-advance-pvc-1                      Bound     pvc-75138518-1040-45e2-bc44-292c3630d80d   1Gi        RWX            ibm-spectrum-scale-csi-advance   20m
    ibm-spectrum-scale-pvc-from-snapshot-1   Pending   pvc-1b0edc74-e3dd-43ef-9dd3-e32ff02d1c85   0                         ibm-spectrum-scale-csi-lw        55s
    ibm-spectrum-scale-pvc-from-snapshot-1   Bound     pvc-1b0edc74-e3dd-43ef-9dd3-e32ff02d1c85   1Gi        ROX            ibm-spectrum-scale-csi-lw        55s
    [root@saurabhrhel8-master Upgradetesting]# oc get pods -w
    NAME                                READY   STATUS              RESTARTS   AGE
    csi-scale-fsetdemo-pod-1            1/1     Running             0          21m
    csi-scale-fsetdemo-pod-snapshot-1   0/1     ContainerCreating   0          64s
    [root@saurabhrhel8-master Upgradetesting]# oc describe pod csi-scale-fsetdemo-pod-snapshot-1
    Name:             csi-scale-fsetdemo-pod-snapshot-1
    Namespace:        ibm-spectrum-scale-csi-driver
    Priority:         0
    Service Account:  default
    Node:             saurabhrhel8-worker-1.fyre.ibm.com/10.11.103.18
    Start Time:       Thu, 21 Dec 2023 03:00:15 -0800
    Labels:           app=nginx
    Annotations:
    Status:           Pending
    IP:
    IPs:
    Containers:
      web-server:
        Container ID:
        Image:          nginx
        Image ID:
        Port:           80/TCP
        Host Port:      0/TCP
        State:          Waiting
          Reason:       ContainerCreating
        Ready:          False
        Restart Count:  0
        Environment:
        Mounts:
          /usr/share/nginx/html/scale from mypvc (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9fctb (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      mypvc:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
        ClaimName:  ibm-spectrum-scale-pvc-from-snapshot-1
        ReadOnly:   false
      kube-api-access-9fctb:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:
        DownwardAPI:             true
    QoS Class:                   BestEffort
    Node-Selectors:
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason                  Age                From                     Message
      Warning  FailedScheduling        107s               default-scheduler        0/3 nodes are available: persistentvolumeclaim "ibm-spectrum-scale-pvc-from-snapshot-1" not found. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
      Warning  FailedScheduling        105s               default-scheduler        0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
      Warning  FailedScheduling        51s (x2 over 93s)  default-scheduler        0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
      Normal   Scheduled               43s                default-scheduler        Successfully assigned ibm-spectrum-scale-csi-driver/csi-scale-fsetdemo-pod-snapshot-1 to saurabhrhel8-worker-1.fyre.ibm.com
      Normal   SuccessfulAttachVolume  38s                attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-1b0edc74-e3dd-43ef-9dd3-e32ff02d1c85"
      Warning  FailedMount             4s (x7 over 36s)   kubelet                  MountVolume.SetUp failed for volume "pvc-1b0edc74-e3dd-43ef-9dd3-e32ff02d1c85" : rpc error: code = Unknown desc = NodePublishVolume - lstat [/host/ibm/fs1/pvc-75138518-1040-45e2-bc44-292c3630d80d/.snapshots/snapshot-111e3120-a2a4-4c10-bde7-b1a00ac6b8e7//] failed with error [lstat /host/ibm/fs1/pvc-75138518-1040-45e2-bc44-292c3630d80d/.snapshots/snapshot-111e3120-a2a4-4c10-bde7-b1a00ac6b8e7//: no such file or directory]
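When repeating this reproduction, it can help to pick out the pods that never leave ContainerCreating from the `oc get pods` text. The sketch below is a small Python helper for that, assuming the default column layout with `NAME` and `STATUS` headers. The function name `stuck_pods` is hypothetical and the helper only parses text; it does not call `oc` itself.

```python
from __future__ import annotations


def stuck_pods(listing: str, status: str = "ContainerCreating") -> list[str]:
    """Return pod names whose STATUS column equals `status` in `oc get pods` output."""
    lines = [ln for ln in listing.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    name_col = header.index("NAME")
    status_col = header.index("STATUS")
    names = []
    for line in lines[1:]:
        # Whitespace splitting is safe here because NAME, READY and STATUS
        # never contain spaces; only later columns (RESTARTS) can.
        fields = line.split()
        if len(fields) > status_col and fields[status_col] == status:
            names.append(fields[name_col])
    return names
```

Fed the pod listing from this step, the helper returns only `csi-scale-fsetdemo-pod-snapshot-1`.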



Expected behavior

An error message should be returned when the lightweight shallow copy PVC is created if this combination is not supported; otherwise the pod should reach the Running state.

Logs

/scale-csi/D.1075
csisnap.tar.gz

saurabhwani5 commented 10 months ago

The issue is also seen when the shallow copy volume is an independent or dependent volume of version 1 and the source PVC is of version 2.

hemalathagajendran commented 9 months ago

Image: quay.io/hemalatha_gajendran/driver_shallowcopy_validation

saurabhwani5 commented 9 months ago

The issue is fixed. We have added validation so that a shallow copy can be created only from a version 2 source volume to a version 2 volume; for any other combination, the following error message is shown:

 'message': 'failed to provision volume with StorageClass '
            '"restore-sc-nuimnaz": rpc error: code = Internal desc = '
            'CreateVolume ValidateShallowCopyVolume failed',
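
The rule described in this comment can be sketched as a small check. This is a Python illustration that reads the comment literally (only version 2 to version 2 is accepted); it is not the driver's actual `ValidateShallowCopyVolume` code, and the names `validate_shallow_copy` and `ShallowCopyValidationError` are hypothetical.

```python
class ShallowCopyValidationError(Exception):
    """Raised when a shallow copy request uses an unsupported version combination."""


def validate_shallow_copy(source_version: str, target_version: str) -> None:
    """Accept only a version 2 source with a version 2 shallow copy.

    Assumption: versions are the string values of the StorageClass
    `version` parameter, for example "1" or "2".
    """
    if source_version == "2" and target_version == "2":
        return
    raise ShallowCopyValidationError(
        f"shallow copy from a version {source_version} source to a "
        f"version {target_version} volume is not supported"
    )
```

Rejecting the request at provisioning time means the PVC never becomes Bound, so the unsupported combination is reported at PVC creation instead of as a mount failure on the node.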

Thanks @hemalathagajendran for the fix!