IBM / ibm-spectrum-scale-csi

The IBM Spectrum Scale Container Storage Interface (CSI) project enables container orchestrators, such as Kubernetes and OpenShift, to manage the life-cycle of persistent storage.
Apache License 2.0

Shallow copy pod on OCP with RHEL worker nodes stays in CreateContainerError state #1127

Open saurabhwani5 opened 4 months ago

saurabhwani5 commented 4 months ago

Describe the bug

When creating a shallow copy pod from a shallow copy volume on an OCP cluster with RHEL worker nodes, the pod goes into CreateContainerError with the following error:

How to Reproduce?

  1. Install the CSI GM build on an OCP cluster with RHEL worker nodes, as follows:
    [root@ocp1-helper Upgradetesting]# oc get nodes
    NAME                   STATUS   ROLES                  AGE   VERSION
    master0.ocp1.vmlocal   Ready    control-plane,master   69d   v1.26.9+07c0911
    master1.ocp1.vmlocal   Ready    control-plane,master   69d   v1.26.9+07c0911
    master2.ocp1.vmlocal   Ready    control-plane,master   69d   v1.26.9+07c0911
    worker0.ocp1.vmlocal   Ready    worker                 69d   v1.26.9+07c0911
    worker1.ocp1.vmlocal   Ready    worker                 69d   v1.26.9+07c0911
    worker2.ocp1.vmlocal   Ready    worker                 68d   v1.26.12+dedb61b
    worker3.ocp1.vmlocal   Ready    worker                 68d   v1.26.12+dedb61b
    worker4.ocp1.vmlocal   Ready    worker                 68d   v1.26.12+dedb61b
    [root@ocp1-helper Upgradetesting]# oc get cso
    NAME                     VERSION   SUCCESS
    ibm-spectrum-scale-csi   2.11.0    True
    [root@ocp1-helper Upgradetesting]# oc describe pod | grep quay
    Image:         quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.11.0-GM
    Image ID:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:b2bc343eadbc11d9ed74a8477d2cd0a7a8460a72203d3f6236d4662e68df1166
    Normal  Pulled     73m   kubelet            Container image "quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.11.0-GM" already present on machine
    Image:         quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.11.0-GM
    Image ID:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:b2bc343eadbc11d9ed74a8477d2cd0a7a8460a72203d3f6236d4662e68df1166
    Normal  Pulled     73m   kubelet            Container image "quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.11.0-GM" already present on machine
    Image:         quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-operator:v2.11.0-GM
    Image ID:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-operator@sha256:97adea43e18091d62a6cc6049106c6fbd860e62d6ccd952c98b626a6bb78fb92
      CSI_DRIVER_IMAGE:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.11.0-GM
    Normal  Pulling         55m   kubelet            Pulling image "quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-operator:v2.11.0-GM"
    Normal  Pulled          55m   kubelet            Successfully pulled image "quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-operator:v2.11.0-GM" in 3.119739451s (3.119755431s including waiting)
  2. Create the PVC and pod:
    
    [root@ocp1-helper Upgradetesting]# oc apply -f apply.yaml
    pod/csi-scale-fsetdemo-pod-2 created
    persistentvolumeclaim/scale-advance-pvc-1 created
    storageclass.storage.k8s.io/ibm-spectrum-scale-csi-advance created
    [root@ocp1-helper Upgradetesting]# cat apply.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: csi-scale-fsetdemo-pod-2
      labels:
        app: nginx
    spec:
      containers:
        - name: web-server
          image: docker-na-public.artifactory.swg-devops.com/sys-spectrum-scale-team-test-environment-docker-local/nginx:1.22.0
          volumeMounts:
            - name: mypvc
              mountPath: /usr/share/nginx/html/scale
          ports:
            - containerPort: 80
      volumes:
        - name: mypvc
          persistentVolumeClaim:
            claimName: scale-advance-pvc-1
            readOnly: false

    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: scale-advance-pvc-1
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 1Gi
      storageClassName: ibm-spectrum-scale-csi-advance
    ---
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: ibm-spectrum-scale-csi-advance
    provisioner: spectrumscale.csi.ibm.com
    parameters:
      volBackendFs: "fs1"
      version: "2"
    reclaimPolicy: Delete
    [root@ocp1-helper Upgradetesting]# oc get pvc -w
    NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                     AGE
    scale-advance-pvc-1   Pending                                                                        ibm-spectrum-scale-csi-advance   46s
    scale-advance-pvc-1   Pending   pvc-ec50c949-9c14-405b-b92d-fe70cdb13210   0                         ibm-spectrum-scale-csi-advance   65s
    scale-advance-pvc-1   Bound     pvc-ec50c949-9c14-405b-b92d-fe70cdb13210   1Gi        RWX            ibm-spectrum-scale-csi-advance   65s
    [root@ocp1-helper Upgradetesting]# oc get pods -w
    NAME                       READY   STATUS    RESTARTS   AGE
    csi-scale-fsetdemo-pod-2   1/1     Running   0          88s

  3. Take a snapshot of the above PVC:

    [root@ocp1-helper Upgradetesting]# cat snapshot.yaml
    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshot
    metadata:
      name: ibm-spectrum-scale-snapshot
    spec:
      volumeSnapshotClassName: ibm-spectrum-scale-snapshotclass-advance
      source:
        persistentVolumeClaimName: scale-advance-pvc-1
    ---
    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshotClass
    metadata:
      name: ibm-spectrum-scale-snapshotclass-advance
    driver: spectrumscale.csi.ibm.com
    parameters:
      snapWindow: "30"   # Optional: time in minutes (default=30)
    deletionPolicy: Delete
    [root@ocp1-helper Upgradetesting]# oc apply -f snapshot.yaml
    volumesnapshot.snapshot.storage.k8s.io/ibm-spectrum-scale-snapshot created
    volumesnapshotclass.snapshot.storage.k8s.io/ibm-spectrum-scale-snapshotclass-advance unchanged
    [root@ocp1-helper Upgradetesting]# oc get vs
    NAME                          READYTOUSE   SOURCEPVC             SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS                               SNAPSHOTCONTENT                                    CREATIONTIME   AGE
    ibm-spectrum-scale-snapshot   true         scale-advance-pvc-1                           1Gi           ibm-spectrum-scale-snapshotclass-advance    snapcontent-e6474bc9-24b9-4060-856c-0624cd13ef86   26s            55s


  4. Create the shallow copy volume and pod from the snapshot (the shallowcopy.yaml capture below is truncated; a hypothetical completion is sketched after these steps):

    [root@ocp1-helper Upgradetesting]# cat shallowcopy.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: csi-scale-fsetdemo-pod-snapshot
      labels:
        app: nginx
    spec:
      containers:
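
The shallowcopy.yaml capture above ends at `containers:`. For reference, here is a hypothetical completion, assuming the shallow-copy pattern of requesting a ReadOnlyMany PVC whose dataSource is the step 3 VolumeSnapshot; the claim name scale-advance-pvc-snapshot and the plain nginx image are illustrative, not taken from the original report:

    apiVersion: v1
    kind: Pod
    metadata:
      name: csi-scale-fsetdemo-pod-snapshot
      labels:
        app: nginx
    spec:
      containers:
        - name: web-server
          image: nginx:1.22.0                        # illustrative image
          volumeMounts:
            - name: mypvc
              mountPath: /usr/share/nginx/html/scale
      volumes:
        - name: mypvc
          persistentVolumeClaim:
            claimName: scale-advance-pvc-snapshot    # illustrative name
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: scale-advance-pvc-snapshot               # illustrative name
    spec:
      accessModes:
        - ReadOnlyMany        # assumption: a read-only clone from a snapshot is what requests a shallow copy
      resources:
        requests:
          storage: 1Gi
      storageClassName: ibm-spectrum-scale-csi-advance
      dataSource:
        name: ibm-spectrum-scale-snapshot
        kind: VolumeSnapshot
        apiGroup: snapshot.storage.k8s.io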
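
As an aside for steps 2 and 3, the interactive watches (`-w`) can be replaced with non-interactive readiness gates; a minimal sketch, assuming an oc client new enough to support `--for=jsonpath` (kubectl 1.23+, which the v1.26 cluster above satisfies):

    # Block until the PVC is bound and the snapshot is ready to use.
    oc wait --for=jsonpath='{.status.phase}'=Bound pvc/scale-advance-pvc-1 --timeout=120s
    oc wait --for=jsonpath='{.status.readyToUse}'=true volumesnapshot/ibm-spectrum-scale-snapshot --timeout=120s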

Expected behavior

The shallow copy pod should reach Running state on an OCP cluster with RHEL worker nodes.

Data Collection and Debugging

CSI Snap

/scale-csi/D.1127
csisnap.tar.gz
saurabhwani5 commented 4 months ago

RHEL worker node:

[root@worker3 ~]# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   permissive
Mode from config file:          permissive
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      33
hemalathagajendran commented 4 months ago

This is the expected behaviour when SELinux is enabled on the cluster, because the snapshot path is read-only. This is mentioned in the design doc, and the driver is working as expected. It will be documented in the Knowledge Center (KC) as well.
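
A minimal sketch for confirming the read-only root cause from a RHEL worker node, assuming fs1 is mounted at /mnt/fs1 and using placeholder fileset and snapshot names (GPFS exposes a fileset's .snapshots directory read-only, and the SELinux relabel attempted at container creation amounts to a write on that path):

    # Paths are illustrative; substitute the real fileset and snapshot names.
    ls -ldZ /mnt/fs1/<fileset>/.snapshots/<snapshot>/
    # Any write, including setting SELinux labels, should fail:
    touch /mnt/fs1/<fileset>/.snapshots/<snapshot>/probe   # expected: Read-only file system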