dell / csm

Dell Container Storage Modules (CSM)
Apache License 2.0

[BUG]: snapshot restore failed with Message = failed to get acl entries: Too many links #1514

Open ybrock opened 3 weeks ago

ybrock commented 3 weeks ago

Bug Description

Hello,

We have CSM modules 1.3.1 with CSI drivers version 1.10.1 installed on OpenShift 4.14.35 (K8s 1.27.16).

We have Dell PowerScale (Isilon) configured and running ok except for this issue.

When we try to restore a snapshot of a PVC containing a symlink, the new PVC is never created (it stays Pending) and these events are reported by the CSI driver:

failed to provision volume with StorageClass "isilon-infra":
  rpc error: code = Internal desc = failed to copy snapshot id '381307'
error 'Error Source = /ifs/data/infra/.snapshot/snapshot-6cdfa72f-370a-4998-b406-22122315369f/csi/ocx/k8s-c3c0e0478f/backup/db/latest
Message = failed to get acl entries: Too many links
Source = /ifs/data/infra/.snapshot/snapshot-6cdfa72f-370a-4998-b406-22122315369f/csi/ocx/k8s-c3c0e0478f/backup/db/latest
Target = /ifs/data/infra/csi/ocx/k8s-6e4912c502/backup/db/latest '
93m         Warning   ProvisioningFailed       persistentvolumeclaim/vol3                          failed to provision volume with StorageClass "isilon-infra": rpc error: code = Internal desc = failed to copy snapshot id '382689', error 'Error Source = /ifs/data/infra/.snapshot/snapshot-5c50547b-2a8c-4ab6-8275-e79116d6395a/csi/ocx/k8s-25b1d626b2/2,Message = failed to get acl entries: Too many links,,Source = /ifs/data/infra/.snapshot/snapshot-5c50547b-2a8c-4ab6-8275-e79116d6395a/csi/ocx/k8s-25b1d626b2/2,Target = /ifs/data/infra/csi/ocx/k8s-9fde9d56f5/2 ...

The provisioner container logs this message:

I1009 14:04:25.677248       1 controller.go:1075] Final error received, removing PVC 8bea8d03-1f57-438c-80c1-052b13d7ef6f from claims in progress
W1009 14:04:25.677258       1 controller.go:934] Retrying syncing claim "8bea8d03-1f57-438c-80c1-052b13d7ef6f", failure 138
E1009 14:04:25.677276       1 controller.go:957] error syncing claim "8bea8d03-1f57-438c-80c1-052b13d7ef6f": failed to provision volume with StorageClass "isilon-infra": rpc error: code = Internal desc = failed to copy snapshot id '382107', error 'Error Source = /ifs/data/infra/.snapshot/snapshot-8939b9e6-a7d3-4e19-b89a-2f527f5e866a/csi/ocx/k8s-8edf05408f/backup/db/latest,Message = failed to get acl entries: Too many links,,Source = /ifs/data/infra/.snapshot/snapshot-8939b9e6-a7d3-4e19-b89a-2f527f5e866a/csi/ocx/k8s-8edf05408f/backup/db/latest,Target = /ifs/data/infra/csi/ocx/k8s-8bea8d031f/backup/db/latest 

If there is no symlink in the file system, the snapshot restore works.
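
To check which paths in a volume might be affected, a small sketch like the one below lists every symlink under a directory. It assumes the PVC is mounted somewhere readable (the `/data` path in the usage note is hypothetical), and it is only a diagnostic aid, not part of the driver.

```python
import os
from typing import List, Tuple


def find_symlinks(root: str) -> List[Tuple[str, str]]:
    """Return (link_path, link_target) pairs for every symlink under root."""
    found = []
    # followlinks=False (the default) keeps os.walk from descending into
    # directories that are reached through a symlink.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # Symlinks to directories are reported in dirnames, symlinks to
        # files (and broken links) in filenames, so check both lists.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                found.append((path, os.readlink(path)))
    return sorted(found)
```

Calling `find_symlinks("/data")` inside the pod would list the links in the volume; an empty list would match the case where the restore works.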

Logs

I1009 14:04:12.366227       1 leaderelection.go:281] successfully renewed lease dell-csm/csi-isilon-dellemc-com
I1009 14:04:17.374465       1 leaderelection.go:281] successfully renewed lease dell-csm/csi-isilon-dellemc-com
I1009 14:04:22.390114       1 leaderelection.go:281] successfully renewed lease dell-csm/csi-isilon-dellemc-com
I1009 14:04:25.677181       1 connection.go:251] GRPC response: {}
I1009 14:04:25.677198       1 connection.go:252] GRPC error: rpc error: code = Internal desc = failed to copy snapshot id '382107', error 'Error Source = /ifs/data/infra/.snapshot/snapshot-8939b9e6-a7d3-4e19-b89a-2f527f5e866a/csi/ocx/k8s-8edf05408f/backup/db/latest,Message = failed to get acl entries: Too many links,,Source = /ifs/data/infra/.snapshot/snapshot-8939b9e6-a7d3-4e19-b89a-2f527f5e866a/csi/ocx/k8s-8edf05408f/backup/db/latest,Target = /ifs/data/infra/csi/ocx/k8s-8bea8d031f/backup/db/latest 
'
I1009 14:04:25.677213       1 controller.go:848] CreateVolume failed, supports topology = false, node selected false => may reschedule = false => state = Finished: rpc error: code = Internal desc = failed to copy snapshot id '382107', error 'Error Source = /ifs/data/infra/.snapshot/snapshot-8939b9e6-a7d3-4e19-b89a-2f527f5e866a/csi/ocx/k8s-8edf05408f/backup/db/latest,Message = failed to get acl entries: Too many links,,Source = /ifs/data/infra/.snapshot/snapshot-8939b9e6-a7d3-4e19-b89a-2f527f5e866a/csi/ocx/k8s-8edf05408f/backup/db/latest,Target = /ifs/data/infra/csi/ocx/k8s-8bea8d031f/backup/db/latest 
'
I1009 14:04:25.677248       1 controller.go:1075] Final error received, removing PVC 8bea8d03-1f57-438c-80c1-052b13d7ef6f from claims in progress
W1009 14:04:25.677258       1 controller.go:934] Retrying syncing claim "8bea8d03-1f57-438c-80c1-052b13d7ef6f", failure 138
E1009 14:04:25.677276       1 controller.go:957] error syncing claim "8bea8d03-1f57-438c-80c1-052b13d7ef6f": failed to provision volume with StorageClass "isilon-infra": rpc error: code = Internal desc = failed to copy snapshot id '382107', error 'Error Source = /ifs/data/infra/.snapshot/snapshot-8939b9e6-a7d3-4e19-b89a-2f527f5e866a/csi/ocx/k8s-8edf05408f/backup/db/latest,Message = failed to get acl entries: Too many links,,Source = /ifs/data/infra/.snapshot/snapshot-8939b9e6-a7d3-4e19-b89a-2f527f5e866a/csi/ocx/k8s-8edf05408f/backup/db/latest,Target = /ifs/data/infra/csi/ocx/k8s-8bea8d031f/backup/db/latest 
'
I1009 14:04:25.677370       1 event.go:364] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"evs-crunchy", Name:"danthe", UID:"8bea8d03-1f57-438c-80c1-052b13d7ef6f", APIVersion:"v1", ResourceVersion:"2652820", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "isilon-infra": rpc error: code = Internal desc = failed to copy snapshot id '382107', error 'Error Source = /ifs/data/infra/.snapshot/snapshot-8939b9e6-a7d3-4e19-b89a-2f527f5e866a/csi/ocx/k8s-8edf05408f/backup/db/latest,Message = failed to get acl entries: Too many links,,Source = /ifs/data/infra/.snapshot/snapshot-8939b9e6-a7d3-4e19-b89a-2f527f5e866a/csi/ocx/k8s-8edf05408f/backup/db/latest,Target = /ifs/data/infra/csi/ocx/k8s-8bea8d031f/backup/db/latest 

Screenshots

No response

Additional Environment Information

No response

Steps to Reproduce

1. Create a PVC on a PowerScale StorageClass.
2. Mount the PVC in a pod.
3. Write a file into the PVC.
4. Create a symlink in the PVC pointing to that file.
5. Take a snapshot.
6. Create a new PVC by restoring from that snapshot.
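
The file and symlink steps above can be sketched as follows, run from inside the pod that mounts the PVC. The file name `data.txt`, the link name `latest`, and the mount path passed in are hypothetical; any regular file with a symlink pointing to it should be equivalent.

```python
import os


def write_file_and_symlink(mount_dir: str) -> str:
    """Write a file under mount_dir and create a symlink pointing to it.

    Returns the path of the symlink.
    """
    data_path = os.path.join(mount_dir, "data.txt")  # hypothetical file name
    link_path = os.path.join(mount_dir, "latest")    # hypothetical link name
    with open(data_path, "w") as f:
        f.write("snapshot restore test\n")
    # Relative target, so the link does not depend on the mount path.
    os.symlink("data.txt", link_path)
    return link_path
```

After this runs, taking a snapshot of the PVC and restoring it into a new PVC should exercise the failing path described in this report.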

Expected Behavior

new PVC is created from snapshot

CSM Driver(s)

CSI 1.10.1 CSM 1.3.1

Installation Type

helm

Container Storage Modules Enabled

isilon karavi

Container Orchestrator

openshift 4.14 (crio)

Operating System

redhat coreos

csmbot commented 2 weeks ago

@ybrock: Thank you for submitting this issue!

The issue is currently awaiting triage. Please make sure you have given us as much context as possible.

If the maintainers determine this is a relevant issue, they will remove the needs-triage label and respond appropriately.


We want your feedback! If you have any questions or suggestions regarding our contributing process/workflow, please reach out to us at container.storage.modules@dell.com.

satyakonduri commented 17 hours ago

Hi @ybrock, we don't have a release with CSI driver 1.10.1 and CSM 1.3.1. Could you please confirm the correct CSI driver and CSM versions? Thank you!

ybrock commented 16 hours ago

Hi @satyakonduri

The Helm chart used to install Dell CSM is version 1.3.1. The CSI driver provided with it is version v2.10.0 (image tags: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0 and registry.access.redhat.com/dellemc/csi-isilon:v2.10.0).

Sorry for the imprecision.