NetApp / trident

Storage orchestrator for containers
Apache License 2.0

Issues related to creating VolumeSnapshots and VolumeSnapshotContents. #888

Closed davidmkrtchian closed 1 month ago

davidmkrtchian commented 7 months ago

Describe the bug

In my cluster I am running Trident 23.01, and I am having trouble creating VolumeSnapshots and VolumeSnapshotContents. I use Velero to back up my volumes, and the backups fail with errors. I have opened an issue with Velero (linked below). The latest response from a Velero developer suggests that if Velero successfully creates the VolumeSnapshot, the problem is probably not in Velero itself.

https://github.com/vmware-tanzu/velero/issues/7422

Let me explain the expected workflow. When I configure a backup, Velero uses the VolumeSnapshotClass that I created, which uses the CSI driver csi.trident.netapp.io. Velero then looks up the corresponding PVC, which Trident provisioned through the StorageClass netapp-nfs3-aggr01 using the same driver. The backup process then attempts to create the VolumeSnapshot.

However, at this step I get a timeout error: the process waits for the VolumeSnapshotContent to be created, and it never is.

If you could provide assistance in resolving this problem, I would greatly appreciate it.

Additional context

StorageClass configuration

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: netapp-nfs3-aggr01
  uid: d3c895fe-7ebf-4cfd-a89e-6f32430a4cca
  resourceVersion: '594498'
  creationTimestamp: '2022-11-03T12:17:09Z'
  annotations:
    storageclass.kubernetes.io/is-default-class: 'true'
  managedFields:
    - manager: HashiCorp
      operation: Update
      apiVersion: storage.k8s.io/v1
      time: '2022-11-03T12:17:09Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:allowVolumeExpansion: {}
        f:metadata:
          f:annotations:
            .: {}
            f:storageclass.kubernetes.io/is-default-class: {}
        f:parameters:
          .: {}
          f:backendType: {}
          f:fsType: {}
          f:media: {}
          f:provisioningType: {}
          f:snapshots: {}
          f:storagePools: {}
        f:provisioner: {}
        f:reclaimPolicy: {}
        f:volumeBindingMode: {}
  selfLink: /apis/storage.k8s.io/v1/storageclasses/netapp-nfs3-aggr01
provisioner: csi.trident.netapp.io
parameters:
  backendType: ontap-nas
  fsType: xfs
  media: ssd
  provisioningType: thin
  snapshots: 'true'
  storagePools: netapp-nfs3-aggr01:aggr01_n01_SSD
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate

bert-jan commented 6 months ago

Are you able to create the VolumeSnapshot manually through Kubernetes, without using Velero? We're using a similar setup and it works fine (for the most part) with Trident 23.10.

VolumeSnapshotClass:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: trident-vsclass
driver: csi.trident.netapp.io
deletionPolicy: Delete

Are the pods running?

kubectl get pods -n kube-system | grep snapshot-controller

Snapshot resource:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: test-csi-volume-snapshot
  namespace: your-namespace
spec:
  volumeSnapshotClassName: trident-vsclass # change it with your volume snapshot class name
  source:
    persistentVolumeClaimName: target-pvc-name # change it with target pvc name

Check for snapshots:

kubectl get volumesnapshot # or vs
kubectl get volumesnapshotcontent # or vsc

praveene12 commented 6 months ago

@davidmkrtchian Please confirm whether you are able to create a VolumeSnapshot manually through Kubernetes, without using Velero. Could you share the status of the Trident pods? kubectl get pods -n trident

Could you also share the output of the following commands:

kubectl get pvc
kubectl get volumesnapshot 
kubectl get volumesnapshotcontent

The readyToUse flag for the VolumeSnapshot must be true.
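As a rough aid for that check, here is a minimal Python sketch that lists the snapshots whose status has not reached readyToUse: true, together with the VolumeSnapshotContent they are bound to, if any. It assumes the input is a dictionary shaped like the output of kubectl get volumesnapshot -A -o json. The function name unready_snapshots and the sample data are hypothetical and are not taken from Trident or Velero.

```python
import json


def unready_snapshots(snapshot_list):
    """Return (namespace, name, bound_content) for snapshots that are not ready.

    snapshot_list is assumed to be the parsed JSON of
    `kubectl get volumesnapshot -A -o json` (a List object with "items").
    """
    pending = []
    for item in snapshot_list.get("items", []):
        meta = item.get("metadata", {})
        # A snapshot that was never processed may have no status at all.
        status = item.get("status") or {}
        if status.get("readyToUse") is not True:
            pending.append((
                meta.get("namespace", ""),
                meta.get("name", ""),
                status.get("boundVolumeSnapshotContentName"),
            ))
    return pending


# Hypothetical sample data, for illustration only.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "ready-snapshot", "namespace": "your-namespace"},
      "status": {
        "readyToUse": true,
        "boundVolumeSnapshotContentName": "snapcontent-example"
      }
    },
    {
      "metadata": {"name": "test-csi-volume-snapshot", "namespace": "your-namespace"}
    }
  ]
}
""")

for namespace, name, content in unready_snapshots(sample):
    print(f"{namespace}/{name}: not ready, bound content = {content}")
```

Against a live cluster, the same function could be fed with json.load(sys.stdin) after piping the kubectl output into the script. A snapshot reported with no bound content would match the symptom described in this issue, where the VolumeSnapshotContent is never created.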

Could you also share the definitions used for the backend, PVC, VolumeSnapshot, and VolumeSnapshotClass?

This should help us investigate the issue and try to reproduce it in our lab.

vasum0406 commented 1 month ago

@davidmkrtchian We would appreciate it if you could respond to this thread so that the Trident team can investigate further.