Closed: sontivr closed this issue 1 month ago.
The PVs stuck in Terminating are probably the dependency that keeps the TridentBackend from being deleted. Can you do a kubectl describe on one of them to see if there is anything helpful about why they are stuck?
Besides that, Trident supports multiple backends in parallel, so even while the current one is still in the Deleting state, you can just add a new backend (or several, if you like).
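For example, a replacement backend can be registered alongside the stuck one with a TridentBackendConfig along these lines (just a sketch: the name, management LIF, SVM, and credentials Secret are placeholders for your FSxN setup, and it assumes the ontap-san driver used in the walkthrough):

apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: backend-fsx-ontap-san-2         # new name, so it does not collide with the stuck TBC
  namespace: trident
spec:
  version: 1
  storageDriverName: ontap-san          # iSCSI driver for FSx for NetApp ONTAP
  managementLIF: <FSxN management endpoint>
  svm: <SVM name>
  credentials:
    name: backend-fsx-ontap-san-secret  # pre-created Secret holding the SVM credentials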
Thanks for looking into it @wonderland. I don't see anything standing out in the describe output. I did notice that creating another backend with a different name does work. It is just that leaving some objects in a hung state makes me nervous about the health of the system.
k describe pv pvc-d2ea4c54-e23d-4a95-b35e-68fd85989937
Name:            pvc-d2ea4c54-e23d-4a95-b35e-68fd85989937
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: csi.trident.netapp.io
                 volume.kubernetes.io/provisioner-deletion-secret-name:
                 volume.kubernetes.io/provisioner-deletion-secret-namespace:
Finalizers:      [external-attacher/csi-trident-netapp-io]
StorageClass:    fsx-basic-block
Status:          Terminating (lasts 3d5h)
Claim:           observability/vmstorage-volume-victoria-metrics-cluster-vmstorage-0
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        50Gi
Node Affinity:   <none>
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            csi.trident.netapp.io
    FSType:            ext4
    VolumeHandle:      pvc-d2ea4c54-e23d-4a95-b35e-68fd85989937
    ReadOnly:          false
    VolumeAttributes:  backendUUID=949563cb-6717-4455-a778-7fb16c906630
                       internalName=trident_pvc_d2ea4c54_e23d_4a95_b35e_68fd85989937
                       name=pvc-d2ea4c54-e23d-4a95-b35e-68fd85989937
                       protocol=block
                       storage.kubernetes.io/csiProvisionerIdentity=1701633261584-3547-csi.trident.netapp.io
Events:                <none>
Every Kubernetes PV has an associated "tvol" custom resource created in the 'trident' namespace.
"oc describe tvol ..." could give you hints.
Another place to look is the trident-controller logs.
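For instance (a sketch: these commands assume Trident's default short names and the trident namespace, and the controller Deployment may be named trident-controller or trident-csi depending on the Trident version):

# Trident's own record of the stuck volume
kubectl get tvol -n trident
kubectl describe tvol pvc-d2ea4c54-e23d-4a95-b35e-68fd85989937 -n trident

# Controller logs around the failed delete
kubectl logs deploy/trident-controller -n trident --tail=200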
That finalizer, external-attacher/csi-trident-netapp-io, would be what's holding it up.
@sontivr it looks like for some reason a PV was stranded, and that is holding you back from deleting the TBC. As @wonderland mentioned, you could just go ahead and create a new TBC to get around this. To clean the old TBC up, you will need to remove the finalizer (@jamessevener, thank you :)). Before you do that, please make sure that this PV is not associated with a PVC and is not being used by a workload. That should help you resolve your issue.
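For completeness, one common way to clear that finalizer on the stranded PV (shown here against the PV from the describe output above; only run it once you have confirmed the PV is not bound to a PVC or attached to a node, since it bypasses the external-attacher's cleanup):

kubectl patch pv pvc-d2ea4c54-e23d-4a95-b35e-68fd85989937 \
  --type=merge -p '{"metadata":{"finalizers":null}}'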
Describe the bug
Hello,
I am trying to test the FSxONTAP filesystem with the iSCSI protocol for persistent volumes to deploy the Victoria Metrics time series database into an EKS cluster. I am following "Run containerized applications efficiently using Amazon FSx for NetApp ONTAP and Amazon EKS" with some support from AWS. At some point I tried to delete the TridentBackendConfig to start all over again, and it seems to be stuck in the Deleting phase forever. The documentation does say that it stays in the Deleting phase when it has dependent objects. I have uninstalled the workload and tried to delete the PVs/PVCs created using this tbc, but it didn't help: the PVCs got deleted and the PVs got stuck in the Terminating state. What else is included in the backend components? Should I be deleting the FSxONTAP filesystem itself to be able to clean up the tbc? What if I can't afford to lose my persistent volumes? Is FSxONTAP+iSCSI recommended for workloads like the Victoria Metrics database deployed into EKS clusters?
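For reference, the stuck state described above can be inspected with something like the following (assuming Trident's usual short names and the trident namespace):

kubectl get tbc -n trident              # TridentBackendConfig stuck in the Deleting phase
kubectl get tridentbackends -n trident  # the backing TridentBackend objects
kubectl get pv | grep Terminating       # PVs stuck in Terminating that still reference the backend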
To Reproduce
kubectl delete tbc backend-fsx-ontap-san
Expected behavior
backend-fsx-ontap-san should be deleted.