kubernetes-sigs / vsphere-csi-driver

vSphere storage Container Storage Interface (CSI) plugin
https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/index.html
Apache License 2.0
289 stars, 173 forks

restoration of a volume snapshot to create a new PVC is stuck in a Pending state. #2883

Closed pbs-jyu closed 2 months ago

pbs-jyu commented 2 months ago

Is this a BUG REPORT or FEATURE REQUEST?: BUG REPORT

/kind bug

What happened: I deployed PostgreSQL with 3 replicas, took a volume snapshot of one of its persistent volumes, created a new PVC from that snapshot, and then created a new pod that uses the new PVC. The new PVC stays in the Pending state.

Events:
  Type    Reason                Age                     From                                                                                                 Message
  Normal  Provisioning          2m39s (x307 over 18h)   csi.vsphere.vmware.com_vsphere-csi-controller-7b84fbfb87-9ss2d_88c7ed5f-e04c-4489-acf8-0a58fea7026a  External provisioner is provisioning volume for claim "devops/postgres-storage-postgres-0-restore"
  Normal  ExternalProvisioning  114s (x4483 over 18h)   persistentvolume-controller                                                                          Waiting for a volume to be created either by the external provisioner 'csi.vsphere.vmware.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.

Reference: https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/3.0/vmware-vsphere-csp-getting-started/GUID-E0B41C69-7EEB-450F-A73D-5FD2FF39E891.html
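For readers following along, the restore step described above amounts to creating a PVC whose `spec.dataSource` points at a VolumeSnapshot. The sketch below builds such a manifest in Python. The PVC name, namespace, and StorageClass come from this report; the snapshot name, requested size, and access mode do not appear in the issue and are purely hypothetical placeholders.

```python
import json


def build_restore_pvc(name, namespace, storage_class, snapshot_name, size):
    """Return a PVC manifest that asks the provisioner to restore a VolumeSnapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "storageClassName": storage_class,
            # dataSource tells the external provisioner to pre-populate
            # the new volume from an existing VolumeSnapshot.
            "dataSource": {
                "apiGroup": "snapshot.storage.k8s.io",
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
            },
            "accessModes": ["ReadWriteOnce"],  # assumption, not stated in the issue
            "resources": {"requests": {"storage": size}},
        },
    }


pvc = build_restore_pvc(
    name="postgres-storage-postgres-0-restore",  # from the events above
    namespace="devops",                          # from the events above
    storage_class="prm-esxi-eksa-1-sc",          # from the controller log
    snapshot_name="postgres-0-snapshot",         # hypothetical snapshot name
    size="1Gi",                                  # hypothetical size
)
print(json.dumps(pvc, indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -`.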

What you expected to happen: After I restore the volume snapshot to a new PVC, the PVC should show as Bound.

How to reproduce it (as minimally and precisely as possible): I deleted the PVC that was created from the volume snapshot, deleted vsphere-csi-controller and vsphere-csi-node, and made sure a new CSI controller and nodes were recreated. After I recreated the PVC from the volume snapshot, it still shows Pending.

Anything else we need to know?:

Environment:

pbs-jyu commented 2 months ago

The vsphere-csi-controller-7b84fbfb87-9ss2d logs show:

I0509 14:38:29.794744 1 event.go:364] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"devops", Name:"postgres-storage-postgres-0-restore", UID:"0f6ab8bf-2901-41d1-b751-15a82c3b4de7", APIVersion:"v1", ResourceVersion:"59346673", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "devops/postgres-storage-postgres-0-restore"
I0509 14:38:29.805945 1 controller.go:1075] Final error received, removing PVC 0f6ab8bf-2901-41d1-b751-15a82c3b4de7 from claims in progress
W0509 14:38:29.805963 1 controller.go:934] Retrying syncing claim "0f6ab8bf-2901-41d1-b751-15a82c3b4de7", failure 311
E0509 14:38:29.806020 1 controller.go:957] error syncing claim "0f6ab8bf-2901-41d1-b751-15a82c3b4de7": failed to provision volume with StorageClass "prm-esxi-eksa-1-sc": rpc error: code = Internal desc = CNS query volume failed to find the volume: "35b01641-4e54-44d9-b297-3fb83102a02b"
I0509 14:38:29.806077 1 event.go:364] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"devops", Name:"postgres-storage-postgres-0-restore", UID:"0f6ab8bf-2901-41d1-b751-15a82c3b4de7", APIVersion:"v1", ResourceVersion:"59346673", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "prm-esxi-eksa-1-sc": rpc error: code = Internal desc = CNS query volume failed to find the volume: "35b01641-4e54-44d9-b297-3fb83102a02b"

error: container 35b01641-4e54-44d9-b297-3fb83102a02b is not valid for pod vsphere-csi-controller-7b84fbfb87-9ss2d
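The controller log above repeats the same failure on every retry, so it can help to reduce it to the claim being retried and the volume ID that CNS cannot find. The following is a minimal sketch, assuming the log text has been saved as a string; the regular expressions are written against the two message shapes quoted in this comment and are not taken from the driver's source. The `summarize` helper is a hypothetical name.

```python
import re

# Patterns modeled on the log lines quoted in this issue (an assumption,
# not an official log format specification).
CNS_MISSING = re.compile(
    r'CNS query volume failed to find the volume: "([0-9a-f-]{36})"'
)
RETRY = re.compile(r'Retrying syncing claim "([0-9a-f-]{36})", failure (\d+)')


def summarize(log_text):
    """Collect missing CNS volume IDs and the last retry count seen per claim UID."""
    missing = sorted(set(CNS_MISSING.findall(log_text)))
    # If a claim appears more than once, the later (last seen) count wins.
    retries = {claim: int(count) for claim, count in RETRY.findall(log_text)}
    return {"missing_volumes": missing, "retries": retries}


sample = (
    'W0509 14:38:29.805963 1 controller.go:934] Retrying syncing claim '
    '"0f6ab8bf-2901-41d1-b751-15a82c3b4de7", failure 311\n'
    'E0509 14:38:29.806020 1 controller.go:957] error syncing claim '
    '"0f6ab8bf-2901-41d1-b751-15a82c3b4de7": failed to provision volume with '
    'StorageClass "prm-esxi-eksa-1-sc": rpc error: code = Internal desc = '
    'CNS query volume failed to find the volume: '
    '"35b01641-4e54-44d9-b297-3fb83102a02b"\n'
)
print(summarize(sample))
```

Note that `35b01641-4e54-44d9-b297-3fb83102a02b` is the ID of a CNS volume reported in the error, not a container name, which is consistent with the `container ... is not valid for pod` error shown above.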

xing-yang commented 2 months ago

What Kubernetes distro is this?

Regarding "use the snapshot create a new PVC, and created a new pod with this new PVC", can you show the YAML of the pod and PVC? Do you mean it is a standalone Pod with one PVC? Can you show details of the StorageClass? Can you show details of the VolumeSnapshot and VolumeSnapshotContent?

Can you provide full logs, including vSphere CSI controller, vSphere CSI Syncer, csi-provisioner, csi-attacher?

Since it says CNS query volume failed to find the volume: "35b01641-4e54-44d9-b297-3fb83102a02b", please also provide the VC bundle.
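When collecting the VolumeSnapshot details requested above, one quick check is whether the snapshot reports itself as ready and bound to a VolumeSnapshotContent. The sketch below inspects an object as returned by `kubectl get volumesnapshot <name> -o json`, parsed into a Python dict. The `snapshot_problems` helper and the sample objects are hypothetical; the status fields it reads (`readyToUse`, `boundVolumeSnapshotContentName`, `error.message`) belong to the `snapshot.storage.k8s.io/v1` API. It would not by itself explain the CNS error in this issue.

```python
def snapshot_problems(snapshot):
    """Return reasons a VolumeSnapshot object does not look restorable yet."""
    status = snapshot.get("status") or {}
    problems = []
    if not status.get("readyToUse"):
        problems.append("status.readyToUse is not true")
    if not status.get("boundVolumeSnapshotContentName"):
        problems.append("no bound VolumeSnapshotContent")
    error = status.get("error")
    if error:
        problems.append("error: " + str(error.get("message")))
    return problems


# Hypothetical objects for illustration only, not taken from the issue.
ready = {
    "status": {
        "readyToUse": True,
        "boundVolumeSnapshotContentName": "snapcontent-example",
    }
}
not_ready = {"status": {"readyToUse": False}}

print(snapshot_problems(ready))
print(snapshot_problems(not_ready))
```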

pbs-jyu commented 2 months ago

Xing-Yang, I fixed the problem by deleting all the pods, PVCs, VolumeSnapshots, etc. and starting over. Thanks a lot for your response.

pbs-jyu commented 2 months ago

problem fixed.