Closed — AcidAngel21 closed this issue 3 years ago
I am having a similar issue moving from version 1.0.0 of the CSI driver to version 2.0.0. I can create PVs, but cannot delete them the majority of the time (it works about 20% of the time). They stay in the Released state.
Logs:
csi-attacher:
```
I0506 12:16:26.400501 1 controller.go:175] Started VA processing "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.400557 1 csi_handler.go:89] CSIHandler: processing VA "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.400572 1 csi_handler.go:140] Starting detach operation for "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.400669 1 csi_handler.go:147] Detaching "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.400704 1 csi_handler.go:542] Found NodeID wuatk8sworker0 in CSINode wuatk8sworker0
I0506 12:16:26.470613 1 csi_handler.go:428] Saving detach error to "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.479926 1 controller.go:141] Ignoring VolumeAttachment "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e" change
I0506 12:16:26.480359 1 csi_handler.go:439] Saved detach error to "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.480399 1 csi_handler.go:99] Error processing "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e": failed to detach: rpc error: code = Internal desc = volumeID "276ae09e-96a0-4236-a053-7dbea3997318" not found in QueryVolume
```
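The attacher log above records the detach failure and, per the `Saving detach error` lines, also writes it onto the VolumeAttachment object. A minimal shell sketch (not from the thread) for pulling the failing volume ID out of a captured log, and, assuming cluster access, reading the saved detach error back from the VolumeAttachment:

```shell
# Sketch: save the failing attacher log line to a file for illustration.
cat > attacher.log <<'EOF'
I0506 12:16:26.480399 1 csi_handler.go:99] Error processing "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e": failed to detach: rpc error: code = Internal desc = volumeID "276ae09e-96a0-4236-a053-7dbea3997318" not found in QueryVolume
EOF

# Extract the failing volume ID from the log line.
grep -o 'volumeID "[^"]*"' attacher.log | head -n 1
# → volumeID "276ae09e-96a0-4236-a053-7dbea3997318"

# The same error is stored on the VolumeAttachment status; reading it back
# requires cluster access, so this step is skipped when kubectl is absent.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get volumeattachment \
    csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e \
    -o jsonpath='{.status.detachError.message}' || true
fi
```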
csi-controller:
```
{"level":"error","time":"2020-05-06T12:16:32.724480563Z","caller":"common/vsphereutil.go:351","msg":"failed to delete disk 276ae09e-96a0-4236-a053-7dbea3997318 with error failed to delete volume: \"276ae09e-96a0-4236-a053-7dbea3997318\", fault: \"(types.LocalizedMethodFault)(0xc000614a80)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n BaseMethodFault: (types.BaseMethodFault)
{"level":"error","time":"2020-05-06T12:16:32.724652761Z","caller":"vanilla/controller.go:452","msg":"failed to delete volume: \"276ae09e-96a0-4236-a053-7dbea3997318\". Error: failed to delete volume: \"276ae09e-96a0-4236-a053-7dbea3997318\", fault: \"(types.LocalizedMethodFault)(0xc000614a80)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n BaseMethodFault: (types.BaseMethodFault)
```
In vCenter I get these two events repeating after I try to delete the volume:
Delete container volume (Completed)
Delete a virtual storage object (Failed - The operation is not allowed in the current state)
(even with version 1.0.2 of the driver, I sometimes get the above message, but the PV is eventually released and datastore is cleaned up)
To rule out a permissions issue, I tried using credentials with global admin, but the same error occurs.
Upon reverting back to version 1.0.0 or 1.0.2 of the driver (with the proper restrictive permissions), I can add/remove volumes normally with consistency.
Environment:
- csi-vsphere version: 2.0.0
- vsphere-cloud-controller-manager version: gcr.io/cloud-provider-vsphere/cpi/release/manager:latest
- Kubernetes version: v1.15.6
- vSphere version: 6.7U3
- OS (e.g. from /etc/os-release): Ubuntu 18.04.4 LTS (Bionic Beaver)
- Kernel (e.g. uname -a): 4.15.0-99-generic
- Install tools: terraform/rancher2 provider
I can confirm that we are experiencing the same issue. We manage to reproduce the issue by creating a PVC, and a pod related to the claim. By deleting the PVC first, and then the pod, it is often stuck in Released state. Both the PV and the volumeattachment are still there, waiting for finalizers. The disks are deleted in vSphere, and the volume is detached from the node.
yaml to reproduce:

pvc:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vsphere-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: vsphere-pvc
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
```
```shell
kubectl delete pvc vsphere-pvc
kubectl delete pod pod
```
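After running the two deletes above, the stuck state described earlier (PV in Released, VolumeAttachment still present) can be observed from the command line. A hedged sketch, not from the thread, assuming the default `kubectl get pv` column layout where STATUS is the fifth column:

```shell
# Filter `kubectl get pv` output down to PVs stuck in Released.
# Assumes the default column layout (STATUS is column 5).
list_released_pvs() {
  awk '$5 == "Released" { print $1 }'
}

# Requires cluster access; skipped when kubectl is unavailable.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pv --no-headers | list_released_pvs
  # Leftover VolumeAttachments waiting on finalizers also show up here:
  kubectl get volumeattachment || true
fi
```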
Environment:
- csi-vsphere version: v2.0.0-rc1
- vsphere-cloud-controller-manager version: 1.1.0
- Kubernetes version: 1.16.2
- vSphere version: 6.7u3
- OS (e.g. from /etc/os-release): Red Hat Enterprise Linux CoreOS 43.81.202003310153.0 (Ootpa)
- Kernel (e.g. uname -a): 4.18.0-147.5.1.el8_1.x86_64
- Install tools:
- Others:
  - csi deployment images: quay.io/k8scsi/csi-attacher:v2.0.0, gcr.io/cloud-provider-vsphere/csi/release/driver:v2.0.0-rc.1, quay.io/k8scsi/livenessprobe:v1.1.0, gcr.io/cloud-provider-vsphere/csi/release/syncer:v2.0.0-rc.1, quay.io/k8scsi/csi-provisioner:v1.6.0
  - node daemonset images: quay.io/k8scsi/csi-node-driver-registrar:v1.2.0, gcr.io/cloud-provider-vsphere/csi/release/driver:v2.0.0-rc.1, quay.io/k8scsi/livenessprobe:v1.1.0
Are you hitting issue 5 mentioned in the known issues documentation? https://vsphere-csi-driver.sigs.k8s.io/known_issues.html#issue_5
It sounds like that. But why does it work with the CSI driver 1.0.2?
I observed the following behaviour in vCenter. CSI driver 1.0.2 : deletion fails repeatedly while the volume is still attached and after the volume is detached the deletion succeeds. CSI driver 2.0.0: deletion fails repeatedly while the volume is still attached but there is no further try to detach the volume once this happens.
Our cluster is also affected with v2.0.0 😞
`Delete Volume` is called before the `Detach Volume` operation. The `Delete Volume` operation un-tags the volume as a Container Volume, later observes that the volume is still attached to the node VM, and does not tag the volume back as a Container Volume. `Detach Volume` then comes along and attempts to query the volume to determine whether it is file or block, and since the volume is no longer a container volume, the `Detach Volume` operation does not attempt to detach it from the node VM.
What you are observing in v1.0.2 is that detach attempts still happen, because that version does not use Query Volume to determine whether the volume is block or file.
This issue is fixed in vSphere 7.0u1.
@RaunakShah is also helping to mitigate this issue by providing the fix for https://github.com/kubernetes/kubernetes/issues/84226 in the external provisioner.
Is it possible for the driver/vSphere to check whether the volume is attached and fail the deletion? This is how other cloud providers behave.
Same issue here with v2.0.0... it is not fun to manually detach volumes from 1 of 20 nodes and delete the volumes in FCD.
@divyenpatel "This issue is fixed in vSphere 7.0u1" Do you really mean 7.0u1? This version isn't released yet.
> Do you really mean 7.0u1? This version isn't released yet.

Yes, it is not released yet.
but @RaunakShah has already fixed the race by making a change in the external-provisioner - https://github.com/kubernetes-csi/external-provisioner/pull/438
@AcidAngel21 The fix from external-provisioner is expected to be part of the next release - https://github.com/kubernetes-csi/external-provisioner/commits/v2.0.0-rc2 Once external-provisioner has released this image, we will validate it with our latest CSI driver and will update the YAMLs with the latest images.
@RaunakShah Can we use v2.0.0-rc2 to get rid of the above issue?
Will this fix be available in 6.7U3 with the 1.0.x version of the driver? We have no plans to upgrade to 7.0 in the near future, and not being able to delete PVs will be a problem.
I am on vSphere 7.0 and was able to test quay.io/k8scsi/csi-provisioner:v2.0.0. I can confirm that I no longer get stuck PVs after deletion.
@RaunakShah csi-provisioner has already released a new version of the image (v2.0.1): https://quay.io/repository/k8scsi/csi-provisioner?tag=latest&tab=tags Can you please validate the image and update the deploy YAMLs?
@xander-sh we've validated the latest versions of sidecars and updated the YAMLs in the latest folder. I'll get back to you on whether we're doing that for existing releases as well..
We are using the version of CSI that installs by default with TKG on 6.7u3. I'm not sure if we can upgrade for this platform so I believe we are stuck with the bug. Hopefully, TKG 1.2 will come out soon and upgrade to the 2.x CSI driver for the 6.7u3 platform, but I'm not holding my breath on that one.
> @xander-sh we've validated the latest versions of sidecars and updated the YAMLs in the latest folder. I'll get back to you on whether we're doing that for existing releases as well...
Thanks, we are really looking forward to a fix csi-provisioner in the version 6.7u3 of vSphere.
Hi,
is there an update about the fix to version 6.7u3 of vSphere?
vSphere CSI v2.0.1 release is now available - https://github.com/kubernetes-sigs/vsphere-csi-driver/releases/tag/v2.0.1
You will find updated manifests for vSphere 6.7u3 and 7.0 over here - https://github.com/kubernetes-sigs/vsphere-csi-driver/tree/master/manifests/v2.0.1
/close
@RaunakShah: Closing this issue.
Is this a BUG REPORT or FEATURE REQUEST?: /kind bug
What happened: I deploy a stateful set with 3 replicas and 3 PVCs (via storageclass). When I delete the statefulset and immediately delete the PVCs, most of the PVs stay hanging in status Released. When I wait a few seconds before deleting the PVCs, this problem does not occur. This problem also does not happen with csi-driver 1.0.2. In vCenter I constantly see the error "The operation is not allowed in the current state". It seems that the driver tries to delete the storage object before it has been detached from the node.
A workaround to remove the hanging PVs is to remove the PV finalizers: `kubectl patch pv pvc-*** -p '{"metadata":{"finalizers":null}}'`
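To clean up several stuck PVs at once, the same patch can be applied in a loop. This is a sketch only, not from the thread, assuming the default `kubectl get pv` column layout; note that clearing finalizers bypasses the driver's cleanup, so confirm in vSphere that the backing disks are already gone first:

```shell
# CAUTION: clearing finalizers skips the driver's cleanup path.
patch_payload='{"metadata":{"finalizers":null}}'

# Requires cluster access; skipped when kubectl is unavailable.
if command -v kubectl >/dev/null 2>&1; then
  # Assumes the default `kubectl get pv` columns (STATUS is column 5).
  for pv in $(kubectl get pv --no-headers | awk '$5 == "Released" { print $1 }'); do
    kubectl patch pv "$pv" -p "$patch_payload"
  done
fi
```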
What you expected to happen: PVs do not hang in the Released status and are removed.
How to reproduce it (as minimally and precisely as possible): Deploy a stateful set with 3 replicas and 3 PVCs (via storageclass). Delete the statefulset and immediately delete the PVCs.
Anything else we need to know?: csi-attacher logs
csi-controller logs
vsphere-syncer logs
csi-provisioner logs
Environment:
- Kernel (e.g. uname -a): 4.14.85-rancher