Describe the bug
Harvester is no longer able to mount any PVCs, causing all workloads to wait indefinitely.
The Harvester CSI driver emits the following relevant logs:
csi-attacher I0620 17:23:21.202979 1 controller.go:208] Started VA processing "csi-0ba6c8dc618582bf2ec444222752944fa0b6b9139adecbb62d82b69191a4612e"
csi-attacher I0620 17:23:21.202986 1 csi_handler.go:218] CSIHandler: processing VA "csi-0ba6c8dc618582bf2ec444222752944fa0b6b9139adecbb62d82b69191a4612e"
csi-attacher I0620 17:23:21.202989 1 csi_handler.go:269] Starting detach operation for "csi-0ba6c8dc618582bf2ec444222752944fa0b6b9139adecbb62d82b69191a4612e"
csi-attacher I0620 17:23:21.203007 1 csi_handler.go:276] Detaching "csi-0ba6c8dc618582bf2ec444222752944fa0b6b9139adecbb62d82b69191a4612e"
csi-attacher I0620 17:23:21.203041 1 csi_handler.go:742] Found NodeID home-workers-1cb93b3e-k6559 in CSINode home-workers-1cb93b3e-k6559
csi-attacher I0620 17:23:21.203061 1 connection.go:182] GRPC call: /csi.v1.Controller/ControllerUnpublishVolume
csi-attacher I0620 17:23:21.203065 1 connection.go:183] GRPC request: {"node_id":"home-workers-1cb93b3e-k6559","volume_id":"pvc-f969d272-d61d-4091-a1df-f484cc680753"}
csi-attacher I0620 17:23:21.217166 1 csi_handler.go:620] Saved detach error to "csi-0f6065cdc387a9f529c68ab92477210efa2880a7d340fae4858dc40a362adfcd"
csi-attacher I0620 17:23:21.217198 1 csi_handler.go:228] Error processing "csi-0f6065cdc387a9f529c68ab92477210efa2880a7d340fae4858dc40a362adfcd": failed to detach: rpc error: code = Internal desc = Failed to remove volume pvc-97b512ba-0ff4-48e2-8a65-939e0fbcbc72 from node home-workers-1cb93b3e-s89xk: Operation cannot be fulfilled on virtualmachine.kubevirt.io "home-workers-1cb93b3e-s89xk": Unable to remove volume [pvc-97b512ba-0ff4-48e2-8a65-939e0fbcbc72] because it does not exist
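For anyone triaging a similar state, the stuck attachments can be inspected directly; a minimal sketch using standard kubectl commands (the VolumeAttachment and VM names below are taken from the logs above and will differ in your cluster):

```shell
# List VolumeAttachment objects to see which volumes are stuck detaching
kubectl get volumeattachments

# Inspect the saved detach error on the attachment named in the logs
kubectl describe volumeattachment csi-0f6065cdc387a9f529c68ab92477210efa2880a7d340fae4858dc40a362adfcd

# On the Harvester cluster, check which volumes the VM spec still references
kubectl get virtualmachine home-workers-1cb93b3e-s89xk \
  -o jsonpath='{.spec.template.spec.volumes}'
```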
To Reproduce
Steps to reproduce the behavior:
1. In Harvester, gracefully shut down all VMs of a child cluster.
2. Manually remove all PVC volumes mounted to those VMs (a CLI sketch follows this list).
3. Power all VMs back on.
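For reference, a rough CLI equivalent of these steps on the Harvester cluster (a sketch only; I did this through the UI, and the VM name is one taken from the logs above):

```shell
# 1. Gracefully stop a VM of the child cluster (repeat per VM)
virtctl stop home-workers-1cb93b3e-s89xk

# 2. Remove the mounted PVC volumes from the stopped VM's definition
#    (delete the matching disk/volume entries from the VM spec by hand)
kubectl edit virtualmachine home-workers-1cb93b3e-s89xk

# 3. Power the VM back on
virtctl start home-workers-1cb93b3e-s89xk
```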
Expected behavior
Harvester CSI should detect that certain volumes are no longer attached to their respective nodes and take the necessary actions to reconcile this.
In my testing, manually adding the PVC volumes back to the correct VMs resolves the issue (see the sketch below). However, I think an automated solution should still be implemented.
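A minimal sketch of that manual workaround, assuming standard KubeVirt tooling is available on the Harvester cluster (verify the PVC-to-VM mapping from the support bundle before running anything like this):

```shell
# Hot-plug the PVC back into the VM and persist it in the VM spec,
# using the volume/VM names from the detach error above
virtctl addvolume home-workers-1cb93b3e-s89xk \
  --volume-name=pvc-97b512ba-0ff4-48e2-8a65-939e0fbcbc72 \
  --persist
```

Once the volume exists in the VM spec again, the CSI driver's ControllerUnpublishVolume call should be able to complete and the stuck VolumeAttachment gets cleaned up.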
Support bundle
supportbundle_774ced99-ead3-43ed-b689-7136613f6eb2_2024-06-20T18-59-04Z.zip
Environment