Bug Description
We have observed two issues when privTgt is lost:
1. When "oc rollout restart deployment" is used to restart a user deployment that consumes PowerFlex, and the new pod is scheduled to the same node, the new pod fails to start and remains in the ContainerCreating state.
2. Any operation that restarts the kubelet (for example, replacing the cluster certificate) causes the vxflexos-node pod to receive NodePublishVolume requests for existing PVCs, which produces continuous warning events on the user pod (although the user pod remains in the Running state).
Logs
Warning FailedMount 40s (x598 over 20h) kubelet MountVolume.SetUp failed for volume "vrtestraven-7169900ccd" : rpc error: code = Internal desc = Device already in use and mounted elsewhere for privTgt /var/lib/kubelet/plugins/vxflexos.emc.dell.com/disks/1dabfae970da540f-b83f174c000002a0
Screenshots
No response
Additional Environment Information
Problem is not related to a particular platform.
Steps to Reproduce
1. Create a deployment that consumes PowerFlex through PowerFlex CSI.
2. Restart the vxflexos-node pod on the same node as the deployment.
3. Enter the newly restarted vxflexos-node pod. Run the mount command and observe that the privTgt is lost, with only the target mount displayed.
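The mount-table inspection in the last reproduction step can be approximated with a small script. This is only a sketch under stated assumptions, not part of the driver: the privTgt directory is taken from the FailedMount log above, pod target mounts are assumed to follow the standard kubelet CSI layout, and PowerFlex devices are assumed to appear as /dev/scini*. The function names are hypothetical.

```python
# Sketch: find devices that still have a pod target mount but no privTgt mount.
# Assumptions (not confirmed by the driver source):
#   - privTgt mounts live under PRIV_TGT_DIR (path taken from the FailedMount log)
#   - pod target mounts use the kubelet layout .../pods/<uid>/volumes/kubernetes.io~csi/...
#   - PowerFlex block devices are named /dev/scini*

PRIV_TGT_DIR = "/var/lib/kubelet/plugins/vxflexos.emc.dell.com/disks/"
POD_DIR = "/var/lib/kubelet/pods/"
DEVICE_PREFIX = "/dev/scini"


def parse_mounts(text):
    """Return (device, mount_point) pairs from /proc/mounts-style text."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            pairs.append((fields[0], fields[1]))
    return pairs


def devices_missing_priv_tgt(text):
    """Return sorted devices with a pod target mount but no privTgt mount."""
    priv, target = set(), set()
    for device, mount_point in parse_mounts(text):
        if not device.startswith(DEVICE_PREFIX):
            continue  # skip tmpfs, overlay, and other drivers' devices
        if mount_point.startswith(PRIV_TGT_DIR):
            priv.add(device)
        elif mount_point.startswith(POD_DIR) and "kubernetes.io~csi" in mount_point:
            target.add(device)
    return sorted(target - priv)
```

On a node, or against saved mount output, this could be called as `devices_missing_priv_tgt(open("/proc/mounts").read())`. A non-empty result after restarting the vxflexos-node pod would match the symptom described here: the target mount is still present while the privTgt mount is gone.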
Expected Behavior
The node pods should restart without error.
CSM Driver(s)
csi-powerflex 1.11
Installation Type
No response
Container Storage Modules Enabled
No response
Container Orchestrator
OCP: 4.16.6
Operating System
RHCOS