Open rusLukasRath opened 2 months ago
We are also starting to see this. We recently changed the SVM name in our TridentBackendConfig, and things were running fine. We then upgraded to OpenShift 4.16.18, and as the upgrade restarted pods, several of them hit the same "context deadline exceeded" message and won't mount.
Not sure whether it is related to the 4.16 upgrade, to the fact that we updated the backend, or unrelated entirely. We are running Trident v24.06.1, installed via the Helm chart.
Describe the bug
Trident PVCs can initially be mounted as normal on a worker node, but after some time, or for some unknown reason, they can no longer be mounted on that particular node. Pods that try to mount a Trident PVC get the error message: "context deadline exceeded".
The exact same PVC can still be mounted on other worker nodes. The issue affects all Trident PVCs, both existing ones and ones newly created after the issue started. Restarting the trident-node pod on the affected worker node does not fix the issue.
Mounting the NetApp shares manually on the affected node works completely fine.
Environment
Provide accurate information about the environment to help us reproduce the issue.
To Reproduce
Unknown
Expected behavior
Trident PVCs should be able to be mounted at all times.
Additional context
The cluster on which this problem occurs runs all of our GitLab Runner build jobs. Dozens of build jobs run simultaneously on this cluster, and multiple build jobs that start at the same time want to mount the same Trident PVCs.
Attached is the log of the trident-node pod on the affected node, captured before we terminated the pod and started a new one: trident-node-linux-t5f54.txt