Open kvaps opened 4 years ago
I'm just linking this issue to the related Slack thread: https://linbit-community.slack.com/archives/CPDJCHW2X/p1602491983047900
@kvaps Hi, are there any updates on this problem? Sometimes I'm not able to restart my workloads :)
The Diskless/TieBreaker resource becomes unavailable, and the problem is resolved after deleting it.
Hi, we're using the latest upstream STORK plugin, which ships with the health-monitor enabled by default:
Today we ran into a painful issue. We have many nodes, and some of them are occasionally overloaded, which makes them flap between the Online and OFFLINE states.
STORK detects these nodes and tries to reattach the volumes and restart the pods in place. Example log message:
This causes really weird behavior in the linstor-csi driver:
The volume might get stuck in the DELETING state:
The csi-attacher logs say:
After a while, the diskless resource is removed from the node:
But the volumeattachment continues to exist on the node
However, the pod cannot start in this state because the DRBD device is missing. One possible way to fix it is to create the resource manually to satisfy the existing volumeattachment.
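To make the inconsistent state easier to spot, here is a minimal sketch that compares the two views and reports volumeattachments with no matching resource on their node. It is only an illustration: the `Attachment` type, the `find_orphaned_attachments` helper, and the sample node and volume names are hypothetical. It also assumes that the LINSTOR resource name equals the volume name, and that both lists were collected elsewhere (for example from the Kubernetes API and from the LINSTOR resource list).

```python
from typing import NamedTuple


class Attachment(NamedTuple):
    """Hypothetical, simplified view of a Kubernetes VolumeAttachment."""
    node: str
    volume: str


def find_orphaned_attachments(attachments, resources):
    """Return attachments whose (node, volume) pair has no LINSTOR resource.

    `resources` is an iterable of (node, resource_name) pairs. Assumption:
    the resource name is the same as the volume name. This function only
    compares the two inputs; it does not talk to Kubernetes or LINSTOR.
    """
    present = set(resources)
    return [a for a in attachments if (a.node, a.volume) not in present]


# Made-up sample data that mimics the situation described above.
attachments = [
    Attachment(node="node-a", volume="pvc-1"),
    Attachment(node="node-b", volume="pvc-2"),
]
# The diskless resource for pvc-2 on node-b has already been removed.
resources = [("node-a", "pvc-1"), ("node-c", "pvc-2")]

orphaned = find_orphaned_attachments(attachments, resources)
for a in orphaned:
    print(f"{a.volume}: attached to {a.node}, but no resource exists there")
```

Each reported pair is a candidate for the manual workaround: creating the resource on that node so that the existing volumeattachment is satisfied again.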
I guess this is the exact case mentioned by @rck in https://github.com/piraeusdatastore/linstor-csi/issues/52#issuecomment-584562464