Taronyuu closed this issue 9 months ago.
Hi, sorry, but this sounds like an issue with your MySQL deployment. I'm fairly sure this is outside the scope of this project.
I don't think this is a MySQL issue; the question about volumes is platform-agnostic.
However, I can understand if this is not supported or recommended. If that is the case, I would like to hear it :)
Hey, what I mean is: Hetzner Volumes (hetzner-csi) do not support multi-attach.
From https://github.com/hetznercloud/csi-driver:
> This is a [Container Storage Interface](https://github.com/container-storage-interface/spec) driver for Hetzner Cloud enabling you to use ReadWriteOnce Volumes within Kubernetes.
So it is not a problem this project can fix.
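If it helps, you can confirm that a PVC backed by the Hetzner CSI driver is ReadWriteOnce (i.e. attachable to only one node at a time) with something like the following; the PVC name `mysql-data` is just a placeholder:

```sh
# Print the access modes of the PVC backing the mysql pod
kubectl get pvc mysql-data -o jsonpath='{.spec.accessModes}'
```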
I have no idea why the old MySQL pod does not release the PV(C) when it crashes or is shut down.
We have Hetzner Volumes running in another k8s cluster in production, and they work fine: delete a pod, it gets recreated and starts up without problems.
Since you did not share more details, like the MySQL manifest YAML, I can only speculate... maybe you have not set valid resource limits for the MySQL/Neo4J deployment.
So the memory spike kills the whole node, when instead the MySQL pod should be killed; that would release the PVC, and the restart should work.
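For illustration, here is a minimal sketch of setting such limits, assuming a Deployment named `mysql` with a container also named `mysql` (both placeholders; the actual values depend on your workload):

```sh
# Set requests/limits so the kubelet can OOM-kill/evict the pod
# instead of the memory spike taking down the whole node.
kubectl set resources deployment/mysql -c mysql \
  --requests=cpu=250m,memory=512Mi \
  --limits=cpu=1,memory=1Gi
```

With limits in place, only the pod gets killed and rescheduled while the node stays healthy, so the PVC can be released and re-attached normally.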
bye
Folks, just use the hcloud CLI to detach the volume, then reattach it to the correct node.
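A minimal sketch of that, assuming the volume is named `database-storage` and the new pod landed on `node-2` (both placeholders):

```sh
# See which server the volume is currently attached to
hcloud volume list

# Detach it from the dead node, then attach it to the node
# where the new pod was scheduled
hcloud volume detach database-storage
hcloud volume attach database-storage --server node-2
```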
Description
I am currently debugging an issue where one node in my four-node cluster goes offline. I've narrowed the problem down to Neo4J, which is causing a memory spike. When this happens, the node grinds to a complete halt: SSH and Netdata stop working, and Kubernetes marks it as NoSchedule and NoExecute. While the node should ideally not become unavailable in the first place, this part of the system is working as expected.
The next step is that K3s attempts to terminate the previous pod and redeploy it on a different node. This is exactly the desired 'self-healing' behavior. However, although I can see that the new pod is created, it never starts, because attaching the Hetzner Volume fails with a multi-attach error:
The old pod is marked as Terminating
As far as I know, Hetzner Volumes cannot be attached to multiple nodes at the same time. However, is there a way to free the volume from the previous, terminating pod? If not, is there a reason why this is not done? Additionally, is there a way to still allow the cluster to reschedule pods and 'heal' itself when a node goes offline?
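For reference, the kind of manual intervention I am asking about would look roughly like this (only a sketch; the pod name and namespace are placeholders I would have to look up in my cluster):

```sh
# See which node the CSI driver still considers the volume attached to
kubectl get volumeattachments

# Force-remove the pod stuck in Terminating on the dead node,
# so the volume can be attached elsewhere
kubectl delete pod neo4j-0 -n default --grace-period=0 --force
```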