Closed sarge closed 6 years ago
Sorry, I have never seen this problem, so I'm afraid I cannot help with it :-(. In the past I saw a lot of strange issues when the disk space ran out, but I'm not sure whether that is your case.
Thanks Jakub, I appreciate the quick response.
Just for the record - I hit the same issue with managed k8s on GCP, also because of heavy use of CronJobs. I guess the only real solution is to get systemd version 237, which has a fix.
I encountered the same problem; rebooting the system is the fastest way to fix it.
same case here on Azure AKS - restarting all the nodes helped
Same problem on GKE with COS nodes (systemd 232). Switching to Ubuntu nodes with systemd 237 to see if that solves the problem.
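For anyone checking their own nodes: the comments above put the fix in systemd 237, so the first thing to look at is the node's systemd version. A minimal sketch of that check (the sample version line is hypothetical; on a real node you would feed in the first line of `systemctl --version` instead):

```shell
# Hypothetical sample of the first line of `systemctl --version`,
# so the comparison can be shown without a live systemd:
version_line="systemd 232"

# Strip the "systemd " prefix to get the bare version number.
version="${version_line#systemd }"

# Per this thread, versions below 237 lack the fix.
if [ "$version" -lt 237 ]; then
  echo "systemd $version: affected (fix landed in 237)"
else
  echo "systemd $version: has the fix"
fi
```

On a real node, replace the sample with `version_line=$(systemctl --version | head -n1)`.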
Hi there,
I understand this is not a bug with your implementation.
We are seeing an issue with nodes failing to start pods with the following errors.
systemd has become unresponsive.
We have a few CronJobs running, which I suspect is causing systemd to eventually become unable to mount new secrets. Increasing the CronJob rate hasn't had much of an effect.
The closest issue I have seen is https://github.com/kubernetes/kubernetes/issues/57345, but ours varies slightly: after remoting into the machine, systemd is completely unresponsive.
What I know
Any suggestions on where to hunt?
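One place to start hunting, assuming this is the unit-accumulation leak described in kubernetes#57345: count how many mount units systemd is tracking, since tens of thousands of them would explain the unresponsiveness. A sketch of that count (the sample lines below are hypothetical; on an affected node you would pipe the real `systemctl list-units --all --no-legend` output instead):

```shell
# Hypothetical sample of `systemctl list-units --all --no-legend` output:
sample_units='var-lib-kubelet-pods-abc-volumes-secret-token.mount loaded active mounted
run-r1234.scope loaded active running
home.mount loaded active mounted'

# Take the unit name (first column) and count entries ending in ".mount".
printf '%s\n' "$sample_units" | awk '{print $1}' | grep -c '\.mount$'
```

On a leaking node, rerunning the real command over time and watching the count grow (especially for kubelet secret volume mounts) would support the leak theory.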