When container processes become blocked in kernel code (e.g. a connection disappears mid-write to an NFS volume store), they cannot be killed, even with `kill -9`.
This change adds the following:
- Updated log messages when `kill -9` has failed, indicating that the VM running garden must be rebooted.
- A new `UnkillableContainers` metric that can be used to determine whether a VM has any containers that garden was unable to kill.
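
To illustrate the behaviour described above, here is a minimal Python sketch of the general pattern: send SIGKILL, wait a bounded time for the process to exit, log a reboot hint if it does not, and count the failures for a gauge. This is not garden's implementation (garden is written in Go); the function names `kill_and_confirm`, `count_unkillable`, and `emit_metric`, the timeout values, and the log wording are all assumptions made for illustration.

```python
import logging
import subprocess
import sys

log = logging.getLogger("unkillable-sketch")


def kill_and_confirm(proc, timeout=2.0):
    """Send SIGKILL to proc and report whether it exited within timeout seconds.

    A process blocked in uninterruptible kernel code does not act on the
    signal until the kernel call returns, so the wait below would time out.
    """
    proc.kill()  # SIGKILL on POSIX
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        # Hypothetical log wording, not garden's actual message.
        log.error(
            "kill -9 failed for pid %s; the VM must be rebooted to clear it",
            proc.pid,
        )
        return False


def count_unkillable(procs, timeout=2.0):
    """Return how many of the given processes survived SIGKILL."""
    return sum(1 for p in procs if not kill_and_confirm(p, timeout))


def emit_metric(name, value):
    """Hypothetical stand-in for a real metrics emitter."""
    print(f"{name}={value}")
```

A caller could pass the processes of the containers being destroyed to `count_unkillable` and report the result with `emit_metric("UnkillableContainers", n)`; a non-zero value would then flag a VM that needs a reboot.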
We have created an issue in Pivotal Tracker to manage this. Unfortunately, the Pivotal Tracker project is private so you may be unable to view the contents of the story.
The labels on this GitHub issue will be updated when the story is started.