Open christophermaier opened 5 years ago
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. We value your input and contribution. Please leave a comment if this issue still affects you.
Our `is_alive` function is essentially a fancy wrapper around `libc::kill`. As such, it cannot distinguish between a process that exists and is alive and a process that exists but is a zombie.
With the Supervisor taking on responsibility for shutting down services in #6107, this distinction becomes more important. Since the Launcher is still the parent of service processes, it needs to reap the service processes when they exit. If it doesn't, the Supervisor will still think they're "alive".
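To illustrate what "reaping" means here, the following is a minimal sketch using only the Rust standard library. It is not the Launcher's actual code: the function name `reap_exited` and the idea of tracking children in a `Vec<Child>` are assumptions made for illustration. `Child::try_wait` is a non-blocking wait, so an exited child is collected (and stops being a zombie) without stalling the caller.

```rust
use std::process::Child;

/// Hypothetical helper: makes one non-blocking pass over the tracked children,
/// reaping any that have exited, and returns how many were reaped.
pub fn reap_exited(children: &mut Vec<Child>) -> usize {
    let before = children.len();
    children.retain_mut(|child| match child.try_wait() {
        // The child has exited and its status has now been collected,
        // so it is no longer a zombie; stop tracking it.
        Ok(Some(_status)) => false,
        // Still running; keep it for the next pass.
        Ok(None) => true,
        // The wait failed; keep the child so a later pass can retry.
        Err(_) => true,
    });
    before - children.len()
}
```

Calling a helper like this on every iteration of a wait loop, including the loop that waits for the Supervisor itself to exit, would keep exited services from lingering as zombies.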
Normally, this isn't a problem because the Launcher regularly reaps its child processes. When the Supervisor is shutting down, however, the Launcher needs to continue reaping children as it waits for the Supervisor process itself to shut down. If it does not, the Supervisor will wait its allotted 8-second timeout for the service processes to become "not alive" and will then send a `KILL` signal. This can delay shutdown unnecessarily.
This works, but it means, among other things, that `is_alive` is not very well named 😄
We can likely refactor `is_alive` on Linux to leverage the procinfo crate to distinguish between truly alive processes and zombies. One important wrinkle, however, is that we often call `is_alive` with a negative PID, which queries everything in the process group, rather than just a single process. If we were to use `procinfo`, we would need to handle the querying of process group members on our own.
See the discussion that spawned all this for further background.
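As a rough sketch of the zombie-aware check and the process-group wrinkle described above, the code below reads `/proc/<pid>/stat` directly with the standard library. It does not use the procinfo crate's API, and the names `StatInfo`, `parse_stat`, `is_live_state`, and `is_truly_alive` are all hypothetical. It assumes a Linux `/proc` filesystem, where the fields after the parenthesized command name begin with the state, the parent PID, and the process group ID.

```rust
use std::fs;

/// The two fields of /proc/<pid>/stat that this sketch needs.
#[derive(Debug, PartialEq)]
pub struct StatInfo {
    pub state: char, // e.g. 'R' running, 'S' sleeping, 'Z' zombie
    pub pgrp: i32,   // process group ID
}

/// Parses the contents of a /proc/<pid>/stat file.
pub fn parse_stat(stat: &str) -> Option<StatInfo> {
    // The command name (field 2) is wrapped in parentheses and may itself
    // contain spaces or ')', so split at the last ')' rather than on whitespace.
    let rest = &stat[stat.rfind(')')? + 1..];
    let mut fields = rest.split_whitespace();
    let state = fields.next()?.chars().next()?;
    let _ppid = fields.next()?;
    let pgrp = fields.next()?.parse().ok()?;
    Some(StatInfo { state, pgrp })
}

/// A process that is a zombie ('Z') or dead ('X'/'x') is not counted as alive.
pub fn is_live_state(state: char) -> bool {
    !matches!(state, 'Z' | 'X' | 'x')
}

fn read_stat(pid: i32) -> Option<StatInfo> {
    let raw = fs::read_to_string(format!("/proc/{}/stat", pid)).ok()?;
    parse_stat(&raw)
}

/// Positive `pid`: checks that single process.
/// Negative `pid`: checks whether any member of process group `-pid` is alive,
/// mirroring the negative-PID convention of `kill`.
pub fn is_truly_alive(pid: i32) -> bool {
    if pid > 0 {
        return read_stat(pid).map_or(false, |s| is_live_state(s.state));
    }
    // 0 ("my own group") and -1 ("everything") have special meanings for
    // `kill`; this sketch does not handle them and reports false.
    let pgid = match pid.checked_neg() {
        Some(p) if p > 1 => p,
        _ => return false,
    };
    let entries = match fs::read_dir("/proc") {
        Ok(entries) => entries,
        Err(_) => return false,
    };
    entries
        // Keep only the numeric directory names, which are PIDs.
        .filter_map(|e| e.ok()?.file_name().to_str()?.parse::<i32>().ok())
        .filter_map(read_stat)
        .any(|s| s.pgrp == pgid && is_live_state(s.state))
}
```

Scanning all of `/proc` for group members is simple but inherently racy, since processes can appear or exit during the scan; whether that is acceptable depends on how the existing callers of `is_alive` use the result.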
There may be other uses of `is_alive` that need to be taken into account in any refactoring that takes place.