Closed aaaaalbert closed 9 years ago
Excellent issue description. I vote to transfer some of this informative text to `daemon.py`'s docstring header; it's concise and easy to understand. :D
All that's missing is a comment explaining why a module might need to be daemonized (could it be to return a terminal to the user?). We should add a comment of this sort, here: https://github.com/SeattleTestbed/nodemanager/blob/master/nmmain.py#L482-L483
Thanks for the kind words :-)
Tests run fine on Mac and Linux! I'll get hold of a Windows box and test there, too.
Tested on Windows 7 -- works correctly.
I'll summarize my findings and send a pull request.
(I decided to write up what I learned over the last few days so I have a single reference in the future. Let me know what parts of it you think should go into the docstring, if any :-) )
or: Why are we doing this convoluted thing again?
Goals of a daemon process:

[...] `cron`, ...

Notes:
- `fork()` copies the existing parent process and creates a new child process. Execution in both processes continues where the `fork` function call returns: in the parent, it returns the new child's process ID (PID); in the child, it returns 0. The child process inherits the parent's file descriptors, including `stdin`, `stdout`, and `stderr`.
- `wait()` makes a parent process block until one of its immediate children (but not grandchildren etc.) exits.
- [...] `KILL`), etc.

[...] `fork()` to fork off Child 1. The parent process `wait()`s for Child 1 to exit.
[...] `wait()`ing for Child 1 to terminate.

[...] `setsid()`, creating a new session, becoming its leader, and also becoming the process group leader. (Its leadership will become important only after the next `fork()`, see below).
([...] `chdir` into `/`, and set its `umask` to 0. Alternatively, this might be done in Child 2 instead.)

[...] `wait()`ing for Child 1, so if Child 1 continued to run, this would keep Parent alive too.

[...] `fork()` itself, creating Child 2, which is neither the process group leader nor the session leader and therefore cannot reacquire the controlling terminal. Note that Parent does not `wait()` for Child 2, as this is a grandchild.
[...] `init` process will adopt it soon. The consequence of Child 1's exit is that Parent can exit now, too. Eventually, we are left with only Child 2, which is now a daemon:

Note that in contrast to traditional lore, the process ID of the `init` process (`initID` above) is not necessarily `1`. Upstart (and possibly other `init` replacements) has `init --user` processes with different PIDs for graphical sessions, aka "User Session Mode".
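The sequence described above can be sketched in Python roughly as follows. This is a minimal sketch and not Seattle's `daemon.py`: the function name `double_fork_sketch` and the reporting pipe are assumptions made for illustration. Parent returns instead of exiting, and Child 2 exits right after reporting its IDs, so the sequence can be observed safely.

```python
import os


def double_fork_sketch():
    """Run a double fork and return (pid, pgrp, sid) as reported by Child 2.

    Hypothetical helper for illustration only.
    """
    read_fd, write_fd = os.pipe()

    child1 = os.fork()
    if child1 == 0:
        try:
            # Child 1: start a new session. It becomes the session leader
            # and the process group leader.
            os.close(read_fd)
            os.setsid()
            os.chdir("/")
            os.umask(0)

            if os.fork() == 0:
                # Child 2: neither session leader nor process group leader.
                report = "%d %d %d" % (os.getpid(), os.getpgrp(), os.getsid(0))
                os.write(write_fd, report.encode("ascii"))
                os._exit(0)

            # Child 1 exits right away, which unblocks Parent's wait below.
            os._exit(0)
        finally:
            # Only reached if something above raised an exception.
            os._exit(1)

    # Parent: wait for Child 1 only. Child 2 is a grandchild and cannot be
    # waited for from here.
    os.close(write_fd)
    os.waitpid(child1, 0)
    data = os.read(read_fd, 64)
    os.close(read_fd)
    pid, pgrp, sid = (int(field) for field in data.decode("ascii").split())
    return pid, pgrp, sid


if __name__ == "__main__":
    print("Child 2: PID=%d PGRP=%d SID=%d" % double_fork_sketch())
```

Child 2 should report a `PGRP` and `SID` that are equal to each other (both are Child 1's PID) but different from its own PID.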
Further reading:

Code samples:

[...] `init`): http://code.activestate.com/lists/python-list/209948/

I think the following sections should be included in `daemon.py`'s docstring header:

- Goals of a Daemon Process.
- Further Reading.

Include a link to this GitHub issue and/or a separate Wiki document that contains the rest of the informative bits about daemonic processes.
Implemented as suggested by @vladimir-v-diaz in #116.
Seattle has a `daemon` library that helps daemonize the nodemanager process on Unix-like systems. It does this using a variant of the usual "double-fork" approach, which can be summarized as follows (with some terminology borrowed from here):

- [...] `fork()`s off the first child process and exits.
- [...] `fork()`s off the second child process and exits.
- [...] `daemon.daemonize()` function returns. (Note: Waiting here for child 1 to exit appears to be non-standard, but should not hurt either if done correctly.)

Our implementation of the last bit is what gets us in trouble: The code waits for the
`init` process to adopt child 2 after child 1 exited, and tries to detect this by waiting for its parent process ID (`ppid`) to become 1, `init`'s usual PID.

However, Upstart, Ubuntu Linux's current `init` replacement, has a User Session Mode that results in multiple `init` processes running on the system, one "classical" (PID 1) and another one that spawns processes from interactions with the GUI (such as a terminal window), `init --user`, with a different PID. This makes our code loop indefinitely.

What we should do instead can be derived from the table below, showing an instrumented version of
`daemon.py` going through the different phases of forking:

- [...] `setsid` [...] `bash`, which also is the session leader (i.e. the session `ID` is its `PID`). The parent started the process group, `PGRP`.
- [...] `PPID`, still within the same session and process group. The child calls `setsid()` now, resulting in...
- [...] `init --user`.

Fix for
`daemon.py`: In order to wait for Child 1 to exit, Child 2 can check whether its `PGRP` or `SID` equals the `PPID` it sees. If not, Child 1 has exited, and Child 2 can continue.

I'm currently testing this patch on Mac OS X 10.6.8 with Python 2.7.8, and Ubuntu 14.04.1 with Python 2.7.6 running inside VirtualBox 4.3.20.
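The proposed check might look roughly like this. It is a sketch under assumptions, not the actual patch: `child1_has_exited`, `wait_for_child1_exit`, and `demo_ppid_check` are hypothetical names, and the pipes exist only to make the demonstration deterministic.

```python
import os
import time


def child1_has_exited():
    """Return True once the caller (Child 2) has been re-parented.

    Child 1 called setsid(), so its PID is Child 2's PGRP and SID. While
    Child 1 is alive, Child 2's PPID equals both; afterwards it equals neither.
    """
    ppid = os.getppid()
    return ppid != os.getpgrp() and ppid != os.getsid(0)


def wait_for_child1_exit(poll_interval=0.01):
    """Poll until Child 1 is gone, without assuming anything about init's PID."""
    while not child1_has_exited():
        time.sleep(poll_interval)


def demo_ppid_check():
    """Return (exited_before, exited_after) as observed by Child 2."""
    report_r, report_w = os.pipe()  # Child 2 -> Parent
    go_r, go_w = os.pipe()          # Child 2 -> Child 1: "you may exit now"

    child1 = os.fork()
    if child1 == 0:
        try:
            os.setsid()
            if os.fork() == 0:
                # Child 2: take a first look while Child 1 is still alive.
                before = child1_has_exited()
                os.write(go_w, b"x")
                wait_for_child1_exit()
                after = child1_has_exited()
                os.write(report_w, ("%d %d" % (before, after)).encode("ascii"))
                os._exit(0)
            # Child 1: stay alive until Child 2 has taken its first look.
            os.close(go_w)
            os.read(go_r, 1)
            os._exit(0)
        finally:
            # Only reached if something above raised an exception.
            os._exit(1)

    # Parent: keep only the reporting pipe's read end.
    os.close(report_w)
    os.close(go_r)
    os.close(go_w)
    os.waitpid(child1, 0)
    before, after = os.read(report_r, 16).decode("ascii").split()
    os.close(report_r)
    return bool(int(before)), bool(int(after))


if __name__ == "__main__":
    print(demo_ppid_check())
```

Because the check compares against Child 2's own `PGRP` and `SID` instead of a fixed PID, it does not matter which process ends up adopting Child 2.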