famzah / popen-noshell

A much faster popen() and system() implementation for Linux
68 stars 13 forks source link

Don't request SIGCHLD be sent to the parent #12

Closed nicowilliams closed 7 years ago

nicowilliams commented 7 years ago

For background, Solaris/OpenSolaris(RIP)/Illumos have a forkx() and vforkx() that takes a flags option. One of the flags is to not send SIGCHLD to the parent when the child dies. The popen() and system() implementations in those systems do request that SIGCHLD not be sent: because if the parent happens to have a handler for that signal, it can reap the child opened by popen() (or system()), which then makes it impossible to retrieve its exit status in pclose() or system().

nicowilliams commented 7 years ago

Thanks!

famzah commented 7 years ago

For the record, the above commit was reverted. SIGCHLD is required: https://github.com/famzah/popen-noshell/commit/64c51f5099af9da3c75f53cd3d4a74e4a2117b49

nicowilliams commented 7 years ago

What is it required by? Your existing consumers? I can understand that, though SIGCHLD is not reliable in general, it might be for your users' use-cases.

famzah commented 7 years ago

It's required so that you can wait for and get the exit status of the child process. Otherwise I can't re-implement pclose().

Can you do waitpid() for a child process when SIGCHLD is never sent? My (very fast) tests show that in such a case the terminated child process is reaped automatically, and it's too late when we call waitpid() in the parent.

nicowilliams commented 7 years ago

You know that the child exited when either the pipe to its stdin gets EPIPE or SIGPIPE or when you get EOF from its output pipe. At that point you can call waitpid(2). And yes, you absolutely can call waitpid(2) and waitid(2) with a specific PID without having received SIGCHLD.

Oh. But maybe Linux reaps it automatically if you didn't get SIGCHLD? That isn't documented in the clone(2) man page. I can believe that it's true, but that would be very disappointing! Normally automatic reaping is only supposed to happen if you ignore SIGCHLD.

Well, I guess I've wasted your time here :(

famzah commented 7 years ago

I haven't spent much time to research it. While running the unit tests, waitpid() failed with ECHILD (No child processes).

Here is a sample strace:

famzah@vbox64:~/popen-noshell$ gcc -Wall popen_noshell.c popen_noshell_tests.c -o popen_noshell_tests && strace -f ./popen_noshell_tests

mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb3b7a03000
clone(strace: Process 4311 attached
 <unfinished ...>
[pid  4311] open("/dev/null", O_RDWR)   = 5
...
[pid  4311] execve("/", ["/"], [/* 80 vars */]) = -1 EACCES (Permission denied)
# this strange execve() is part of the unit test, so it's expected
...
[pid  4311] exit_group(255)             = ?
[pid  4310] <... clone resumed> child_stack=0x7fb3b8203000, flags=CLONE_VM|CLONE_VFORK) = 4311
...
[pid  4310] wait4(4311, 0x7fffeb822d74, 0, NULL) = -1 ECHILD (No child processes)

If I set SIGCHLD in the clone() flags, then the strace looks a bit differently:

...
[pid  4355] exit_group(255)             = ?
[pid  4354] <... clone resumed> child_stack=0x7f6cedde3000, flags=CLONE_VM|CLONE_VFORK|SIGCHLD) = 4355
...
[pid  4354] wait4(4355,  <unfinished ...>
[pid  4355] +++ exited with 255 +++
<... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 255}], 0, NULL) = 4355
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=4355, si_uid=1000, si_status=255, si_utime=0, si_stime=0} ---
munmap(0x7f6ced5e3000, 8392704)         = 0
...
nicowilliams commented 7 years ago

Oh, check this:

1976        else if ((clone_flags & CSIGNAL) != SIGCHLD)
1977            trace = PTRACE_EVENT_CLONE;

It seems you may need to add __WCLONE to the wait options. Can you try that?

Also, if you make the child sleep for a long time, can you see with ps(1) or /proc if it got reaped? I bet it didn't get reaped, it just isn't visible to waitpid(2) unless you add __WCLONE to the wait options.

nicowilliams commented 7 years ago

Aha! Just use __WALL in the waitpid(2) options. __WCLONE didn't work in all cases, but __WALL did:

diff --git a/popen_noshell.c b/popen_noshell.c
index c935cf4..0a7b0ab 100644
--- a/popen_noshell.c
+++ b/popen_noshell.c
@@ -352,7 +352,7 @@ pid_t popen_noshell_vmfork(int (*fn)(void *), void *arg, void **memory_to_free_o
                 */

 #ifndef POPEN_NOSHELL_VALGRIND_DEBUG
-               pid = clone(fn, stack_aligned, CLONE_VM | SIGCHLD | CLONE_VFORK, arg);
+               pid = clone(fn, stack_aligned, CLONE_VM | CLONE_VFORK, arg);
 #else
                pid = fork(); // Valgrind does not support arbitrary clone() calls, so we use fork for the tests
 #endif
@@ -649,7 +649,7 @@ int pclose_noshell(struct popen_noshell_pass_to_pclose *arg) {
                return -1;
        }

-       if (waitpid(arg->pid, &status, 0) != arg->pid) {
+       if (waitpid(arg->pid, &status, __WALL) != arg->pid) {
                return -1;
        }
$ cc -o t popen_noshell.c popen_noshell_tests.c
$ ./t
Tests passed OK.
$ 
nicowilliams commented 7 years ago

So, this gets pretty close to Illumos forkx(2) semantics, as long as all users of __WALL in the process wait for specific PIDs.

nicowilliams commented 7 years ago

Of course, for an actual patch you'll want to do something like:

#ifdef __WALL
#define sig_request 0
#else
#define __WALL 0
#define sig_request SIGCHLD
#endif
famzah commented 7 years ago

Right. Thanks for the research! I've committed this. Tests pass OK, indeed.

The flag __WALL is available since Linux 2.4 and therefore there is no need to test for its existence. I assume that everyone has already upgraded their systems at least to Linux 2.6.

nicowilliams commented 7 years ago

Yes, 2.6 is good enough!