Open liyu1981 opened 10 months ago
The problem analysis is correct.
However, I disagree with bundling multiple return values to enforce upper time bounds on return values, since the user would then need to handle all the edge cases, and those are hidden from the user.
Instead, we should use WNOHANG with waitid (https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/functions/waitid.html) and read siginfo_t (https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html) to fix the more structural problem that incoming signaling is racy, for example when 2 or more children terminate at the same time.
Multiple identical signals from the same process could still be lost, which is a more fundamental OS design problem.
This would still not fix the sending side being racy; see pidfd #18508 for that (sadly Linux-only).
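To make the waitid suggestion concrete, here is a minimal C sketch of draining every child that is ready, without blocking. One SIGCHLD can stand for several terminated children, so the loop keeps calling waitid with WNOHANG until nothing more is available. The helper names reap_ready_children and spawn_and_reap are hypothetical and exist only for this illustration; this is not what std/child_process.zig does today.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: reap every child that has already terminated,
 * without blocking. Returns the number reaped, or -1 on error. */
static int reap_ready_children(void)
{
    int reaped = 0;
    for (;;) {
        siginfo_t info;
        memset(&info, 0, sizeof info);  /* si_pid stays 0 if nothing is ready */
        if (waitid(P_ALL, 0, &info, WEXITED | WNOHANG) == -1)
            return (errno == ECHILD) ? reaped : -1;  /* ECHILD: no children left */
        if (info.si_pid == 0)
            return reaped;              /* children exist, but none is ready yet */
        printf("pid %ld: %s, si_status=%d\n", (long)info.si_pid,
               info.si_code == CLD_EXITED ? "exited" : "killed by signal",
               info.si_status);
        reaped++;
    }
}

/* Hypothetical demo: fork n children that exit at once, then poll until
 * all of them have been reaped. Returns the total reaped, or -1 on error. */
static int spawn_and_reap(int n)
{
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0)
            _exit(i);                   /* child: terminate immediately */
    }
    int total = 0;
    for (int attempt = 0; total < n && attempt < 5000; attempt++) {
        int r = reap_ready_children();
        if (r == -1)
            return -1;
        total += r;
        if (total < n) {
            struct timespec ts = { 0, 1000000 };  /* sleep 1 ms between polls */
            nanosleep(&ts, NULL);
        }
    }
    return total;
}
```

Calling spawn_and_reap(3) should report three children even when they terminate at nearly the same moment, because each waitid call returns a separate siginfo_t.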
Zig Version
0.12.0-dev.1828+225fe6ddb
Steps to Reproduce and Observed Behavior
Let us consider a simple C program.
It raises SIGTSTP on itself, so it is stopped after it runs. (In real life this could be a long-running external program that another user stops with kill -STOP.) Compile it with
zig cc exit_stop.c -o exit_stop
Suppose I use the following Zig code to run it (from the same directory as exit_stop).
Now run it:
zig test run_child_test.zig
You will see run_child_test.zig run forever; it never returns an rr with rr.term.Stopped set.
The reason, as far as I could investigate, is in std/child_process.zig around line 406. At either mark 1 or mark 2 we call os.wait4 with option 0. According to https://pubs.opengroup.org/onlinepubs/9699919799/functions/waitpid.html, waitpid only returns for a stopped child process when the corresponding option is set. So with option 0, the call will not return.
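The behavior can be reproduced outside Zig with a short C sketch. It assumes the child stops itself with SIGSTOP rather than the SIGTSTP mentioned above, because a process in an orphaned process group does not stop on SIGTSTP, and SIGSTOP keeps the example deterministic. The function name observe_child_stop is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that stops itself, then wait for it
 * with WUNTRACED. Returns the stop signal reported by waitpid, or -1. */
static int observe_child_stop(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        raise(SIGSTOP);                 /* child: stop until SIGCONT arrives */
        _exit(0);
    }

    int status = 0;
    int stop_sig = -1;
    /* With options == 0 this call would block for as long as the child
     * stays stopped; WUNTRACED makes it return for the stop itself. */
    if (waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status))
        stop_sig = WSTOPSIG(status);

    kill(pid, SIGCONT);                 /* let the child run to its _exit(0) */
    waitpid(pid, &status, 0);           /* reap it so no zombie is left */
    return stop_sig;
}
```

Replacing WUNTRACED with 0 in the first waitpid call gives the hang described above: the parent waits for a termination that never comes while the child is stopped.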
The second finding is that later, around line 484 of child_process.zig, both mark1 and mark2 should be unreachable or @panic. For mark1, we know that until the issue above is fixed it can never be true. For mark2, if we read https://pubs.opengroup.org/onlinepubs/9699919799/functions/exit.html, we will notice that status will never be bigger than 255, so it will never reach the end of our ifs.
Expected Behavior
At the least, the first (SIGSTOP) part should be fixed, either by using WUNTRACED | WCONTINUED by default, as suggested by POSIX, or by letting the user pass in an options parameter.
For the second part (Unknown), I strongly suggest changing it to @panic, so that if it ever really happens it leaves some trace.
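As an illustration of the expected behavior, here is a C sketch that waits with WUNTRACED | WCONTINUED and decodes the status into the four cases POSIX defines. It aborts with a message in the branch that should never be reached, which is the C analogue of the @panic suggested here. The names child_state, classify_status and run_and_classify are hypothetical, not part of any existing API.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef enum { CHILD_EXITED, CHILD_SIGNALED, CHILD_STOPPED, CHILD_CONTINUED } child_state;

/* Hypothetical helper: decode a wait status. *detail receives the exit
 * code or the signal number, depending on the returned state. */
static child_state classify_status(int status, int *detail)
{
    if (WIFEXITED(status))    { *detail = WEXITSTATUS(status); return CHILD_EXITED; }
    if (WIFSIGNALED(status))  { *detail = WTERMSIG(status);    return CHILD_SIGNALED; }
    if (WIFSTOPPED(status))   { *detail = WSTOPSIG(status);    return CHILD_STOPPED; }
    if (WIFCONTINUED(status)) { *detail = SIGCONT;             return CHILD_CONTINUED; }
    /* Should be unreachable: fail loudly instead of returning "unknown". */
    fprintf(stderr, "unexpected wait status: %d\n", status);
    abort();
}

/* Hypothetical demo: run a child that exits with exit_code and classify
 * the first state change. Returns a child_state value, or -1 on error. */
static int run_and_classify(int exit_code, int *detail)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(exit_code);               /* child: exit immediately */

    int status = 0;
    if (waitpid(pid, &status, WUNTRACED | WCONTINUED) != pid)
        return -1;
    return (int)classify_status(status, detail);
}
```

A child that calls _exit(300) is reported as CHILD_EXITED with detail 44, since only the low 8 bits of the exit status reach waitpid. That matches the observation that the exit code never exceeds 255.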