Closed a1291762 closed 1 year ago
Sounds good. Since we're no longer reading the pid from sigchld_pipe
, we should probably drain this separately, so the SIGCHLD handler doesn't (eventually) block on a write.
This may warrant changing what we're actually writing to the pipe, since we no longer need the pid...
Hmm. Good point about the pipe. I'll restore the read to this patch.
I won't change the write/read of a pid_t since I doubt it makes much difference (compared to say, a single char).
Looks good; just one open item though: the read()
will consume one write()
from the signal handler at a time (and trigger a waitpid() on all current children). We might have multiple write()
s happen before we enter sigchld_pipe_event
, and so more than one read()
to perform.
Overall, this just means we'll re-enter sigchld_pipe_event
after the next poll - we could avoid that by totally draining the sigchld_pipe
instead of doing the single read.
I'm okay with either option - but if you prefer the drain, I can wait for an update.
Hmm... Yeah, I see what you mean. I guess it is rare to enter sigchld_pipe_event after multiple writes anyway? I think we only see it because we have a single, slow, ARM core are we're spawning several downloads in parallel.
To be honest, I'm not sure how much point there is in trying to avoid calling this function when there's nothing to reap. Maybe something "best effort" like this?
/* Drain the pipe (there may be multiple queued pids) */
do {
rc = read(procset->sigchld_pipe[0], &pid, sizeof(pid));
} while (rc == sizeof(pid));
This has no error checking, but the previous code didn't have error checking either. In the event of a transient error (eg. EAGAIN) there will be another pipe event because the pipe still contains data to read. The primary difference here is that child reaping happens on this iteration rather than the next one. Even on our slow ARM core, I wouldn't expect to see an observable performance delta due to this change but maybe it makes sense from a "correctness" perspective?
Yeah, it's all fairly trivial anyway; there's no harm in calling sigchld_pipe_event
multiple times, aside from causing a few extra syscalls.
Your looping read
will end up blocking though - if we do anything about this, maybe just a larger buffer size, and permit shorter reads?
I'm fine with the current code though - let me know if you'd prefer a merge as-is.
Your looping read will end up blocking though
Ha! Well that's why when I built this and threw it on my device, the boot menu didn't populate any entries... :facepalm:
If the read is going to block, I'd suggest just merging as-is. It works, even if it may cause the pipe handler a few more times than it needs to.
Sounds good!
Merged to my tree, will push shortly. Thanks for the contribution.
SIGCHLD is not guaranteed to be sent for all children, especially if 2 children die at the same time.
Instead of trying to find the specific child that died, just reap all of the async processes that have finished running.
This removes a hang observed when downloading an image over the network. 5 wget processes are started, and some of them finished close enough together that only a single SIGCHLD was delivered to pb-discover.
Signed-off-by: Lincoln Ramsay lincoln.ramsay@opengear.com