ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

os.waitpid does not handle os.WNOHANG flag #7143

Open stef opened 4 years ago

stef commented 4 years ago

In a forking server I want to check whether any of the forked worker processes has finished processing and, if so, reap the zombie, but I do not want to block on this. So I pass os.WNOHANG to waitpid. Unfortunately, if there is no zombie child, waitpid returns ECHILD, which hits the unreachable branch. The comment in the current implementation is wrong for this case: https://github.com/ziglang/zig/blob/master/lib/std/os.zig#L3263

excerpt from the implementation:

        const rc = system.waitpid(pid, &status, flags);
        switch (errno(rc)) {
            0 => return .{
                .pid = @intCast(pid_t, rc),
                .status = @bitCast(u32, status),
            },
            EINTR => continue,
            ECHILD => unreachable, // The process specified does not exist. It would be a race condition to handle this error.

stef commented 4 years ago

I worked around this by lifting the implementation from os.waitpid() and modifying it just enough to get it working for my use case:

const Status = if (builtin.link_libc) c_uint else u32;
var status: Status = undefined;
// Non-blocking: reap any child that has already exited.
const rc = os.system.waitpid(-1, &status, os.WNOHANG);
if (rc > 0) {
    // rc is the pid of the reaped child.
    try kids.del(rc);
    if (cfg.verbose) warn("removing done kid {} from pool\n", .{rc});
}

ghost commented 1 year ago

Strange: the manual page for waitpid seems to say that waitpid should return 0 in this case, and not an error.
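
To make the two cases concrete, here is a minimal C sketch of the underlying POSIX call (C rather than Zig, since this is about what the system call itself reports). The names try_reap_any and enum reap_result are hypothetical and exist only for this illustration. As far as I understand POSIX, waitpid with WNOHANG returns 0 only while at least one matching child still exists but has not changed state; when the caller has no children at all it fails with ECHILD, which would match the behavior in the original report.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h> /* fork() and friends for callers of this helper */

/* Hypothetical result type for this sketch; not part of libc or Zig. */
enum reap_result {
    REAP_NONE_READY,  /* children exist, but none has exited yet (rc == 0) */
    REAP_REAPED,      /* one child was reaped; *out_pid holds its pid */
    REAP_NO_CHILDREN, /* the caller has no children at all (ECHILD) */
    REAP_ERROR        /* any other failure; errno is left set */
};

/* Non-blocking attempt to reap any one child of the calling process. */
enum reap_result try_reap_any(pid_t *out_pid, int *out_status) {
    assert(out_pid != NULL && out_status != NULL);
    for (;;) {
        pid_t rc = waitpid(-1, out_status, WNOHANG);
        if (rc > 0) {
            *out_pid = rc;
            return REAP_REAPED;
        }
        if (rc == 0)
            return REAP_NONE_READY;
        if (errno == EINTR)
            continue; /* interrupted by a signal: retry */
        if (errno == ECHILD)
            return REAP_NO_CHILDREN; /* an ordinary outcome for a pool that may be empty */
        return REAP_ERROR;
    }
}
```

A worker pool would call try_reap_any in its main loop and treat REAP_NO_CHILDREN as "pool is empty" rather than as a programming error.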

matu3ba commented 1 year ago

Sounds like you had already reaped the child and then checked for it again via waitpid, which is a valid thing to do, since the child may ignore SIGABRT. However, Zig's waitpid is not designed for that use case and would need an optional argument specifying the expected behavior.

I don't remember, and can't find right now, the correct call to signal a process in a specific process group (I think that one was not portable), nor the best way to check whether the process with a given pid and process group is still running.
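
One possible answer, sketched in C under the assumption that plain POSIX is acceptable: kill with a negative pid signals a whole process group, and kill with signal 0 only performs the existence and permission check. The helpers signal_group and in_group_and_alive are hypothetical names for this sketch, not existing API. Note that a zombie that has not been reaped yet still counts as existing for this check.

```c
#define _XOPEN_SOURCE 700 /* for getpgid() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send `sig` to every process in group `pgid`.
 * Rejects pgid <= 1, because kill(-1, sig) would instead target every
 * process the caller is allowed to signal. */
int signal_group(pid_t pgid, int sig) {
    if (pgid <= 1) {
        errno = EINVAL;
        return -1;
    }
    return kill(-pgid, sig);
}

/* Hypothetical helper: returns 1 if `pid` exists and belongs to group
 * `pgid`, 0 if it does not exist or is in another group, -1 on error. */
int in_group_and_alive(pid_t pid, pid_t pgid) {
    assert(pid > 0);
    /* Signal 0 sends nothing; EPERM still means the process exists. */
    if (kill(pid, 0) == -1 && errno == ESRCH)
        return 0;
    pid_t actual = getpgid(pid);
    if (actual == (pid_t)-1)
        return errno == ESRCH ? 0 : -1;
    return actual == pgid ? 1 : 0;
}
```

Both checks are inherently racy: the pid can exit, and in principle be reused, between the check and any later action, so this is only a hint and not a replacement for reaping via waitpid.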