Consider `waitpid` - Githubissues

Jake-Shadle commented 7 months ago

As mentioned in this PR, we could use waitpid first and then fallback to the current implementation of reading the process status from /proc/<pid>/stat if the waitpid fails for some reason. However, we currently have a timeout (defaulting to 100ms) waiting for the state to change, before going ahead with the process dumping, meaning if we wanted to keep a timeout/best effort stopping, we'd need to do something similar to this to be able to timeout the waitpid if it takes too long. In most cases I'd assume the waitpid would succeed/fail quickly and the timeout wouldn't matter, but that's not guaranteed, and the current polling solution, while less efficient, is quite simple.

afranchuk commented 7 months ago

Wait, are the processes likely to be children of the minidump-writer? According to the man page they can strictly only be direct child processes, otherwise ECHILD is returned.

Jake-Shadle commented 7 months ago

Oops, I didn't read that far, yah in most cases it will actually be the opposite, the process doing the minidump writing is the parent.

gabrielesvelto commented 7 months ago

Yeah, my suggestion was because there's only two cases here when using WUNTRACED: either the processes are in the same process group then the stopped process will indeed stop and waitpid() will return (which is what we want), or it's not, in which case you get ECHILD and have to proceed with the regular method of stopping one thread at a time. BTW I don't know how the kernel handles a race where a SIGCONT is sent after a SIGSTOP but before the process actually stopped, but I guess it should be handled sanely.

Jake-Shadle commented 7 months ago

It'd be interesting to try and create a test case for that, but I don't have a clue how one could reliably get into such a situation.

afranchuk commented 7 months ago

@gabrielesvelto forgive me if I'm missing the detail, but you didn't mention the child relationship in the first case of your comment. As far as I understand, waitpid cannot be generalized to just the process group, those processes must also still be direct children of the calling process (i.e., waitpid won't work on process group siblings/ancestors/parents/etc). Is that still a case we think is likely enough to attempt to account for?

Wrt SIGCONT/SIGSTOP, I've been assuming that is handled sanely and there is no race. According to https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html:

When any stop signal (SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU) is generated for a process or thread, all pending SIGCONT signals for that process or any of the threads within that process shall be discarded. Conversely, when SIGCONT is generated for a process or thread, all pending stop signals for that process or any of the threads within that process shall be discarded.

But that doesn't guarantee that the Linux implementation meets that requirement.

gabrielesvelto commented 7 months ago

@gabrielesvelto forgive me if I'm missing the detail, but you didn't mention the child relationship in the first case of your comment. As far as I understand, waitpid cannot be generalized to just the process group, those processes must also still be direct children of the calling process (i.e., waitpid won't work on process group siblings/ancestors/parents/etc). Is that still a case we think is likely enough to attempt to account for?

Yeah, waitpid()'s Linux documentation confused me a bit, I was under the impression that siblings in a process group could waitpid(..., WUNTRACED) each other after a SIGSTOP. I tested it and that's not the case, so that's settled.

As @Jake-Shadle said the process writing the minidumps is often a parent of the minidumped processes, but not always. We could leverage this to do the following:

Send SIGSTOP to the target process
Call waitpid(..., WUNTRACED) on the target process, if it succeeds we're good. We don't need to deal with a timeout here, because we know that waitpid() will return after a certain amount of time (if the target process is a child) or will return immediately with a failer (if it isn't), but we can't get stuck on it
- If waitpid() failed with ECHILD then try polling /proc/<pid>/stat. Here we do need the timeout as we could be racing against another process sending a SIGCONT to the target process and we'd never see the status become stopped

There's another upside to using waitpid() where possible. In the case of Android we might not have to access to /proc, but we'd still be able to waitpid() a child. Given this is a best effort attempt I think it's good to cover all scenarios. Unless there's a downside I'm missing.

Wrt SIGCONT/SIGSTOP, I've been assuming that is handled sanely and there is no race. According to https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html:

When any stop signal (SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU) is generated for a process or thread, all pending SIGCONT signals for that process or any of the threads within that process shall be discarded. Conversely, when SIGCONT is generated for a process or thread, all pending stop signals for that process or any of the threads within that process shall be discarded.

But that doesn't guarantee that the Linux implementation meets that requirement.

I've checked the kernel just in case, it implements this precisely this way. Once SIGCONT arrives all pending SIGSTOP signals are cleared from the process' threads and they're all restarted regardless of whether they had already stopped or not.

rust-minidump / minidump-writer

Consider `waitpid` #111