giampaolo / psutil

Cross-platform lib for process and system monitoring in Python
BSD 3-Clause "New" or "Revised" License
10.22k stars 1.38k forks source link

[Linux / FreeBSD] evaluate using `pidfd_send_signal()` for signaling processes #2400

Closed giampaolo closed 2 months ago

giampaolo commented 5 months ago

See: https://copyconstruct.medium.com/seamless-file-descriptor-transfer-between-processes-with-pidfd-and-pidfd-getfd-816afcd19ed4

Long story short: pidfd_getfd() used in conjunction with pidfd_send_signal() would prevent signaling the wrong process PID in case PID is reused by another process. In theory this should not be a problem because psutil already pre-emptively checks if PID has been reused prior to kill(), terminate(), send_signal() and all "set" methods (nice(), ionice(), etc.). Still, the pseudo code below (which represents what psutil does internally) is theoretically subject to a race condition in case PID gets reused between the execution of lines 1 and 2:

if not pid_reused(pid):
     send_signal(sig, pid)

pidfd_getfd() + pidfd_send_signal() would eliminate this theoretical risk at least for kill(), terminate() and send_signal() methods.

It's interesting to note how FreeBSD offers similar syscalls, called pdgetpid() and pdkill(): https://man.freebsd.org/cgi/man.cgi?query=procdesc&sektion=4 macOS doesn't seem to have them though.

Again, this is just a theoretical problem since we already check for PID reuse internally, so it's kind of low priority. Still, I wanted to write this down somewhere for future self-reference.

Explorer09 commented 5 months ago

Hi. I'm not sure if you are going to propose a solution for the kernel side. Since you mentioned about tracking the process creation time in htop-dev/htop#1441, I think a long term solution to the problem is to introduce a new kill() system call that adds a process creation time as an argument as well as PID. Without that it won't help. (I think the kernel developers would likely reject the idea stating it is practically not a problem.)

The pidfd is probably not designed for the use case like ours, where we track the updates of all processes rather than a specific few. pidfd would force the OS to reserve process references to us while a process manager app is not supposed to hold such resources for a long time. It's easy to mess up when managing the pidfd resources.

giampaolo commented 5 months ago

Yes, the fact that pidfd_getfd() can fail with "Too many open files" after 1024 calls basically kills this proposal. I believe pidfd_getfd() is intended to be used by those apps which spawn and handle their own process children. It's not intended for htop-like apps which monitor all PIDs. Still, to me it looks like pidfd_getfd() is also racy:

pid = os.fork()
fd = pidfd_getfd(pid)
pidfd_send_signal(fd)

If PID is reused between line 1 and 2 you have a race. I've researched this topic for a while now, and the impression I got is that this problem is currently unsolvable. Basically as long as PIDs are represented as non-unique numbers, there's nothing a user space app can do.

As for kill() accepting a creation time arg: I'm not a kernel developer, and even if an OS were to support such a function, it wouldn't be cross platform.

I think the pid + create_time pair strategy used by psutil is the best we can do. It's racy the same way pidfd_getfd() is racy, but at least it can be implemented on all platforms. I expect the chances of incurring into the race condition to be very low, and to depend on the OS recycling algorithm. I would expect that the OS won't immediately reuse a PID, but who knows...

Explorer09 commented 5 months ago

@giampaolo

pid = os.fork()
fd = pidfd_getfd(pid)
pidfd_send_signal(fd)

If PID is reused between line 1 and 2 you have a race.

Not if the process is the parent. In Unix-like systems, PIDs of dead children are reserved for parents and thus the PIDs won't be reused until the parent has done with them. Please understand the concept of "zombie processes" carefully.

giampaolo commented 5 months ago

Oh you're right! So pidfd_getfd() is not racy in that case (but it is racy if PID is not a parent's child).

Explorer09 commented 5 months ago

FreeBSD does not allow opening arbitrary processes for file descriptors. The "process descriptors" in FreeBSD can only be created by pdfork(2) so process managers have no way to use them (they are surely designed for parent process only).

Linux has pidfd_open(2), which is what we can use.

giampaolo commented 2 months ago

Closing as per my previous https://github.com/giampaolo/psutil/issues/2400#issuecomment-2047851524.