Open jhoekx opened 2 years ago
From an outside, black-box point of view, this issue smells like the round trips between processes that occur when more than one read or write call is required on the pipe are taking a surprising amount of time.
While they could have different causes, it wouldn't surprise me if #98493 turns out to be related.
Bug report
We are using multiprocessing with the `spawn` start method. On my 32-thread PC, starting all worker processes for my project used to take 2 seconds. At a certain point, it jumped straight to taking 20 seconds. The slowdown appears as soon as more than 64 KB needs to be sent to a child process over the pipe.
Consider this minimal reproduction case:
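The reproduction script itself did not survive extraction. A minimal sketch along the lines described (hypothetical, not the reporter's exact code) might look like this: with the `spawn` start method, each `Process` object, including its arguments, is pickled and written to the child over a pipe at `start()` time, so a payload above 64 KiB forces multiple pipe writes.

```python
import multiprocessing as mp
import time


def worker(data):
    # The child does nothing; all the cost is in getting `data` to it.
    pass


def spawn_workers(n, payload_bytes):
    """Start n spawn-method workers, each receiving a payload of the given size."""
    ctx = mp.get_context("spawn")
    payload = b"x" * payload_bytes
    start = time.monotonic()
    # With spawn, the pickled Process (args included) goes down a pipe.
    procs = [ctx.Process(target=worker, args=(payload,)) for _ in range(n)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.monotonic() - start, [p.exitcode for p in procs]


if __name__ == "__main__":
    # Below ~64 KiB the payload fits in a single pipe write; above it,
    # startup needs multiple writes and (per the report) slows down sharply.
    for size in (60_000, 100_000):
        elapsed, _ = spawn_workers(4, size)
        print(f"{size} bytes: {elapsed:.2f}s")
```

The `__main__` guard matters: with `spawn`, each child re-imports the main module, so unguarded top-level code would recursively start workers.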
I added some "instrumentation" in `multiprocessing/popen_spawn_posix.py` to print the buffer size. Running the example results in:
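The instrumented output did not survive extraction. The size being printed can be approximated outside the interpreter internals with a small hypothetical helper that mirrors what the spawn path does: pickle the object into a buffer and measure how many bytes would be written to the pipe.

```python
import io
import pickle


def spawn_payload_size(obj):
    # Approximation (assumption, not the actual instrumentation): pickle
    # the object into a buffer and report how many bytes would go down
    # the spawn pipe for it.
    buf = io.BytesIO()
    pickle.dump(obj, buf, protocol=pickle.HIGHEST_PROTOCOL)
    return len(buf.getvalue())


# A 100,000-byte payload pickles to well past the 64 KiB default pipe buffer.
print(spawn_payload_size(b"x" * 100_000))
```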
Changing the pipe size with `fcntl` in `multiprocessing/popen_spawn_posix.py` restores performance. (1031 is `fcntl.F_SETPIPE_SZ`, which is not in Python 3.9.) Rerunning the reproduction case after this change:
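The patch itself did not survive extraction, but the technique it describes can be sketched in isolation: use `fcntl` with the Linux `F_SETPIPE_SZ` command (value 1031, exposed as `fcntl.F_SETPIPE_SZ` only from Python 3.10 on) to grow a pipe's buffer beyond the 64 KiB default. This is a standalone illustration, not the actual change made inside `popen_spawn_posix.py`.

```python
import fcntl
import os

# Hard-code the Linux constant when the fcntl module doesn't expose it
# (Python < 3.10), as the report does.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)

r, w = os.pipe()
# Grow the pipe buffer from the 64 KiB default to 1 MiB so a large pickled
# payload fits in a single write. Linux-specific; requests above
# /proc/sys/fs/pipe-max-size fail with EPERM for unprivileged processes.
new_size = fcntl.fcntl(w, F_SETPIPE_SZ, 1024 * 1024)
print(new_size)  # the kernel reports the actual (power-of-two) size set
os.close(r)
os.close(w)
```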
Of course, changing the pipe size will only delay the onset of the problem. The real solution (if there is one) will probably be different. Blindly setting a pipe size might also not be safe, as it depends on limits set in `/proc`.

The example above is a best case, since it has very limited pickle overhead. We hit this limit without any data caches involved; it's just our Python objects that live after application initialization, and they are slower to pickle. Even then, things are still 10x slower, so it's not just the fixed 80 ms seen in the example.
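The `/proc` limit mentioned above can be inspected directly; on Linux it caps how large an unprivileged process may grow a pipe buffer (a sketch, assuming a Linux system):

```shell
# F_SETPIPE_SZ requests above this value fail with EPERM for
# unprivileged processes; the default is typically 1048576 (1 MiB).
cat /proc/sys/fs/pipe-max-size
```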
We use `spawn` instead of `fork` on Linux to avoid troubles with objects that cannot be pickled on other OSs (Windows).

Your environment