[Open] standage opened this issue 4 years ago
Can you not control the CPU usage of your third-party program with command-line switches or environment variables? In many numerical Python programs, it is essential to ensure that linear algebra libraries run in single-threaded mode when using xdist, since by default one would end up with `ncores * ncores` active threads all competing for `ncores` logical CPUs. Forgetting to set `MKL_NUM_THREADS=1` before a parallel pytest run makes the local machine virtually unresponsive and makes the total test time explode due to excessive context switching.
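As a sketch of the single-threaded-BLAS setup described above, the environment variables can be pinned in a `conftest.py` so every xdist worker inherits them before NumPy or the BLAS library is first imported. The variable names are the standard ones read by MKL, OpenBLAS, and OpenMP; putting this in `conftest.py` is an assumption about your project layout:

```python
# conftest.py (sketch): pin BLAS/OpenMP thread pools to one thread per
# xdist worker, so N workers use roughly N CPUs rather than N * ncores.
import os

# setdefault() lets an explicit value already in the environment win.
for var in ("MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS"):
    os.environ.setdefault(var, "1")
```

This only works if it runs before the numerical libraries initialize their thread pools, which is why a top-level `conftest.py` (imported early by pytest) is a convenient place for it.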
@bashtage Unfortunately no, this is a scientific app distributed as a closed-source binary. Neither the limited documentation nor the program's usage statement mentions any way to control CPU utilization.
Xdist currently has no mechanism to do that kind of scheduling.
> Unfortunately no, this is a scientific app distributed as a closed-source binary. Neither the limited documentation nor the program's usage statement mentions any way to control CPU utilization.

Could you wrap the call to the binary in another program that would limit its CPU usage, e.g., cpulimit?
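If the cpulimit route pans out, the wrapper can be as small as prefixing the command line. A sketch, assuming the cpulimit tool is installed (its `-l` flag caps the process at a total CPU percentage; the binary name here is a placeholder):

```python
def cpulimited(cmd, percent):
    """Build a command line that runs `cmd` under cpulimit's CPU cap.

    Assumes cpulimit is installed; -l is its percentage-limit flag.
    """
    return ["cpulimit", "-l", str(percent)] + list(cmd)

# e.g. cap a hypothetical solver at one core's worth of CPU:
print(cpulimited(["./solver", "input.dat"], 100))
```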
@bashtage I'll have to look into that. It looks like `cpulimit` requires admin privileges?
You could play around with implementing your own scheduler using the `pytest_xdist_make_scheduler` hook.
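For reference, `pytest_xdist_make_scheduler` is a real pytest-xdist hook: if an implementation in `conftest.py` returns a scheduler object, xdist uses it instead of the default. A minimal sketch, assuming pytest-xdist is installed; the subclass name and the idea of overriding dispatch are illustrative, not an existing API:

```python
# conftest.py (sketch)

def pytest_xdist_make_scheduler(config, log):
    # Imported lazily so this file also loads when xdist is absent.
    from xdist.scheduler import LoadScheduling

    class CpuAwareScheduling(LoadScheduling):
        """Hypothetical subclass: override the dispatch methods here to
        hold back tests marked as CPU-heavy (logic not shown)."""

    return CpuAwareScheduling(config, log)
```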
@nicoddemus I'll take a look, thanks!
One of my workflows runs a third-party program that runs at 300-400% CPU utilization with no way to control the number of cores/threads used. When I run the workflow's test suite with `pytest -n 8`, eight test functions will run simultaneously. But the actual CPU utilization could be much higher depending on how many tests running this third-party program are active at the same time. I can do a bit of quick math to compensate and run fewer parallel worker processes, but then the rest of the test suite runs more slowly when that program is not being run.

If there were a way to mark the relevant test functions and say "this function consumes 4 CPUs", then it would be possible to strike the ideal balance between using all available CPUs and not overwhelming the system.
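To make the idea concrete, here is a pure-Python sketch of the bookkeeping a weight-aware scheduler would need; the function and test names are illustrative, not part of any xdist API. With an 8-CPU budget, two tests weighted at 4 CPUs each fill the machine, and everything else must wait:

```python
def dispatchable(pending, running, cpu_budget):
    """Pick tests from `pending` whose CPU weight, added to the weights
    already `running`, stays within `cpu_budget`."""
    used = sum(weight for _, weight in running)
    picked = []
    for name, weight in pending:
        if used + weight <= cpu_budget:
            picked.append(name)
            used += weight
    return picked

queue = [("test_heavy_a", 4), ("test_heavy_b", 4),
         ("test_heavy_c", 4), ("test_light", 1)]
print(dispatchable(queue, running=[], cpu_budget=8))  # two heavy tests fit
```

This greedy pass is deliberately naive (it never reorders the queue to squeeze in the light test); a real scheduler would also need to re-run the check each time a test finishes and frees its weight.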