Open dilpath opened 6 months ago
Well, technically, even if the process runs on a single core it could still use multiple threads via hyper-threading, but overall this is interesting and I wasn't aware of it. Good that I usually use snakemake for benchmarking, which externally limits the number of threads to a fixed number.
numpy recommends use of https://github.com/joblib/threadpoolctl to limit use of threads in native libraries. Probably makes sense to make this configurable, similar to the `n_threads` attribute in `AmiciObjective`, but at an engine level.
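A minimal sketch of what such a limit could look like with threadpoolctl's `threadpool_limits` context manager. The helper name `eigh_with_thread_limit` is hypothetical and not part of pyPESTO, and the fallback for a missing threadpoolctl install is an assumption made only so the snippet runs anywhere.

```python
import contextlib

import numpy as np

try:
    from threadpoolctl import threadpool_limits
except ImportError:
    # Assumption for this sketch: without threadpoolctl, no limit is applied.
    threadpool_limits = None


def eigh_with_thread_limit(matrix, n_threads=1):
    """Eigendecompose a symmetric matrix with native thread pools capped.

    Hypothetical helper for illustration, not part of pyPESTO.
    """
    if threadpool_limits is None:
        limiter = contextlib.nullcontext()
    else:
        # The limit applies only inside the `with` block and is restored on exit.
        limiter = threadpool_limits(limits=n_threads)
    with limiter:
        return np.linalg.eigh(matrix)


rng = np.random.default_rng(0)
a = rng.standard_normal((65, 65))
symmetric = a + a.T  # eigh expects a symmetric (Hermitian) matrix
eigenvalues, eigenvectors = eigh_with_thread_limit(symmetric)
```

An engine-level option could wrap each optimization start in the same context manager, though whether that belongs in pyPESTO is exactly the open question here.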
I would keep that out of pypesto. Just a comment to the respective optimizers. Those settings might affect a number of libraries, and the control is best left to the user.
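For reference, one way a user could control this on their side is through the environment variables that common native thread pools read when they are loaded. This is only a sketch: the helper `limit_native_threads` is hypothetical, and which variables matter depends on the BLAS/LAPACK build NumPy is linked against.

```python
import os

# Variables read by OpenMP, OpenBLAS, and MKL thread pools at load time.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_native_threads(n_threads=1):
    """Set thread-count variables for native libraries (hypothetical helper)."""
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n_threads)


# Must run before NumPy (or any other library that loads BLAS) is imported;
# thread pools that are already initialized will not pick up the change.
limit_native_threads(1)
```

The same variables can be exported in the shell or in a job script, which leaves the process itself untouched.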
> a comment to the respective optimizers

If it's made very clear to the user (e.g. a warning) then I agree; otherwise it could be an easy thing to overlook, with perhaps a big performance penalty. Limiting thread use to 1 halves the wall time for `np.linalg.eig` on a 65×65 matrix in my test script, for example, and this is without parallelized multi-starts. `cma` seems popular, and with `np.linalg.eigh` on a 33×33 matrix I just saw wall time reduced by a factor of 6 when limited to 1 thread. I'm not sure why limiting to 1 thread is faster... I guess limiting to 1 thread only becomes slower for very large matrices.
**Bug description**

Despite using `SingleCoreEngine`, all CPUs were at 100% utilization. After profiling, it looks like this is due to use of `np.linalg.eig` or `np.linalg.eigh`. For example, the default `ScipyOptimizer` does not have this issue. `FidesOptimizer` and `CmaesOptimizer` do have this issue.

Profiling was done with another script using pyPESTO optimization. Here's a small demonstration of the issue with `np.linalg.eig/h` directly.

[Figures: six panels titled `np.linalg.eig` and six panels titled `np.linalg.eigh`]

`np.linalg.eig` seems to switch to using all CPUs when the number of parameters is >64. `fides` uses `np.linalg.eig`. `np.linalg.eigh` seems to gradually increase the number of CPUs used as the problem size grows. `cma` uses `np.linalg.eigh`.

Overall, just something to keep in mind when expecting single-core behavior; this could affect benchmarking, for example. This also affects the efficiency when parallelizing optimization, since with large problems, potentially all starts will try to use all CPUs simultaneously when computing eigenvalues/vectors.
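The behavior can be checked without a profiler by comparing CPU time to wall time. The sketch below is not the script used for this report; the helper `cpu_to_wall_ratio` and the chosen matrix sizes are assumptions for illustration. `time.process_time` sums CPU time over all threads of the process, so a ratio well above 1 suggests that several cores were busy.

```python
import time

import numpy as np


def cpu_to_wall_ratio(func, matrix, repeats=20):
    """Return CPU time divided by wall time for repeated calls of func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for _ in range(repeats):
        func(matrix)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall


rng = np.random.default_rng(0)
ratios = {}
for n in (32, 64, 65, 128):
    a = rng.standard_normal((n, n))
    ratios[n] = {
        "eig": cpu_to_wall_ratio(np.linalg.eig, a),
        # eigh expects a symmetric matrix, so symmetrize the input.
        "eigh": cpu_to_wall_ratio(np.linalg.eigh, a + a.T),
    }
    print(n, ratios[n])
```

The numbers depend on the machine and on the BLAS/LAPACK build, so they are only a rough indicator of how many threads were active.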
**Expected behavior**

Approximately one CPU should be utilized at 100% in all cases when using `SingleCoreEngine`.

**Environment**

`pypesto` version: current develop, with NumPy 1.24.3