Open tleonardi opened 6 years ago
I need to look into this issue more carefully. I'm not sure why or when it happens, but it has already happened more than once. @a-slide do you have any idea?
Ok, it looks like I figured it out. On my system numpy is built against OpenBLAS, which is multithreaded by default. As a result, the np.array() call in __process_references() spawns multiple threads, and every worker process does the same. Since we are already using multiprocessing, the solution seems to be to disable multithreading for OpenBLAS and MKL before importing numpy:
import os
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["MKL_THREADING_LAYER"] = "sequential"
os.environ["NUMEXPR_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"
I'm currently testing whether it works as it should. I will commit as soon as I'm sure all is fine.
I had not noticed, but my numpy is actually also built against OpenBLAS. I had a quick look as well, and the method you describe should work, but you might want to include this variable too:
os.environ['OPENBLAS_NUM_THREADS'] = '1'
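For reference, the variables mentioned in the two comments above can be set together in one place. The following is only a sketch: `limit_blas_threads` and `THREAD_ENV_VARS` are hypothetical names, not part of nanocompore, and it assumes the call happens before numpy is imported for the first time, since the threading libraries typically read these variables when they are loaded. Worker processes started with multiprocessing inherit the parent's environment, so setting the variables once in the main process should be enough.

```python
import os

# Environment variables named in this thread. Each one is read by a
# different threading backend (OpenMP, OpenBLAS, MKL, numexpr).
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "NUMEXPR_NUM_THREADS",
)


def limit_blas_threads(n_threads=1):
    """Hypothetical helper: cap the thread pools used by numpy's backends.

    Must run before the first `import numpy`, otherwise the already
    loaded libraries may ignore the new values.
    """
    for var in THREAD_ENV_VARS:
        os.environ[var] = str(n_threads)
    if n_threads == 1:
        # Same value as in the comment above: ask MKL not to thread at all.
        os.environ["MKL_THREADING_LAYER"] = "sequential"


limit_blas_threads(1)
# import numpy as np  # import numpy only after the variables are set
```

Keeping the list in one tuple makes it easy to add another backend later without touching the rest of the code.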
Apparently this is not completely fixed. Numpy is still causing issues in a cluster environment. An option to explore might be to use this package to set the number of threads: https://github.com/joblib/threadpoolctl
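A minimal sketch of what the threadpoolctl approach could look like, assuming the package is installed. `run_single_threaded` is a hypothetical wrapper, not existing nanocompore code. As far as I understand, `threadpool_limits` only acts on libraries already loaded in the current process, so the wrapper would have to be used inside each worker process rather than once in the parent. If threadpoolctl is missing, the sketch falls back to doing nothing.

```python
from contextlib import nullcontext

try:
    from threadpoolctl import threadpool_limits
except ImportError:
    # threadpoolctl is not installed: the wrapper below becomes a no-op.
    threadpool_limits = None


def run_single_threaded(func, *args, **kwargs):
    """Hypothetical wrapper: call func with BLAS limited to one thread.

    Intended to wrap the numpy-heavy part of a worker process.
    """
    if threadpool_limits is None:
        limiter = nullcontext()
    else:
        # Limit every BLAS implementation loaded in this process.
        limiter = threadpool_limits(limits=1, user_api="blas")
    with limiter:
        return func(*args, **kwargs)
```

Unlike the environment variables, this can be applied after numpy has been imported, which may matter on a cluster where the import order is harder to control.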
This should be fixed by #94 but I haven't tested it yet. Did you?
I think it's fixed
But it's not, reopening.
It looks like nanocompore sometimes spawns more threads than it should. Starting it with nthreads=4 on the 7SK IVT data starts 16 threads.