merenlab / illumina-utils

A library and collection of scripts to work with Illumina paired-end data (for CASAVA 1.7+ pipeline).
GNU General Public License v2.0

default number of cores #31

Open EricDeveaud opened 1 year ago

EricDeveaud commented 1 year ago

Hello

rapidmerge.py uses multiprocessing.cpu_count() to get the number of available CPUs, which returns the number of CPUs in the machine. But this is not the same as the number of CPUs available to the process. For example, the process may be running under taskset or under a batch scheduler such as Slurm.

see:

$ nproc
96
$ taskset -c 1 nproc
1
$ taskset -c 1 python3 -c "import multiprocessing; print(multiprocessing.cpu_count())"
96

I would suggest using len(os.sched_getaffinity(0)) instead of multiprocessing.cpu_count():

$ python3 -c "import os; print(len(os.sched_getaffinity(0)))"
96
$ taskset -c 1 python3 -c "import os; print(len(os.sched_getaffinity(0)))"
1
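
One caveat is that os.sched_getaffinity is only available on some Unix platforms (it is missing on macOS and Windows, for instance), so a portable default probably needs a fallback. Below is a minimal sketch of that idea. The helper name available_cpu_count is hypothetical and is not taken from rapidmerge.py or from the pull request.

```python
import multiprocessing
import os


def available_cpu_count():
    """Return the number of CPUs this process is allowed to run on.

    Hypothetical helper for illustration; not part of illumina-utils.
    """
    try:
        # Honors CPU affinity restrictions, e.g. those set by taskset
        # or by a scheduler that pins the job to specific cores.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # os.sched_getaffinity does not exist on this platform,
        # so fall back to the total CPU count of the machine.
        return multiprocessing.cpu_count()


if __name__ == "__main__":
    print(available_cpu_count())
```

Run under taskset -c 1 on Linux, this should report the restricted count rather than the machine total, as in the sched_getaffinity output above.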

regards

Eric

meren commented 1 year ago

Dear @EricDeveaud, thank you very much for pointing this out.

Would you be interested in providing a PR so this contribution includes your name in the archives?

Best wishes,

EricDeveaud commented 1 year ago

done: https://github.com/merenlab/illumina-utils/pull/32