flatironinstitute / disBatch

Dynamic dispatch of a list of command-line tasks, locally or on a cluster. Supports retrying failed tasks, and adding/removing compute resources on-the-fly.
Apache License 2.0

Make `SLURM_CPU_BIND=cores` the default #29

Closed lgarrison closed 1 year ago

lgarrison commented 1 year ago

The main change here is to turn on CPU binding by default. For multi-threaded tasks, this will make most thread pools (like OpenMP) spawn a number of threads that matches the cores bound to the task, rather than the total number of cores on the node.
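
A quick way to check the binding from inside a task is to compare the cores the process may run on with the cores installed on the node. This is only a sketch, assuming a Linux node with GNU coreutils; the variable names `bound` and `total` are illustrative and not part of disBatch.

```shell
# Cores this process is allowed to run on. nproc honors the CPU
# affinity mask (and OMP_NUM_THREADS, if that variable is set).
bound=$(nproc)

# All cores installed on the node, ignoring affinity.
total=$(nproc --all)

echo "bound=$bound total=$total"
```

Run as a disBatch task with binding enabled, `bound` should reflect the task's share of the node rather than the whole node.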

Note that disBatch effectively treats the Slurm value of -c (as given to sbatch or salloc) as a minimum number of cores per task. So the following will result in 64 cores per task on a 128-core node, since the two tasks share the whole exclusively allocated node:

salloc -n2 -c2 --exclusive disBatch tasks
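
The 64-core figure can be reproduced with a little arithmetic. The model here is an assumption made for illustration, not taken from the disBatch source: the node's cores are split evenly across the tasks, and the -c value only acts as a floor. All variable names are hypothetical.

```shell
node_cores=128   # cores on the exclusively allocated node (from the example)
ntasks=2         # salloc -n2
min_cores=2      # salloc -c2, treated as a minimum (assumed model)

# Even split of the node across tasks.
even_share=$(( node_cores / ntasks ))

# Take the larger of the even share and the -c floor.
if [ "$even_share" -gt "$min_cores" ]; then
  cores_per_task=$even_share
else
  cores_per_task=$min_cores
fi

echo "cores per task: $cores_per_task"
```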

One can also pass -c2 to disBatch itself to get 2 cores per task:

salloc -n2 -c2 --exclusive disBatch -c2 tasks

This is probably all fine and working as intended; users typically don't have an upper limit on the number of threads per task.