I have a recurring problem with btlapply, whatever code I run inside it. I'm not sure if it's a bug in batchtools, a problem with the way I use it, or an error in my Slurm configuration.
I have 16 nodes, each with 40 CPUs. Whatever I submit using btlapply, it never uses more than 1 or 2 nodes, and even the second node is only half used.
sinfo reports all the other nodes in idle state.
srun -N 16 hostname will correctly print the hostname of each node, meaning I can send jobs to all the nodes.
Any ideas or suggestions on how to fix this problem, or where to look?