lanl / Pavilion

HPC testing harness
BSD 3-Clause "New" or "Revised" License

slurm number of pes not jiving #33

Closed: cadejager closed this issue 7 years ago

cadejager commented 7 years ago

David has reported the following:

During my testing today on trinitite, I found a discrepancy between the number of PEs Pavilion thinks a job should have and the number Slurm actually assigns to it. Every time, Slurm treats all of the PEs as part of the job, no matter what procs_per_node is set to in the slurm stanza.

As an example, if I set procs_per_node to 1 in my yaml file, PV_NPESPERNODE is set correctly in the allocation, but SLURM_TASKS_PER_NODE is always set to the maximum for that machine. I don't think this is a problem with Slurm itself, because if I pass '--ntasks-per-node 1' on the command line, SLURM_TASKS_PER_NODE is set to 1 in the allocation.
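One way to confirm the mismatch from inside an allocation is to compare the two environment variables directly. The following is only a rough sketch: it assumes PV_NPESPERNODE holds a plain integer, and the helper names `expand_tasks_per_node` and `find_mismatch` are made up for illustration, not part of Pavilion. Slurm may write SLURM_TASKS_PER_NODE in a compressed form such as `2(x3),1`, so the value is expanded to one count per node before comparing.

```python
import os
import re


def expand_tasks_per_node(value):
    """Expand Slurm's compressed form, e.g. '2(x3),1' -> [2, 2, 2, 1]."""
    counts = []
    for part in value.split(","):
        match = re.fullmatch(r"(\d+)(?:\(x(\d+)\))?", part.strip())
        if match is None:
            raise ValueError(f"unrecognized entry: {part!r}")
        tasks = int(match.group(1))
        repeat = int(match.group(2) or 1)
        counts.extend([tasks] * repeat)
    return counts


def find_mismatch(env):
    """Return (expected, per_node_counts) if any node differs, else None.

    Assumes PV_NPESPERNODE is a plain integer string (an assumption,
    not confirmed by the report above).
    """
    expected = int(env["PV_NPESPERNODE"])
    actual = expand_tasks_per_node(env["SLURM_TASKS_PER_NODE"])
    if any(count != expected for count in actual):
        return expected, actual
    return None


if __name__ == "__main__":
    needed = ("PV_NPESPERNODE", "SLURM_TASKS_PER_NODE")
    if all(name in os.environ for name in needed):
        mismatch = find_mismatch(os.environ)
        if mismatch is None:
            print("Pavilion and Slurm agree on tasks per node")
        else:
            print("mismatch: Pavilion expects %d, Slurm reports %s" % mismatch)
    else:
        print("not inside a Pavilion/Slurm allocation; nothing to check")
```

Running this inside the job script for a test with procs_per_node set to 1 should report a mismatch if the behaviour described above is present, and agreement when '--ntasks-per-node 1' is passed by hand.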