radical-cybertools / radical.saga

A Light-Weight Access Layer for Distributed Computing Infrastructure and Reference Implementation of the SAGA Python Language Bindings.
http://radical-cybertools.github.io/saga-python/

Auto ppn setup might get the wrong value in the SLURM adaptor for versions later than 17.11.5 #692

Open iparask opened 5 years ago

iparask commented 5 years ago

On Bridges, for example, we have SLURM 17.11.7, and the machine offers at least two different cores-per-node counts.

[paraskev@br006 ~]$ scontrol --version
slurm 17.11.7

That is, 28 for the RM queue and 32 for the GPU queue when the nodes with P100 GPUs are used.

Lines 389-391 will select 28 cores per node, because that is the output of line 388:

[paraskev@br006 ~]$ scontrol show nodes | grep CPUTot| sed -e 's/.*\(CPUTot=[0-9]*\).*/\1/g'| sort | uniq -c | cut -f 2 -d = | xargs echo
28 288 32 352 40 64 80 96
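For context, here is a hypothetical reconstruction of that selection in Python (the adaptor's actual logic at lines 389-391 may differ): picking the most frequent CPUTot value across all nodes yields 28 on Bridges no matter which partition the job targets. The node counts below are illustrative only.

```python
from collections import Counter

# Hypothetical per-node CPUTot values, roughly mimicking Bridges:
# most nodes have 28 cores, the P100 GPU nodes have 32.
# (Counts are illustrative, not the real Bridges node inventory.)
cputot_per_node = [28] * 752 + [32] * 44 + [40] * 2

def guess_ppn(values):
    """Pick the most common cores-per-node count as a cluster-wide guess."""
    return Counter(values).most_common(1)[0][0]

print(guess_ppn(cputot_per_node))  # 28 -- wrong for a job on the 32-core GPU nodes
```

Any such cluster-wide guess is necessarily wrong for at least one partition on a heterogeneous machine, which is the core of this issue.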

We need to come up with a way to select the correct value. In addition, Bridges does not require the -N flag in the SLURM script, while Stampede2 with SLURM 18.08.3 does.
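The version-dependent -N behavior could be gated by a simple version comparison. This is only a sketch: the threshold assumed here (18.08, based on the Stampede2 example) and the helper name are hypothetical and would need confirming against the actual SLURM releases.

```python
def needs_nnodes_flag(slurm_version, threshold=(18, 8)):
    """Hypothetical check: emit `-N` in the batch script only for
    SLURM >= 18.08 (e.g. Stampede2's 18.08.3, but not Bridges' 17.11.7).
    The 18.08 threshold is an assumption, not a confirmed boundary."""
    major, minor = (int(x) for x in slurm_version.split('.')[:2])
    return (major, minor) >= threshold

print(needs_nnodes_flag('18.08.3'))  # True
print(needs_nnodes_flag('17.11.7'))  # False
```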

iparask commented 5 years ago

Hey @andre-merzky, do you remember how we said we were going to tackle this? I remember that I was going to pick it up, but it slipped off my todo list.

iparask commented 5 years ago

It took me a second, but I remember it!

The solution was to execute scontrol show partitions | grep -E 'PartitionName|TotalCPUs|TotalNodes' instead of the command executed now.

On Bridges this returns:

PartitionName=RM
   State=UP TotalCPUs=20160 TotalNodes=720 SelectTypeParameters=NONE
PartitionName=RM-shared
   State=UP TotalCPUs=1932 TotalNodes=69 SelectTypeParameters=NONE
PartitionName=RM-small
   State=UP TotalCPUs=140 TotalNodes=5 SelectTypeParameters=NONE
PartitionName=GPU
   State=UP TotalCPUs=1344 TotalNodes=44 SelectTypeParameters=NONE
PartitionName=GPU-shared
   State=UP TotalCPUs=700 TotalNodes=23 SelectTypeParameters=NONE
PartitionName=GPU-small
   State=UP TotalCPUs=128 TotalNodes=4 SelectTypeParameters=NONE
PartitionName=GPU-AI
   State=UP TotalCPUs=456 TotalNodes=10 SelectTypeParameters=NONE
PartitionName=LM
   State=UP TotalCPUs=4512 TotalNodes=46 SelectTypeParameters=NONE
PartitionName=XLM
   State=UP TotalCPUs=1280 TotalNodes=4 SelectTypeParameters=NONE
PartitionName=DBMI
   State=UP TotalCPUs=256 TotalNodes=8 SelectTypeParameters=NONE
PartitionName=DBMI-GPU
   State=UP TotalCPUs=64 TotalNodes=2 SelectTypeParameters=NONE

If we take every partition and calculate the ppn by dividing TotalCPUs by TotalNodes, I get the following dictionary:

{'DBMI': 32,
 'DBMI-GPU': 32,
 'GPU': 31,
 'GPU-AI': 46,
 'GPU-shared': 31,
 'GPU-small': 32,
 'LM': 99,
 'RM': 28,
 'RM-shared': 28,
 'RM-small': 28,
 'XLM': 320}
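The computation above can be sketched as a small parser over the scontrol output (a minimal sketch, not the adaptor's actual code; the dictionary above suggests a ceiling division, since e.g. LM gives 4512/46 ≈ 98.09 → 99):

```python
import re

# Abbreviated output of
#   scontrol show partitions | grep -E 'PartitionName|TotalCPUs|TotalNodes'
# with values copied from the Bridges listing above.
output = """\
PartitionName=RM
   State=UP TotalCPUs=20160 TotalNodes=720 SelectTypeParameters=NONE
PartitionName=GPU
   State=UP TotalCPUs=1344 TotalNodes=44 SelectTypeParameters=NONE
PartitionName=LM
   State=UP TotalCPUs=4512 TotalNodes=46 SelectTypeParameters=NONE
PartitionName=XLM
   State=UP TotalCPUs=1280 TotalNodes=4 SelectTypeParameters=NONE
"""

def ppn_per_partition(text):
    """Map partition name -> ceil(TotalCPUs / TotalNodes)."""
    ppn, partition = {}, None
    for line in text.splitlines():
        m = re.search(r'PartitionName=(\S+)', line)
        if m:
            partition = m.group(1)
            continue
        m = re.search(r'TotalCPUs=(\d+).*TotalNodes=(\d+)', line)
        if m and partition:
            cpus, nodes = int(m.group(1)), int(m.group(2))
            ppn[partition] = -(-cpus // nodes)  # integer ceiling division
    return ppn

print(ppn_per_partition(output))
# {'RM': 28, 'GPU': 31, 'LM': 99, 'XLM': 320}
```

As the GPU entry shows, the per-partition average is still only an approximation when a single partition mixes node types.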

This is wrong, because the GPU nodes have either 28 or 32 cores depending on the type of GPU we are selecting. I propose to keep the ppn checking as it was and restrict the version checking for Stampede2 to what it was already. Is that okay with you?