E3SM-Project / polaris

Testing and analysis for OMEGA, MPAS-Ocean, MALI and MPAS-Seaice
BSD 3-Clause "New" or "Revised" License

Fix slurm handling of allocatable cores #181

xylar closed 8 months ago

xylar commented 8 months ago

Some systems, such as Frontier, have cores that aren't allocatable. These need to be excluded from the core count that Polaris determines from Slurm.
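
As a rough illustration of the idea (not the code in this PR), the sketch below subtracts reserved cores from the per-node total. It assumes the node description comes from `scontrol show node` and that reserved cores are reported in a `CoreSpecCount` field. The function name `allocatable_cores` and the sample text are hypothetical, and the sample values are made up rather than captured from a real machine.

```python
import re


def allocatable_cores(node_info: str) -> int:
    """Return physical cores per node minus cores reserved by Slurm.

    ``node_info`` is assumed to be text in the ``key=value`` style printed
    by ``scontrol show node``. Missing optional fields fall back to
    "no hyperthreading" and "no reserved cores".
    """
    # Collect every key=value token into a dictionary
    fields = dict(re.findall(r'(\w+)=(\S+)', node_info))

    # CPUTot counts hardware threads, so convert to physical cores
    threads_per_core = int(fields.get('ThreadsPerCore', 1))
    total_cores = int(fields['CPUTot']) // threads_per_core

    # Cores set aside for system use cannot be given to a job
    reserved_cores = int(fields.get('CoreSpecCount', 0))
    return total_cores - reserved_cores


# Hypothetical sample text for illustration only
sample = """NodeName=node0001 CoresPerSocket=64
   CPUAlloc=0 CPUTot=128 CPULoad=0.00
   CoreSpecCount=8 ThreadsPerCore=2"""

print(allocatable_cores(sample))
```

A node that reports no `CoreSpecCount` is treated as fully allocatable, so systems without reserved cores would keep their current core count under this assumption.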

xylar commented 8 months ago

Testing

With this fix, Polaris tasks that use more than one Frontier node run successfully. Without it, they fail because they try to run on 64 cores per node (the total, rather than the allocatable number).

xylar commented 8 months ago

This needs to be tested on Chrysalis and Perlmutter to make sure it doesn't break anything there.

xylar commented 8 months ago

This approach didn't work on Compy.