Closed altheaden closed 1 year ago
I have tested this just now with the most recent version of main. I ran the pr suite against a baseline and everything passed as expected.
@xylar the dev guide documentation changes didn't make it through the rebase (the filename was different and git got confused), so please double check that those changes are how you want them to be.
OK, I just built the documentation locally and I can see some formatting errors. I'll fix those now.
Formatting is fixed now. There were just some issues converting from rst to markdown that I didn't notice before.
This is a port from changes made in Compass: https://github.com/MPAS-Dev/compass/pull/573
Previously, nothing was preventing `cpus_per_task` from being more than the number of cores on a node. This is a problem because unsafe or non-performant behavior will occur if more Python threads are used than there are cores on a node.

This merge changes the way resources are constrained. First, `cpus_per_task` is constrained to be no more than the lower of the number of cpus on a node or the total number of cores available. Then, we check to see if `cpus_per_task` is smaller than the minimum allowed for the given step. Next, we allow all tasks to have the same `cpus_per_task` and constrain the number of tasks not to exceed the total available resources. Finally, we check to make sure the number of tasks is not below the allowed minimum for the step.

There are 4 typical resource requirements for steps:

- serial (1 task and 1 cpu per task)
- threading or multiprocessing on a node (1 task, multiple cpus per task)
- MPI jobs without threading (multiple tasks, 1 cpu per task)
- MPI jobs with threading (multiple tasks, a few cpus per task)

This algorithm should handle all of these without difficulty.
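
For reviewers, here is a minimal sketch of the four constraint steps described above. It is not the code in this PR: the `StepResources` class, the `constrain_resources` function, and the `cores_per_node` and `available_cores` arguments are hypothetical names chosen for illustration, and raising `ValueError` when a minimum cannot be met is an assumption about how failures might be reported.

```python
from dataclasses import dataclass


@dataclass
class StepResources:
    """Hypothetical container for the resources a step requests."""
    ntasks: int                 # target number of tasks
    min_tasks: int              # fewest tasks the step can run with
    cpus_per_task: int          # target cpus (threads) per task
    min_cpus_per_task: int = 1  # fewest cpus per task the step can run with


def constrain_resources(step, cores_per_node, available_cores):
    """Return (ntasks, cpus_per_task) limited to the available resources."""
    # 1. cpus_per_task may not exceed the cores on a node or the total
    #    number of cores available
    cpus_per_task = min(step.cpus_per_task, cores_per_node, available_cores)

    # 2. fail if that is below the minimum the step allows
    if cpus_per_task < step.min_cpus_per_task:
        raise ValueError(
            f'only {cpus_per_task} cpus per task available but the step '
            f'needs at least {step.min_cpus_per_task}')

    # 3. every task gets the same cpus_per_task, so the number of tasks is
    #    limited by how many such tasks fit in the available cores
    ntasks = min(step.ntasks, available_cores // cpus_per_task)

    # 4. fail if the number of tasks is below the minimum the step allows
    if ntasks < step.min_tasks:
        raise ValueError(
            f'only {ntasks} tasks available but the step needs at least '
            f'{step.min_tasks}')

    return ntasks, cpus_per_task


if __name__ == '__main__':
    # MPI job with threading on 2 nodes of 4 cores each (made-up numbers)
    hybrid = StepResources(ntasks=8, min_tasks=2, cpus_per_task=4,
                           min_cpus_per_task=2)
    print(constrain_resources(hybrid, cores_per_node=4, available_cores=8))
```

With these made-up numbers, the hybrid step keeps 4 cpus per task and is reduced from 8 tasks to 2, since only 2 tasks of 4 cpus fit in 8 cores. The serial, threaded, and MPI-only cases go through the same four steps with `ntasks` or `cpus_per_task` equal to 1.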
Checklist
Testing
comment in the PR documents testing used to verify the changes