Closed anton-seaice closed 4 months ago
This looks pretty good. A couple thoughts.
The modifications I'm working on handle the nprocs thing in much the same way, but also refactor the max_blocks calculation so each proc (mpi task) has a locally defined max_blocks that matches the decomposition exactly. I'm still developing and testing to make sure it'll work properly.
I'm happy to merge this as a first step (if the CESMCOUPLED thing is fixed). I could then update the implementation if I make additional progress. Thoughts?
Let's go with this. I think I have made the related changes to do this.
I guess we should add some tests?
I will run a full test suite once comments are addressed and the PR is no longer draft. Thanks!
My question here is whether we should add some test cases for nprocs=-1. It may be a bit of work though, because most of the scripts are written around setting this explicitly.
Please run a test manually with nprocs=-1 to make sure it works. I think the question is maybe whether we should have nprocs=-1 be the default setup for all tests. You could try that and see if everything runs and is bit-for-bit. I think you'd need to set nprocs=-1 in configuration/scripts/ice_in and then remove "nprocs = ${task}" in cice.setup. Is that the direction we want to go?
Our intermittent failure is back!
Failed: https://github.com/CICE-Consortium/CICE/actions/runs/9024916341/job/24799676376
Passed: https://github.com/ACCESS-NRI/CICE/actions/runs/9024916168/job/24799675998
But they are the same commit
I restarted the failed GitHub action and it passed. Ugh....
PR checklist
[ ] Short (1 sentence) summary of your PR: Contributes to #945
This improves setting nprocs and max_blocks automatically.
[ ] Developer(s): @anton-seaice @minghangli-uni
[ ] Suggest PR reviewers from list in the column to the right. @apcraig
[ ] Please copy the PR test results link or provide a summary of testing completed below. Needs doing
How much do the PR code changes differ from the unmodified code?
Does this PR create or have dependencies on Icepack or any other models?
Does this PR update the Icepack submodule? If so, the Icepack submodule must point to a hash on Icepack's main branch.
Does this PR add any new test cases?
Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)
[x] Please document the changes in detail, including why the changes are made. This will become part of the PR commit log.
This change allows nprocs to be set to -1 in 'ice_in', and then the number of processors will be automatically detected.
This change improves the automatic calculation of max_blocks to give a better (but still not foolproof) estimate of max_blocks if it is not set in ice_in.
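To make the two behaviours described above concrete, here is a minimal Python sketch. It is not the Fortran code from this PR: the function names `resolve_nprocs` and `estimate_max_blocks` are hypothetical, the mismatch check is an assumption about how an explicit value would be validated, and the ceiling-division estimate is just one plausible way to compute an upper bound on blocks per task.

```python
def resolve_nprocs(nprocs: int, detected_tasks: int) -> int:
    """Return the task count to use.

    Assumption: -1 means "use the number of tasks detected at run time";
    any other value must agree with the detected count.
    """
    if nprocs == -1:
        return detected_tasks
    if nprocs != detected_tasks:
        # Assumed behaviour for illustration: treat a mismatch as an error.
        raise ValueError(
            f"nprocs={nprocs} does not match detected tasks={detected_tasks}"
        )
    return nprocs


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for positive b."""
    return -(-a // b)


def estimate_max_blocks(nx_global: int, ny_global: int,
                        block_size_x: int, block_size_y: int,
                        nprocs: int) -> int:
    """Estimate blocks per task as ceil(total_blocks / nprocs).

    This is an average-based estimate, so it is only exact when the
    blocks are spread evenly across the tasks.
    """
    nblocks_x = ceil_div(nx_global, block_size_x)
    nblocks_y = ceil_div(ny_global, block_size_y)
    return max(1, ceil_div(nblocks_x * nblocks_y, nprocs))


if __name__ == "__main__":
    tasks = resolve_nprocs(-1, detected_tasks=4)
    print(tasks, estimate_max_blocks(100, 116, 25, 29, tasks))
```

An estimate of this kind is "not foolproof" in the sense that a decomposition which spreads blocks unevenly can leave one task holding more blocks than the average, which is why setting max_blocks explicitly in ice_in remains an option.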