Closed liujake closed 9 months ago
The workflow should run out of the box, without local changes, for the standard settings, and it should make proper (ideally optimal) use of Derecho nodes. This PR is one of several that will gradually fix the scenarios we work with most frequently.
During the tests, I also found another issue. See https://github.com/NCAR/MPAS-Workflow/issues/280.
Description
I ran cycling tests for the standard 120km 3DEnVar (scenarios/3denvar_OIE120km_WarmStart_VarBC.yaml), which failed at the forecast step after 3 cycles. The forecast job exceeded its wall-time limit and was killed.
Looking more closely at the job settings, I found that the forecast was still set to use 36 cores/node by default (though I do not believe this is the cause of the failure). I changed 36 cores/node to 128 cores/node, so the 120km forecast step now uses 1 node with 128 cores/node. After this change, the restarted cycling ran through the 2-day cycles from 2018041418 to 2018041700.
I also modified some files to change the memory setting from the old 45GB for Cheyenne to the new 235GB for Derecho. I believe we can remove those memory settings entirely, since all Derecho nodes have the same 235GB, unlike Cheyenne, which had both 45GB and 109GB nodes. I will leave that for future PRs. I did not change the memory settings in the ensemble-related files; that can also be done in a future PR. This memory change should fix a test failure of ~test/testinput/3denvar_O30kmIE60km_WarmStart.yaml.
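To illustrate the effect of the cores/node and memory changes on the job request, here is a minimal Python sketch. It is not code from MPAS-Workflow: the helper name `pbs_select` and the count of 128 MPI tasks for the 120km forecast are assumptions made for illustration, and the actual workflow may build its resource request differently. Only the 36 and 128 cores/node values and the 45GB and 235GB memory values come from this PR.

```python
import math


def pbs_select(mpi_tasks: int, cores_per_node: int, mem_gb: int) -> str:
    """Build a PBS 'select' string for a pure-MPI job (hypothetical helper)."""
    # Number of nodes needed to hold all MPI tasks.
    nodes = math.ceil(mpi_tasks / cores_per_node)
    # Spread the tasks evenly across the nodes, rounding up.
    tasks_per_node = math.ceil(mpi_tasks / nodes)
    return (
        f"select={nodes}:ncpus={cores_per_node}"
        f":mpiprocs={tasks_per_node}:mem={mem_gb}GB"
    )


# Assumed task count of 128; old Cheyenne-style defaults vs. new Derecho settings.
print(pbs_select(128, 36, 45))    # old defaults: 36 cores/node, 45GB
print(pbs_select(128, 128, 235))  # new settings: 128 cores/node, 235GB
```

Under these assumptions, the old defaults request 4 partially filled nodes, while the new settings fit the same job on a single node.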