Open tangq opened 2 years ago
I see this is on chrysalis. Do we know if there is a testname that can reproduce this RRM? I don't see one in cime/config/e3sm/tests.py
@ndkeen, good question. I am not sure whether the NA RRM configurations used in the E3SMv2 production runs are tested routinely.
northamericax4v1pg2_WC14to60E2r3.WCYCL1850.*.allactive-wcprodrrm is in the prod test suite.
Is the AMIP NA RRM tested by northamericax4v1pg2_WC14to60E2r3.WCYCL1850.*.allactive-wcprodrrm? If not, we will need to add the AMIP test.
I reproduced the pre-defined "L" layout, which was used for the production run, with the following xmlchange commands.
./xmlchange COST_PES=3840
./xmlchange NTASKS_ATM=3840
./xmlchange NTASKS_CPL=3840
./xmlchange NTASKS_OCN=3840
./xmlchange NTASKS_WAV=1
./xmlchange NTASKS_GLC=1
./xmlchange NTASKS_ICE=3840
./xmlchange NTASKS_ROF=3840
./xmlchange NTASKS_LND=3840
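For anyone who wants to reapply the same settings to another case, the commands above could be generated from a single task count. This is only a sketch: the `run` and `layout_cmds` helpers and the `DRY_RUN` switch are hypothetical names introduced here, and the script prints the commands by default rather than executing them. Setting `DRY_RUN=0` assumes the script is run from inside a case directory where `./xmlchange` exists.

```shell
# Sketch: reproduce the "L" layout settings listed above from one task count.
# NTASKS and the component lists mirror the xmlchange commands in this comment.
NTASKS=3840
DRY_RUN=${DRY_RUN:-1}   # hypothetical switch: 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$@"
  else
    "$@"
  fi
}

layout_cmds() {
  run ./xmlchange "COST_PES=$NTASKS"
  # Components that get the full task count.
  for comp in ATM CPL OCN ICE ROF LND; do
    run ./xmlchange "NTASKS_${comp}=$NTASKS"
  done
  # Components that run on a single task.
  for comp in WAV GLC; do
    run ./xmlchange "NTASKS_${comp}=1"
  done
}

layout_cmds
```

The order of the commands differs from the list above, which should not matter since each one sets an independent variable.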
I ran SMS.northamericax4v1pg2_WC14to60E2r3.WCYCL1850.chrysalis_intel.allactive-wcprodrrm,
which completed 5 days and would use the 'M' size layout. It uses 80 nodes total. I do not know if this test is similar enough to what is failing for you.
The test for this configuration should be something like SMS.northamericax4v1pg2_F20TR.*.
We also have conusx4v1_r05_oECv3.F2010 in the integration test suite.
What is the difference between conusx4v1 and northamericax4v1 ?
conusx4v1 is for E3SMv1, whereas northamericax4v1 is for E3SMv2.
We will need a test for northamericax4v1pg2_WC14to60E2r3.F20TR (if it doesn't already exist), which is the configuration used in the v2 AMIP production runs.
Found the problem. The CIME update included splitting the pe layout files to components but the EAM component layouts weren't updated. PR https://github.com/E3SM-Project/E3SM/pull/4928 needs to be added to maint-2.0.
That makes sense, and it highlights the importance of testing production configurations. If northamericax4v1pg2_WC14to60E2r3.F20TR had been in the test suite, we would have caught it when merging PR #4928.
We need to start the v2 NARRM AMIP runs with high-frequency output very soon for the RRM overview paper. When testing on chrysalis, both the pre-defined M and L layouts failed with this error:
Model mpassi missing file graph64 = '/lcrc/group/e3sm/data/inputdata/ice/mpas-cice/WC14to60E2r3/mpas-seaice.graph.info.200714.part.64'
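The error suggests the sea-ice component was looking for a 64-way graph partition file that is not in inputdata. A quick way to confirm which partition files exist before submitting is sketched below. `check_graph_part` is a hypothetical helper written for this comment, not part of CIME or E3SM, and it only assumes the `<prefix>.part.<ntasks>` naming visible in the error message.

```shell
# Hypothetical helper: report whether a graph partition file exists for a
# given task count, assuming files are named <prefix>.part.<ntasks>.
check_graph_part() {
  graph_prefix=$1   # path up to, but not including, ".part.N"
  ntasks=$2         # task count the component would run with
  part_file="${graph_prefix}.part.${ntasks}"
  if [ -f "$part_file" ]; then
    echo "found: $part_file"
  else
    echo "missing: $part_file"
    return 1
  fi
}

# Example call, using the path from the error message above:
# check_graph_part /lcrc/group/e3sm/data/inputdata/ice/mpas-cice/WC14to60E2r3/mpas-seaice.graph.info.200714 64
```

A non-zero return for the task count in the case's PE layout would point at the layout rather than at the inputdata directory.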
The v2 NARRM AMIP runs with standard output were tested successfully with both M and L layouts on chrysalis.
Failed tests: /lcrc/group/e3sm/ac.qtang/E3SMv2/old/v2.NARRM.amip_0101_bonus/tests
Successful tests: /lcrc/group/e3sm/ac.qtang/E3SMv2/v2.NARRM.amip_0101/tests
Diffing env_mach_pes.xml between the two cases shows the differences below: