Open junwang-noaa opened 3 months ago
Denise and I looked into the GEFS EP5 case you provided. Since the case fails to reproduce only when cplwav2atm=.true., we tested it with the ATMW configuration and with code from the latest develop branch for debugging. We found that point (313, 113) on tile 3 has a different bottom temperature after the first integration step.
893: mype= 893 in setup_export= 258.991302490234 i,j= 313 113
893: mype= 893 in setup_export= 259.155151367188 i,j= 313 113
vs
589: mype= 589 in setup_export= 258.991302490234 i,j= 313 113
589: mype= 589 in setup_export= 259.150024414062 i,j= 313 113
Further testing showed that the z0 from the wave model has a bad value. The outcome depends on the decomposition: the test with the 16x16 decomposition has z0 updated in fv3atm, but the test with the 16x24 decomposition does not.
893: in assign_import,n= 17 found= T
893: in assign_import,n= 17 datar8(isc,jsc)= -101947800.000000
vs
589: in assign_import,n= 17 found= T
589: in assign_import,n= 17 datar8(isc,jsc)= 0.000000000000000E+000
589: in assign_import,n= 17 T cplwav2atm= T findex= 17
589: in assign zorlwav= -999.000000000000 ix= 1 nb= 13
589: zorlw= 0.317000000000000 lon= 137.359066603078 lat=
589: 55.6649708379619 tbom= 258.991302490234
The z0 value of "-101947800.000000" from restart.ww3 is not correct. @bingfu-NOAA, may I ask how the test is set up and where the restart.ww3 comes from? Also would you please provide a test from the latest develop branch? The case failed when we tried to run with the latest develop branch. Thanks.
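A check like the one below could catch such a value before it reaches the atmosphere. This is only a minimal sketch: the function name `flag_suspect_z0` is hypothetical, treating -999.0 as a fill value is an assumption based on the zorlwav print above, and the 0 to 10 bounds are placeholders, not a physical limit taken from fv3atm or WW3.

```python
def flag_suspect_z0(z0_values, fill_value=-999.0, z0_min=0.0, z0_max=10.0):
    """Return (index, value) pairs that are neither fill nor within [z0_min, z0_max].

    fill_value, z0_min and z0_max are assumed placeholders for illustration.
    """
    suspects = []
    for idx, z0 in enumerate(z0_values):
        if z0 == fill_value:
            continue  # assumed "no wave data" marker, skip it
        if not (z0_min <= z0 <= z0_max):  # the negated test also flags NaN
            suspects.append((idx, z0))
    return suspects


# Values taken from the log lines above: a plausible zorlw, zero,
# the -999 marker, and the bad import value.
sample = [0.317, 0.0, -999.0, -101947800.0]
print(flag_suspect_z0(sample))  # [(3, -101947800.0)]
```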
@junwang-noaa can you show the location of the file and your rundir?
On hera:
/scratch1/NCEPDEV/stmp2/Jun.Wang/ep5/gefscase.atmw/atmwav.rundir16x24
/scratch1/NCEPDEV/stmp2/Jun.Wang/ep5/gefscase.atmw/atmwav.rundir
I will show you the case on wcoss2 when the switch is done.
@NeilBarton-NOAA FYI.
@junwang-noaa @NeilBarton-NOAA @JessicaMeixner-NOAA Just an update: I can reproduce 16x16 from 16x24 ATM layout using HR3 tag and replay ICs.
That's great! So far we have found that in the EP5 case you gave us, after removing restart.ww3, MOM6 produces different results at fh=1hr. It's not clear what causes that. @bingfu-NOAA would you please share the run directory so that we can continue checking the scalability of EP5? @GeorgeVandenberghe-NOAA FYI.
I saved the rundir on Dogwood here: /lfs/h2/emc/gefstemp/Bing.Fu/ep5rep, but some files inside the rundir should be soft links.
A second issue that came up in testing is that the elementMask values in the mesh file used by the wave model are invalid except for the first 1440 values, which correspond to the first row (j=1). All other values are large negative integers.
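One way to locate the affected rows, sketched in plain Python. The function name `invalid_mask_rows` is hypothetical, and the assumption that valid elementMask entries are 0 or 1 should be checked against the mesh file in question. The mask is passed in as a flat sequence already read from the file.

```python
def invalid_mask_rows(element_mask, row_length=1440, valid_values=(0, 1)):
    """Return the 1-based row (j) indices that contain any invalid mask value.

    row_length=1440 matches the row width reported above; valid_values is an
    assumption about what the mask may legitimately contain.
    """
    bad_rows = []
    for start in range(0, len(element_mask), row_length):
        row = element_mask[start:start + row_length]
        if any(value not in valid_values for value in row):
            bad_rows.append(start // row_length + 1)
    return bad_rows


# Synthetic example with a short row length: the first row is valid,
# the remaining rows hold a large negative integer.
mask = [1, 0, 1, 1] + [-2147483647] * 8
print(invalid_mask_rows(mask, row_length=4))  # [2, 3]
```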
Description
George V. found that the GEFS EP5 test case does not reproduce with different numbers of ATM MPI tasks when he was testing the scalability of GEFS. Further investigation showed that the test reproduces across different ATM task counts for the atm-only and S2S configurations, but not for S2SW when both cplwav and cplwav2atm are set to .true.
To Reproduce:
Run the EP5 test, change the atm layout from (16,16) to (16,24), and compare the atmf or sfcf files between the two runs.
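The comparison can be sketched as an exact, value-by-value check, since reproducibility here means identical output. The function name `first_difference` is hypothetical, and the sketch assumes each field has already been read from the atmf or sfcf file into a flat, row-major sequence with `nx` points per row.

```python
def first_difference(field_a, field_b, nx):
    """Return ((i, j), max_abs_diff) for two flat row-major fields.

    (i, j) is the 1-based location of the first differing point, or None if
    the fields are identical. nx is the number of points per row.
    """
    if len(field_a) != len(field_b):
        raise ValueError("fields have different sizes")
    first = None
    max_diff = 0.0
    for k, (a, b) in enumerate(zip(field_a, field_b)):
        if a != b:  # exact comparison: any difference breaks reproducibility
            if first is None:
                first = (k % nx + 1, k // nx + 1)
            max_diff = max(max_diff, abs(a - b))
    return first, max_diff


# Two-point example using the bottom temperatures printed in the logs above.
run_a = [258.991302490234, 259.155151367188]
run_b = [258.991302490234, 259.150024414062]
location, max_diff = first_difference(run_a, run_b, nx=2)
print(location, max_diff)
```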
Additional context
Output