ufs-community / ufs-weather-model

UFS Weather Model

control_ugwpv1 fails when changing the decomp #742

Closed DeniseWorthen closed 3 years ago

DeniseWorthen commented 3 years ago

Description

Changing the decomp in the control_ugwpv1 test and running against the current baseline fails.

This was found while developing the update to the low-resolution coupled tests (which use ugwpv1, noahmp, and nsst). That update includes both the cpld_decomp and cpld_2threads tests, and the test version of cpld_decomp failed.

To determine the cause, I added a decomp change to the control_noahmp and the control_ugwpv1 test for the develop branch. The control_noahmp test passed with the decomp change, but the control_ugwpv1 test failed:

baseline dir = /scratch1/NCEPDEV/nems/emc.nemspara/RT/NEMSfv3gfs/develop-20210805/INTEL/control_ugwpv1
working dir  = /scratch1/NCEPDEV/stmp2/Denise.Worthen/FV3_RT/rt_15393/control_ugwpv1
Checking test 002 control_ugwpv1 results ....
 Comparing sfcf000.nc ............ALT CHECK......NOT OK
 Comparing sfcf024.nc ............ALT CHECK......NOT OK
 Comparing atmf000.nc ............ALT CHECK......NOT OK
 Comparing atmf024.nc ............ALT CHECK......NOT OK
 Comparing GFSFLX.GrbF00 .........NOT OK
 Comparing GFSFLX.GrbF24 .........NOT OK
 Comparing GFSPRS.GrbF00 .........NOT OK
 Comparing GFSPRS.GrbF24 .........NOT OK

  0: The total amount of wall time                        = 462.757400

Test 002 control_ugwpv1 FAIL

To Reproduce:

Check out the develop branch. Add the following decomp change to the control_ugwpv1 test and run the test against the current baseline.

diff --git a/tests/tests/control_ugwpv1 b/tests/tests/control_ugwpv1
index 773e5ae..764425d 100644
--- a/tests/tests/control_ugwpv1
+++ b/tests/tests/control_ugwpv1
@@ -33,6 +33,9 @@ export IAER=5111
 export WLCLK=30
 export DO_UGWP_V1=.T.

+export INPES=6
+export JNPES=4
+
 export FV3_RUN=control_run.IN
 export CCPP_SUITE=FV3_GFS_v16_ugwpv1
 export INPUT_NML=control_ugwpv1.nml.IN

climbfuji commented 3 years ago

@mdtoyNOAA FYI

junwang-noaa commented 3 years ago

@yangfanglin @mdtoyNOAA May I ask you to take a look at this issue? Currently the decomposition test for P7 is broken. We need to fix this issue in order to maintain a working decomposition test for future features added to P7.

mdtoyNOAA commented 3 years ago

I am looking into this issue. So far, I have noticed that the problem was introduced last year and is present even in earlier versions of the suite, i.e., “drag_suite” and “unified_ugwp”. Along with the new “ugwpv1_gsldrag” scheme, these passed regression tests and were accepted into the CCPP repo. @DomHeinzeller and @grantfirl Are the “decomp tests” new tests that were introduced recently?


junwang-noaa commented 3 years ago

@mdtoyNOAA When a new feature is introduced in UFS, a regression test for the new feature (e.g. control_ugwpv1) is added to show that the model can run with it. The new feature should also be tested for reproducibility across threading, decomposition, MPI tasks and restart, which is required for operational implementation. However, several new features were added without these reproducibility tests. Since the ugwp feature was chosen for inclusion in the P7 test for future operational implementation, we need to make sure the ugwp code changes do not break reproducibility. So far we have found that decomposition reproducibility is broken in coupled P7c. We'd like to fix the issue; otherwise we can't maintain reproducibility for new features added to P7c in the future. Hope this answers your question. Thank you!
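
For readers unfamiliar with the term, the idea behind a decomposition reproducibility test can be sketched in a few lines of Python. This is only an illustration, not the UFS regression test machinery or the FV3 domain decomposition: the 96x96 grid size, the 3x8 comparison layout, and every function name below are assumptions made for the example, and the "block-dependent" kernel is a made-up failure mode, not a diagnosis of the ugwpv1 problem. The sketch splits a grid into INPES x JNPES blocks, applies a kernel to each block independently, reassembles the result, and checks that two layouts give identical answers.

```python
def split_extent(n, parts):
    """Split range(n) into `parts` contiguous (start, end) chunks of near-equal size."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def run_decomposed(field, inpes, jnpes, kernel):
    """Apply `kernel` to each block of an inpes x jnpes layout and reassemble the grid."""
    ny, nx = len(field), len(field[0])
    out = [[None] * nx for _ in range(ny)]
    for j0, j1 in split_extent(ny, jnpes):
        for i0, i1 in split_extent(nx, inpes):
            block = [row[i0:i1] for row in field[j0:j1]]
            result = kernel(block)
            for dj, row in enumerate(result):
                out[j0 + dj][i0:i1] = row
    return out


def pointwise_kernel(block):
    """Each output value depends only on its own grid point: layout-independent."""
    return [[0.5 * v + 1.0 for v in row] for row in block]


def block_dependent_kernel(block):
    """Hypothetical bug: the answer depends on a block-local quantity (the block maximum)."""
    block_max = max(max(row) for row in block)
    return [[v - block_max for v in row] for row in block]


# Assumed grid size for illustration only.
NX = NY = 96
field = [[float(i + NX * j) for i in range(NX)] for j in range(NY)]


def reproduces(kernel, layout_a=(6, 4), layout_b=(3, 8)):
    """True if both layouts produce identical output for the given kernel."""
    return run_decomposed(field, *layout_a, kernel) == run_decomposed(field, *layout_b, kernel)


print("pointwise kernel reproduces:", reproduces(pointwise_kernel))
print("block-dependent kernel reproduces:", reproduces(block_dependent_kernel))
```

The second kernel fails the check because its output changes whenever the block boundaries move, which is the kind of behavior a decomposition test is meant to catch.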