Open samsrabin opened 1 month ago
Note that the cs.status output will make more sense (i.e., SETUP will be marked as FAIL) once we bring in cime6.1.27 or later; see https://github.com/ESMCI/cime/pull/4681.
The file already exists here (78pft)
/glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/surfdata_esmf/NEON/ctsm5.3.0/surfdata_1x1_NEON_MOAB_hist_2000_78pfts_c240912.nc
and here (16pft)
.../16PFT_mixed/surfdata_1x1_NEON_MOAB_hist_2000_16pfts_c240912.nc
So I guess something just needs to be changed in the XML for the test to pick that up?
@samsrabin this sounds simple, although @olyson and I looked at this for a few minutes this morning and found:
1) The fsurdat setting seems correct in namelist_defaults_ctsm.xml.
2) Other NEON tests work, suggesting that this test does something different that causes it to break...
Additional info. This test works:
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.derecho_gnu.clm-NEON-MOAB
Ah, so it seems like the addition of the PRISM testmod is the issue.
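To make the testmod ordering easier to see, here is a minimal Python sketch that splits a test name into its testmods. It assumes the usual dot-separated layout (test type, grid, compset, machine_compiler, testmods) and that multiple testmods are joined with a double hyphen; `split_testmods` is a hypothetical helper for illustration, not part of CIME.

```python
def split_testmods(test_name):
    """Return the testmods of a test name, in the order they are listed.

    Assumption: the name has five dot-separated fields and the last one
    holds the testmods, joined by "--" when there is more than one.
    """
    parts = test_name.split(".")
    if len(parts) < 5:
        return []  # no testmods field present
    return parts[4].split("--")


single = "SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.derecho_gnu.clm-NEON-MOAB"
double = "SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.derecho_intel.clm-NEON-MOAB--clm-PRISM"

print(split_testmods(single))  # ['clm-NEON-MOAB']
print(split_testmods(double))  # ['clm-NEON-MOAB', 'clm-PRISM']
```

The second name is the two-testmod form discussed in this thread: NEON-MOAB first, PRISM second.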
UPDATE
I reversed the order of the testmods in the test name like this:
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.derecho_intel.clm-PRISM--clm-NEON-MOAB
and the test passed. I will follow up with a test to confirm that I get the same answers as the original test:
Running the two tests from ctsm5.2.028, i.e. the last tag in which the original test passed: diffs in the lnd_in files suggest that we may see diffs in answers. The runs fail on izumi because they report that cesm.exe does not exist, even though it does. If this problem persists, I will repeat these two tests on derecho.
The new test works but gives different answers in ctsm5.2.028 (the last tag in which the default test still worked) because the lnd_in files differ (new test versus default test):
28,29c28,32
< hist_fincl2 = 'AR', 'ELAI', 'FCEV', 'FCTR', 'FGEV', 'FIRA', 'FSA', 'FSH', 'GPP', 'H2OSOI', 'HR', 'SNOW_DEPTH',
< 'TBOT', 'TSOI', 'SOILC_vr', 'FV', 'NET_NMIN_vr'
---
> hist_fincl2 = 'TG', 'TBOT', 'FIRE', 'FIRA', 'FLDS', 'FSDS', 'FSR', 'FSA', 'FGEV', 'FSH', 'FGR',
> 'TSOI', 'ERRSOI', 'SABV', 'SABG', 'FSDSVD', 'FSDSND', 'FSDSVI', 'FSDSNI', 'FSRVD', 'FSRND', 'FSRVI',
> 'FSRNI', 'TSA', 'FCTR', 'FCEV', 'QBOT', 'RH2M', 'H2OSOI', 'H2OSNO', 'SOILLIQ', 'SOILICE', 'TSA_U',
> 'TSA_R', 'TREFMNAV_U', 'TREFMNAV_R', 'TREFMXAV_U', 'TREFMXAV_R', 'TG_U', 'TG_R', 'RH2M_U', 'RH2M_R', 'QRUNOFF_U', 'QRUNOFF_R',
> 'SoilAlpha_U', 'SWup', 'LWup', 'URBAN_AC', 'URBAN_HEAT'
114,115c117,118
< stream_fldfilename_lightng = '/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/NASA_LIS/clmforc.Li_2016_climo1995-2013.360x720.lnfm_Total_NEONarea_c210625.nc'
< stream_meshfile_lightng = '/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/NASA_LIS/ESMF_MESH.Li_2016.360x720.NEONarea_cdf5_c221104.nc'
---
> stream_fldfilename_lightng = '/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/NASA_LIS/clmforc.Li_2016_climo1995-2013.360x720.lnfm_Total_c160825.nc'
> stream_meshfile_lightng = '/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/NASA_LIS/clmforc.Li_2016_climo1995-2013.360x720_ESMFmesh_cdf5_150621.nc'
Next I want to look at code diffs between ctsm5.2.029 and ctsm5.2.028 in case I spot the root cause of the failure.
From the code diffs 029 vs. 028, I see three main areas to focus on:
1) /cime_config/usermods_dirs/NEON/defaults/user_nl_clm <-- I HAVE A FEELING THIS IS IT (SEE diff BELOW) and RELATES TO #2752.
2) CLMBuildNamelist.pm: search NEON
3) namelist_defaults_ctsm.xml: search NEON

--- a/cime_config/usermods_dirs/NEON/defaults/user_nl_clm
+++ b/cime_config/usermods_dirs/NEON/defaults/user_nl_clm
@@ -18,9 +18,6 @@
! Set glc_do_dynglacier with GLC_TWO_WAY_COUPLING env variable
!----------------------------------------------------------------------------------
-flanduse_timeseries = ' ' ! This isn't needed for a non transient case, but will be once we start using transient compsets
-fsurdat = "$DIN_LOC_ROOT/lnd/clm2/surfdata_esmf/NEON/surfdata_1x1_NEON_${NEONSITE}_hist_2000_78pfts_c240206.nc"
-
! h1 output stream
Putting back the code shown in the last post fixes the test failure. But it also reverses an attempt to reduce code clutter. Is there an alternative solution? Is the /testmods order-reversal -- that I showed works -- an acceptable solution?
I think the root issue is that the NEON site defaults only apply if simulating 2018:
<!-- for NEON sites present day simulations - year 2000 -->
<fsurdat hgrid="CLM_USRDAT" neon=".true." sim_year="2018" use_fates=".true.">
lnd/clm2/surfdata_esmf/NEON/ctsm5.3.0/16PFT_mixed/surfdata_1x1_NEON_${NEONSITE}_hist_2000_16pfts_c240912.nc</fsurdat>
<fsurdat hgrid="CLM_USRDAT" neon=".true." sim_year="2018" use_fates=".false.">
lnd/clm2/surfdata_esmf/NEON/ctsm5.3.0/surfdata_1x1_NEON_${NEONSITE}_hist_2000_78pfts_c240912.nc</fsurdat>
Is there a reason for that?
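As a rough illustration of why these defaults can silently fail to apply, here is a Python sketch of attribute-based matching. It assumes a default is selected only when every attribute on the entry matches the current settings; the real lookup in the build-namelist tooling is more involved, and `find_default` and the wrapper element are hypothetical.

```python
import xml.etree.ElementTree as ET

# The two entries quoted above, wrapped in a root element for parsing.
DEFAULTS_XML = """
<namelist_defaults>
<fsurdat hgrid="CLM_USRDAT" neon=".true." sim_year="2018" use_fates=".true.">
lnd/clm2/surfdata_esmf/NEON/ctsm5.3.0/16PFT_mixed/surfdata_1x1_NEON_${NEONSITE}_hist_2000_16pfts_c240912.nc</fsurdat>
<fsurdat hgrid="CLM_USRDAT" neon=".true." sim_year="2018" use_fates=".false.">
lnd/clm2/surfdata_esmf/NEON/ctsm5.3.0/surfdata_1x1_NEON_${NEONSITE}_hist_2000_78pfts_c240912.nc</fsurdat>
</namelist_defaults>
"""


def find_default(xml_text, name, settings):
    """Return the first <name> entry whose attributes all match settings, else None."""
    root = ET.fromstring(xml_text)
    for elem in root.iter(name):
        if all(settings.get(key) == value for key, value in elem.attrib.items()):
            return elem.text.strip()
    return None


settings = {"hgrid": "CLM_USRDAT", "neon": ".true.", "sim_year": "2018", "use_fates": ".false."}
print(find_default(DEFAULTS_XML, "fsurdat", settings))  # the 78pfts file

# If any single attribute differs (here neon), no entry matches at all.
settings["neon"] = ".false."
print(find_default(DEFAULTS_XML, "fsurdat", settings))  # None
```

Under this simplified model, a mismatch in either sim_year or neon is enough for both NEON entries to be skipped.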
Or another way of looking at it: The issue is that adding the PRISM testmod after the NEON one means that the NEON testmod's shell_commands seemingly never get run. Otherwise, the date would be set to 2018.
But that raises another question: When you run with PRISM first, does the PRISM testmod's shell_commands get run?
Never mind, that's not it. Both orderings result in the following output for ./xmlquery -p YR:
Results in group run_component_datm
DATM_YR_ALIGN: 2018
DATM_YR_END: 2020
DATM_YR_START: 2018
DATM_YR_START_FILENAME: 9999
And the following for ./xmlquery --listall | grep 2018:
CLM_NML_USE_CASE: 2018_control
DATM_YR_ALIGN: 2018
DATM_YR_START: 2018
But I have to say, I don't like not knowing why the order matters...
Found it! The problem is that CLMBuildNamelist.pm doesn't set neon to .true. unless CLM_USRDAT_NAME is NEON. When the PRISM testmod comes second, CLM_USRDAT_NAME is set to NEON.PRISM. The following change fixes it:
--- a/bld/CLMBuildNamelist.pm
+++ b/bld/CLMBuildNamelist.pm
@@ -713,7 +713,7 @@ sub setup_cmdl_resolution {
   $nl_flags->{'neon'} = ".false.";
   $nl_flags->{'neonsite'} = "";
   if ( $nl_flags->{'res'} eq "CLM_USRDAT" ) {
-    if ( $opts->{'clm_usr_name'} eq "NEON" ) {
+    if ( $opts->{'clm_usr_name'} eq "NEON" || $opts->{'clm_usr_name'} eq "NEON.PRISM" ) {
       $nl_flags->{'neon'} = ".true.";
       $nl_flags->{'neonsite'} = $envxml_ref->{'NEONSITE'};
       $log->verbose_message( "This is a NEON site with NEONSITE = " . $nl_flags->{'neonsite'} );
However, there's probably a better way to do this with Perl—e.g., instead of checking for exact matches, just check whether the name starts with NEON.
Yep, like so:
--- a/bld/CLMBuildNamelist.pm
+++ b/bld/CLMBuildNamelist.pm
@@ -678,6 +678,11 @@ sub setup_cmdl_chk_res {
   }
 }

+sub begins_with
+{
+  return substr($_[0], 0, length($_[1])) eq $_[1];
+}
+
 sub setup_cmdl_resolution {
   my ($opts, $nl_flags, $definition, $defaults, $envxml_ref) = @_;

@@ -713,7 +718,7 @@ sub setup_cmdl_resolution {
   $nl_flags->{'neon'} = ".false.";
   $nl_flags->{'neonsite'} = "";
   if ( $nl_flags->{'res'} eq "CLM_USRDAT" ) {
-    if ( $opts->{'clm_usr_name'} eq "NEON" ) {
+    if ( begins_with($opts->{'clm_usr_name'}, "NEON") ) {
       $nl_flags->{'neon'} = ".true.";
       $nl_flags->{'neonsite'} = $envxml_ref->{'NEONSITE'};
       $log->verbose_message( "This is a NEON site with NEONSITE = " . $nl_flags->{'neonsite'} );
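For anyone who wants to check the matching logic outside of Perl, here is a small Python sketch comparing the original exact match with the prefix match from the patch. The function names are hypothetical, and the third variant (requiring a "." after NEON) is only one possible tightening, not something proposed in this thread.

```python
def neon_exact(clm_usr_name):
    """Original behavior: only the exact name 'NEON' counts."""
    return clm_usr_name == "NEON"


def neon_prefix(clm_usr_name):
    """Patched behavior: any name beginning with 'NEON' counts."""
    return clm_usr_name.startswith("NEON")


def neon_prefix_strict(clm_usr_name):
    """Hypothetical stricter variant: 'NEON' alone, or 'NEON.' plus a suffix."""
    return clm_usr_name == "NEON" or clm_usr_name.startswith("NEON.")


# "NEONX" is a made-up name used only to show where the variants differ.
for name in ("NEON", "NEON.PRISM", "NEONX", "PLUMBER2"):
    print(name, neon_exact(name), neon_prefix(name), neon_prefix_strict(name))
```

The prefix check accepts NEON.PRISM, which is what the failing test needs. It would also accept any other name that merely starts with NEON, which may or may not matter depending on which CLM_USRDAT_NAME values are in use.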
Thank you @samsrabin I'm testing your suggestion now.
./create_test SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.derecho_intel.clm-NEON-MOAB--clm-PRISM
worked on derecho, so I will open a PR with your suggested mods.
Originally posted by @slevis-lmwg in https://github.com/ESCOMP/CTSM/issues/2310#issuecomment-2372367434
I'm elevating this to its own issue because now the cs.status output is confusing, and the expected failure isn't detected.