Open ekluzek opened 1 year ago
I also note that the order for the testmods FATES NEON tests is the opposite to the BGC NEON tests. This should be figured out...
Could one solution to this and #1949 be to use usermods in NEON/defaultshell_commands to key off of the compset being used and customize FATES namelist setting in? For example:
if [[ $compset =~ *Fates* ]]; then
echo "fsurdat ='$DIN_LOC_ROOT/lnd/clm2/surfdata_map/NEON/16PFT_mixed/surfdata_1x1_NEON_${NEONSITE}_hist_16pfts_Irrig_CMIP6_simyr2000_c230120.nc'" >> user_nl_clm
echo "hist_fincl2 = 'LIST','OF','FATES','HISTORY','FILES' " >> user_nl_clm
fi
@wwieder I like that suggestion for duplication. This issue here is pretty specific to how testmods work for us. And I probably need to file a specific issue with cime, but I need to investigate more.
Should this come in with #2044?
@ekluzek - I see the value in what you're saying: having a full list of all shell_commands for a test. This is more of a concern now that (1) we allow multiple testmods and (2) the shell_commands can be pretty complex (if they were simply a list of xmlchange commands, you could see what was done in the CaseStatus file).
I think you're going to run into trouble if you try to append to the shell_commands without removing it at the start of the application of each testmod (https://github.com/ESMCI/cime/blob/95e29f9736d2ef41b690f082afe8509caadecb08/CIME/user_mod_support.py#L29-L31), because you'll end up re-executing commands multiple times (which may often be okay, but in some cases may be problematic). My suggestion would be to introduce a new file like shell_commands_all_testmods into which you copy the contents of shell_commands after executing them (e.g., after this block: https://github.com/ESMCI/cime/blob/95e29f9736d2ef41b690f082afe8509caadecb08/CIME/user_mod_support.py#L118-L121)
Just a note: As part of #1863, I had bumped the CIME version from cime6.0.108
to cime6.0.118
. It turns out that the 5 tests like this—with multiple testmods listed—crash in the XML generation step under the latter. I'm not sure whether this is resolved in one of the subsequent CIME tags, but on a quick scan none of their commit messages suggest they'd have an effect.
Hmm. I just rolled back to cime6.0.108
and I'm still getting the crashes. This happens with my code as well as with ctsm5.1.dev130
.
Here's the crash log, from /scratch/cluster/samrabin/tests_0712-100716iz/SMS_L10d_D_Mmpi-serial.CLM_USRDAT.I1PtClm51Bgc.izumi_nag.clm-default--clm-NEON-NIWO.C.0712-100716iz
on Izumi:
2023-07-12 10:07:39: Test 'SMS_L10d_D_Mmpi-serial.CLM_USRDAT.I1PtClm51Bgc.izumi_nag.clm-default--clm-NEON-NIWO' failed in phase 'XML' with exception ''NoneType' object has no attribute 'group''
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/test_scheduler.py", line 1093, in _run_catch_exceptions
return run(test)
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/test_scheduler.py", line 885, in _xml_phase
opt = match.group(1)
---------------------------------------------------
2023-07-12 10:07:41: Test 'SMS_L10d_D_Mmpi-serial.CLM_USRDAT.I1PtClm51Bgc.izumi_nag.clm-default--clm-NEON-NIWO' failed in phase 'XML' with exception 'ERROR: locked: cannot make child group in file /scratch/cluster/samrabin/tests_0712-100716iz/SMS_L10d_D_Mmpi-serial.CLM_USRDAT.I1PtClm51Bgc.izumi_nag.clm-default--clm-NEON-NIWO.C.0712-100716iz/env_test.xml'
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/test_scheduler.py", line 1093, in _run_catch_exceptions
return run(test)
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/test_scheduler.py", line 790, in _xml_phase
envtest.add_elements_by_group(files, {}, "env_test.xml")
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/XML/entry_id.py", line 448, in add_elements_by_group
newgroup = self.make_child(name="group", attributes={"id": gname})
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/XML/generic_xml.py", line 311, in make_child
"read_only" if self.read_only else "locked", name, self.filename
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/utils.py", line 166, in expect
raise exc_type(msg)
---------------------------------------------------
2023-07-12 10:07:43: Test 'SMS_L10d_D_Mmpi-serial.CLM_USRDAT.I1PtClm51Bgc.izumi_nag.clm-default--clm-NEON-NIWO' failed in phase 'XML' with exception 'ERROR: locked: cannot make child group in file /scratch/cluster/samrabin/tests_0712-100716iz/SMS_L10d_D_Mmpi-serial.CLM_USRDAT.I1PtClm51Bgc.izumi_nag.clm-default--clm-NEON-NIWO.C.0712-100716iz/env_test.xml'
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/test_scheduler.py", line 1093, in _run_catch_exceptions
return run(test)
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/test_scheduler.py", line 790, in _xml_phase
envtest.add_elements_by_group(files, {}, "env_test.xml")
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/XML/entry_id.py", line 448, in add_elements_by_group
newgroup = self.make_child(name="group", attributes={"id": gname})
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/XML/generic_xml.py", line 311, in make_child
"read_only" if self.read_only else "locked", name, self.filename
File "/home/samrabin/ctsm_rx_crop_calendars4/cime/CIME/utils.py", line 166, in expect
raise exc_type(msg)
Never mind! Apparently @glemieux fixed this already.
Closing this as a wontfix. This would need to be addressed in cime.
@ekluzek This is an important issue, though. Could you post it in the CIME repo?
Brief summary of bug
The shell_commands file created for a test with two testmods such as...
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm51Fates.izumi_nag.clm-NEON-FATES-SJER--clm-FatesPRISM
seems to execute the first test-mod, but create the shell_commands for the testcase that just contains the second.
General bug information
CTSM version you are using: ctsm5.1.dev129
Does this bug cause significantly incorrect results in the model's science? No
Configurations affected: Tests in the test list that include two testmods
This would be an issue in cime, rather than in CTSM. And possibly this behavior is OK, but it makes it hard to see what the test is actually doing.