Open TeaganKing opened 4 months ago
From meeting with @ekluzek
A note on some things we need to do before this is ready to merge:
Agreed in today's ctsm software meeting: @TeaganKing will notify @slevis-lmwg when he should run the aux_cdeps test-suite.
Per conversation with Erik, we can remove the files listed in the PLUMBER2 user mod directories because these will be implemented in another PR (#277); those do not need to be moved to CDEPS.
However, we do need to implement the dtlimit values used for these streams specifically for PLUMBER. Hence the placeholder values, which I still need to verify change dtlimit correctly when CLM_USRDAT_NAME is set to PLUMBER.
Variables in those user mod directories that are duplicated in the CDEPS stream can be removed once this PR is merged in.
The dtlimit is updated as expected when running CTSM when I do an xmlchange to set CLM_USRDAT_NAME to PLUMBER. So, @slevis-lmwg, I think we can run the aux_cdeps test-suite. Note that the CTSM changes (https://github.com/ESCOMP/CTSM/pull/2485 and https://github.com/ESCOMP/CTSM/pull/2406) are not yet available (since they're dependent on this CDEPS PR).
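For anyone who wants to repeat the dtlimit check, here is a minimal sketch of reading dtlimit per stream from a generated streams XML file. It assumes the file uses `<stream_info name="...">` elements with a `<dtlimit>` child; the stream names and values in the sample are hypothetical placeholders, not the values this PR sets.

```python
import xml.etree.ElementTree as ET


def dtlimit_by_stream(streams_xml: str) -> dict:
    """Map each stream_info name to its dtlimit value (as a float)."""
    root = ET.fromstring(streams_xml)
    limits = {}
    for info in root.iter("stream_info"):
        node = info.find("dtlimit")
        if node is not None and node.text:
            limits[info.get("name")] = float(node.text.strip())
    return limits


# Hypothetical excerpt; names and values are placeholders for illustration.
SAMPLE = """<file id="stream" version="2.0">
  <stream_info name="EXAMPLE.Solar">
    <dtlimit>30</dtlimit>
  </stream_info>
  <stream_info name="EXAMPLE.Precip">
    <dtlimit>1.5</dtlimit>
  </stream_info>
</file>
"""

print(dtlimit_by_stream(SAMPLE))
```

Running this on the streams file before and after the xmlchange and comparing the two dictionaries would show which streams picked up a different dtlimit.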
This PR introduced CLM_USRDAT_NAME as PLUMBER2 instead of PLUMBER, so I will update that now.
@TeaganKing I want to confirm that I understand. I need to combine the branches from these three PRs: ESCOMP/CTSM#2485 ESCOMP/CTSM#2406 #262 before I start the aux_cdeps test-suite, right?
Also, a note to myself: The checklist points out that I need to generate a baseline.
ESCOMP/CTSM#2406 is very much still in progress, and there are going to be a few changes to ESCOMP/CTSM#2485 still as well (I'll do this within the next few days). What exactly is being tested with the aux_cdeps tests? I personally tested this one just by doing an xmlchange to set CLM_USRDAT_NAME to PLUMBER2, building the case, and checking the input files.
Ok, based on this information, I think I could go ahead and submit aux_cdeps with #262 with ctsm from master (I will try ctsm5.2.007 which is the current latest).
I tried and failed to generate a baseline using the latest ctsm paired with cdeps1.0.38, i.e. the same cdeps that I see in @TeaganKing's branch:
./run_sys_tests -s aux_cdeps --skip-compare -g cdeps1.0.38_ctsm5.2.008
I also tried and failed to generate a baseline using the latest ctsm paired with cdeps1.0.34, i.e. the default cdeps for ctsm5.2.008:
./run_sys_tests -s aux_cdeps --skip-compare -g cdeps1.0.34_ctsm5.2.008
The former seems less surprising, if e.g. there are incompatibilities between ctsm5.2.008 and cdeps1.0.38.
The latter though means that I have a problem with aux_cdeps (environment or other?) or that aux_cdeps has a problem (in which case it should fail for others, as well).
@TeaganKing at this point I will need help from @ekluzek with this. I will raise the issue at Monday's stand-up.
I encountered the same problem this morning even with aux_clm and ctsm_sci. This helped me realize that the problem may be as simple as setting an account number that hasn't expired. I will try this again today or tomorrow.
UPDATE 1: I submitted the same two tests. I expect that at least the cdeps1.0.34 should work and generate a baseline.
UPDATE 2: Worked out the opposite from what I expected:
UPDATE 3:
Submitted aux_cdeps comparing this branch to the baseline (tests_0703-140019de).
./run_sys_tests -s aux_cdeps --skip-generate -c cdeps1.0.38_ctsm5.2.008
@TeaganKing two updates: 1) Erik clarified that the second checkbox (currently unchecked) is asking you to run one or more plumber cases to confirm that they work. 2) aux_cdeps fails for several tests with this error during the build phase:
2024-07-03 14:01:30: Test 'SMS_Ld5.f10_f10_mg37.2000_DATM%NLDAS2_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel' failed in phase 'SETUP' with exception 'ERROR: Fatal error in case.cmpgen_namelists: 2024-07-03 14:01:29 atm
Create namelist for component datm
Calling /glade/work/slevis/git_externals/plumber_upd_pr262b/components/cdeps/datm/cime_config/buildnml
Running /glade/work/slevis/git_externals/plumber_upd_pr262b/components/cdeps/datm/cime_config/buildnml
Traceback (most recent call last):
File "/glade/work/slevis/git_externals/plumber_upd_pr262b/components/cdeps/datm/cime_config/buildnml", line 336, in <module>
_main_func()
File "/glade/work/slevis/git_externals/plumber_upd_pr262b/components/cdeps/datm/cime_config/buildnml", line 332, in _main_func
buildnml(case, caseroot, "datm")
File "/glade/work/slevis/git_externals/plumber_upd_pr262b/components/cdeps/datm/cime_config/buildnml", line 311, in buildnml
_create_namelists(case, confdir, inst_string, namelist_infile, nmlgen, data_list_path)
File "/glade/work/slevis/git_externals/plumber_upd_pr262b/components/cdeps/datm/cime_config/buildnml", line 211, in _create_namelists
streams = StreamCDEPS(stream_file, schema_file)
File "/glade/work/slevis/git_externals/plumber_upd_pr262b/components/cdeps/datm/cime_config/../../cime_config/stream_cdeps.py", line 65, in __init__
GenericXML.__init__(self, infile, schema)
File "/glade/work/slevis/git_externals/plumber_upd_pr262b/cime/CIME/XML/generic_xml.py", line 78, in __init__
self.read(infile, schema)
File "/glade/work/slevis/git_externals/plumber_upd_pr262b/cime/CIME/XML/generic_xml.py", line 129, in read
self.read_fd(fd)
File "/glade/work/slevis/git_externals/plumber_upd_pr262b/cime/CIME/XML/generic_xml.py", line 159, in read_fd
self.tree = ET.parse(fd)
File "/glade/work/slevis/conda-envs/ctsm_pylib/lib/python3.7/xml/etree/ElementTree.py", line 1197, in parse
tree.parse(source, parser)
File "/glade/work/slevis/conda-envs/ctsm_pylib/lib/python3.7/xml/etree/ElementTree.py", line 598, in parse
self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 4083, column 15
ERROR: /glade/work/slevis/git_externals/plumber_upd_pr262b/components/cdeps/datm/cime_config/buildnml /glade/derecho/scratch/slevis/tests_0703-140019de/SMS_Ld5.f10_f10_mg37.2000_DATM%NLDAS2_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel.C.0703-140019de_int FAILED, see above'
File "/glade/work/slevis/git_externals/plumber_upd_pr262b/cime/CIME/test_scheduler.py", line 1125, in _run_catch_exceptions
return run(test)
File "/glade/work/slevis/git_externals/plumber_upd_pr262b/cime/CIME/test_scheduler.py", line 1016, in _setup_phase
"Fatal error in case.cmpgen_namelists: {}".format(output),
File "/glade/work/slevis/git_externals/plumber_upd_pr262b/cime/CIME/utils.py", line 176, in expect
raise exc_type(msg)
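The ParseError above points at line 4083, column 15 of the XML file that buildnml hands to StreamCDEPS. A minimal sketch for printing the lines around such an error, using only the standard library, might look like the following. The sample XML is made up for illustration; the traceback alone does not say what the actual problem in this branch is.

```python
import xml.etree.ElementTree as ET


def find_xml_error(xml_text: str, context: int = 1):
    """Return (line, column, nearby lines) for the first parse error, or None."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # line is 1-based
        lines = xml_text.splitlines()
        start = max(line - 1 - context, 0)
        return line, column, lines[start:line + context]
    return None


# Hypothetical malformed input: a bare "&" is not valid XML.
BAD = """<file>
  <entry>ok</entry>
  <entry>a & b</entry>
</file>
"""

result = find_xml_error(BAD)
if result is not None:
    line, column, nearby = result
    print(f"parse error at line {line}, column {column}")
    for text in nearby:
        print("   ", text)
```

To apply it to a real file, read the file contents into a string and pass that to `find_xml_error`.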
In case it helps, here's a list of tests that PASS versus FAIL:
PASS SMS_Ld2.ww3a.2000_SATM_SLND_SICE_SOCN_SROF_SGLC_DWAV%CLIMO.derecho_intel RUN
PASS SMS_Ld3.f09_f09_mg17.1850_SATM_DLND%SCPL_SICE_SOCN_SROF_SGLC_SWAV.derecho_intel RUN
PASS SMS_Ly3.f10_f10_ais8gris4_mg37.2000_SATM_SLND_SICE_SGLC_SROF_DGLC%NOEVOLVE_SWAV.derecho_intel RUN
PASS SMS_Ly3.f10_f10_ais8_mg37.2000_SATM_SLND_SICE_SGLC_SROF_DGLC%NOEVOLVE_SWAV.derecho_intel RUN
PASS SMS_Ly3.f19_g17_gris4.2000_SATM_SLND_SICE_SGLC_SROF_DGLC%NOEVOLVE_SWAV.derecho_intel RUN
As far as I can tell, the PEND failures report the same error as the FAIL in this list:
FAIL SMS_Ld5.f10_f10_mg37.1850_DATM%GSWP3v1_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5.f10_f10_mg37.2000_DATM%CRUv7_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5.f10_f10_mg37.2000_DATM%NLDAS2_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5.f10_f10_mg37.2000_DATM%QIA_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5.f10_f10_mg37.2010_DATM%GSWP3v1_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5.f10_f10_mg37.HIST_DATM%GSWP3v1_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5.f10_f10_mg37.SSP585_DATM%GSWP3v1_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5_P1.1x1_mexicocityMEX.2000_DATM%1PT_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel.datm-1PT SHAREDLIB_BUILD
PEND SMS_Ld5.T62_g17.2000_DATM%IAF_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5.T62_g17.2000_DATM%NYF_SLND_DICE%IAF_DOCN%DOM_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5.T62_g17.2000_DATM%NYF_SLND_DICE%SSMI_DOCN%DOM_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5.T62_g17.2000_DATM%NYF_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5.TL319_t061.2000_DATM%JRA-1p4-2018_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ld5.TL319_t061.2000_DATM%JRA_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ln5.f19_f19_mg17.2000_DATM%QIA_SLND_SICE_DOCN%DOM_SROF_SGLC_SWAV.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ln5.f19_f19_mg17.2000_DATM%QIA_SLND_SICE_DOCN%SOMAQP_SROF_SGLC_SWAV.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ln5.f19_f19_mg17.HIST_DATM%QIA_SLND_SICE_DOCN%DOM_SROF_SGLC_SWAV.derecho_intel SHAREDLIB_BUILD
PEND SMS_Ln9_P1.T42_T42.2000_DATM%QIA_SLND_SICE_DOCN%DOM_SROF_SGLC_SWAV.derecho_intel.datm-scam SHAREDLIB_BUILD
My quick look at the above lists suggests that SATM tests PASS and DATM tests fail.
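A small sketch for double-checking that pattern by tallying status per atmosphere component. It assumes each line has the form `STATUS test_name PHASE`, as in the lists above; the three sample lines are copied from those lists.

```python
from collections import Counter


def tally_by_atm(status_lines):
    """Count (atm component, status) pairs from 'STATUS test_name PHASE' lines."""
    counts = Counter()
    for line in status_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        status, test_name = parts[0], parts[1]
        if "DATM" in test_name:
            atm = "DATM"
        elif "SATM" in test_name:
            atm = "SATM"
        else:
            atm = "other"
        counts[(atm, status)] += 1
    return counts


SAMPLE_LINES = [
    "PASS SMS_Ld2.ww3a.2000_SATM_SLND_SICE_SOCN_SROF_SGLC_DWAV%CLIMO.derecho_intel RUN",
    "FAIL SMS_Ld5.f10_f10_mg37.1850_DATM%GSWP3v1_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD",
    "PEND SMS_Ld5.f10_f10_mg37.2000_DATM%QIA_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.derecho_intel SHAREDLIB_BUILD",
]

for key, count in sorted(tally_by_atm(SAMPLE_LINES).items()):
    print(key, count)
```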
Thank you for running these tests and clarifying the 2nd checkbox item!
Regarding actually running the PLUMBER case, we don't have run_tower() fully functioning at the moment. I was thinking it may be most helpful to move this in and then finalize run_tower(), since it will require these changes?
@TeaganKing if your suggestion does not affect whether the aux_cdeps test-suite can be fixed in this PR (which I suspect and hope is true), then I would be fine with that. Still, I would like @ekluzek to also weigh in on your suggestion.
This will address #248 In order to implement PLUMBER capabilities