Open adrifoster opened 2 weeks ago
Is this me misunderstanding how hist_fexcl1
should work? Is it meant only to exclude variables that are added by default? Or is the issue that the lnd_in
file isn't placing the testmod options after the included testmod?
I'm not sure, currently because of the includes
for this test, the variable is added to the hist_fincl1
:100: @adrifoster. My assumption was that the order of the test mod updates would impact how the lnd_in
file would be generated and how that would override history variable inclusion, but I'm realizing that's likely incorrect.
I'm also realizing that the error message was brought in as part of #2339. ~@rgknox I think we might need to add a simple check against the hist_fexcl
depending on what the expected behavior of the same variable being part of both hist_fincl
and hist_fexcl
:~ https://github.com/ESCOMP/CTSM/blob/26b9aa7bdd5b0c1807f0f9bc12036548329c47b8/src/utils/clmfates_interfaceMod.F90#L656-L665
UPDATE: see below
Tagging #2850 here to note that this came up trying to address this issue.
Note that commenting out the call to CrossRefHistoryFields
I run into this error:
2937 hist_htapes_build Initializing clm2 history files
2938 ------------------------------------------------------------
2939 htapes_fieldlist ERROR: FATES_ERROR_EL in fincl( 66 )
2940 for history tape 1 not found
2941 ENDRUN:
2942 ERROR: ERROR in histFileMod.F90 at line 885
Looking at the code, the issue for this specific test, IIUC, is that we're including FATES_ERROR_EL
in the base testmod, but the fates init_history_io
isn't even adding it to the allhistlfldlist
, appropriately, because of this new line in fates: https://github.com/NGEET/fates/blob/1be3962933351b94fef134d607ac7d821d884f96/main/FatesHistoryInterfaceMod.F90#L8465-L8483. So the failure mode noted in the original comment is the technically correct behavior. The issue to address that failure mode is really #2850.
That said, I've got a test going to check if the original issue title is still techincally correct and worth addressing.
I tried replicating a version of this with a fates run mode test mod that I know will include FATES_ERROR_EL
in the allhistfldlist
. I did this by temporarily adding hist_fexcl1 = 'FATES_ERROR_EL'
to the FatesCold
user_nl_clm file, which inherits the Fates
testmod user_nl_clm with that variable included. I then ran the FatesCold
testmod and it successfully ran, but it included FATES_ERROR_EL
.
So I guess I'm wondering if we should keep this issue open and address whether or not the behavior listed in the title is correct relative to this variant I just ran, or start a new issue/discussion?
@glemieux i'm not sure I understand what you are saying. I think what you did is in support of the issue listed here. or am I missing something?
Sorry. What I'm saying is that the failure mode in the first comment is the expected result of issue #2850 and what the title suggests is not truly at fault.
That said, the question the title suggest still stands given the results of the variant I noted here: https://github.com/ESCOMP/CTSM/issues/2867#issuecomment-2463277294. As such, I was thinking we should start a new discussion to ask what the expected behavior of setting hist_fexcl
and hist_fincl
for the same history output should be since the comments in this issue might confuse readers and would start the conversation off more cleanly.
Per discussion at the stand up meeting today, we decided to keep the issue open as we want the behavior for setting the same variable in hist_fexcl
and hist_fincl
to be different than failing at run time with the error message in the original post comment. Ideally this would either:
hist_fexcl
or hist_fincl
take precedent for the give variable (with a warning)For the second option, the documentation should probably be updated to explain to the user the expected behavior as it will be opaque to the user.
Brief summary of bug
If a variable is in both
hist_fexcl
andhist_fincl
the code seems to assumehist_fincl
. This can result in a crash if having the variable doesn't work for your code.General bug information
CTSM version you are using:
ctsm5.3.009
Does this bug cause significantly incorrect results in the model's science? No
Configurations affected: any
Details of bug
In trying to test
ERS_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-FatesColdST3
I got a fail because ST3 mode does not outputFATES_ERROR_EL
. This variable is in thehist_fexcl1
but it is also in thehist_fincl1
. The run submits but then fails when it reads the variable list:Important details of your setup / configuration so we can reproduce the bug