How should we ensure that the source state for each water tracer is set correctly?

How should we ensure that the source state for each water tracer is set correctly?

The water tracer consistency check will be very helpful in catching certain problems with water tracers / isotopes as the code evolves over time. Specifically, it can catch the situation when a new flux is added but isn't updated at all. However, I realized that one (probably common) bug that it can't catch is: If we use the current ratio in state variable X to set a given tracer flux based on the bulk flux, but in fact this flux comes from state variable Y. This type of bug could be introduced as we first write the tracer-handling code (or when a new flux variable is added to the model), or it could be introduced later – for example, if a new state variable is introduced and now a certain flux originates from that new state variable, but the person making this change forgot to update the tracer-handling code accordingly. The water tracer consistency test would still pass in this case: since all state variables have the same tracer ratio in this test, the test can't catch problems with specifying the wrong source state.

Off-hand, I can think of three ways to reduce the likelihood of this bug (along with a 0th, do-nothing solution):

Do nothing about this. We may decide that all of the possible solutions are either infeasible or add too much complexity, and so we should just accept the risk that bugs of this variety will be introduced in the the tracer-handling code in order to keep code complexity lower (since this complexity would likely apply to the bulk as well as to the tracer code, and so would affect all scientists working with the code).
For each flux variable in the model, introduce a corresponding pointer that points to the source state variable for that flux. These pointers would be set up in initialization (e.g., in initialization, we would set bar_flux_source_state => waterstate_inst%foo_state). These pointers would then be used in two places:

(a) To pull the flux out of the source state: code like foo_state = foo_state - bar_flux*dt would be replaced by bar_flux_source_state = bar_flux_source_state - bar_flux*dt

(b) To set the source state for the tracer flux updates

Arguments for this are:
- I really like the confidence that this would give me that (a) and (b) remain in sync.
- We might want something like this eventually anyway, to facilitate separating the physics from the numerics. I'd like to hear thoughts from @martynpclark or others on how we will might eventually want the code to look for other purposes in terms of encoding the connection between a flux variable and its source (and possibly destination) state variable.
Arguments against this are:
- I'm a bit concerned that the change in (a) would increase the learning curve for understanding the code.
- Some flux variables may not lend themselves to this uniform treatment – i.e., that each flux has a single source. This could particularly be true if a flux variable pulls from source X in some cases and source Y in other cases. However, there are probably benefits from reworking the code so that those are two different flux variables anyway.
Put code for tracer updates near the code for removal of a flux from a state. i.e., the code to update the foo_flux tracer flux would be right next to the code for removing foo_flux from its source state. This wouldn't be as robust as (1), but should make it more likely that someone changing the source state for a flux would also remember to update the tracer-update code. Downsides are that this might not be how we'd ideally want to organize the code, and this won't be as clean when there are multiple fluxes drawing from the same state.
Add a system test that checks this somehow – similar to the existing tracer consistency test, but somehow ensuring that we have the right source state for each flux. I haven't been able to think of a way this could be done, but @davidnoone or someone else has some ideas here?

I'm not sure what's best at this point. I don't really like relying on (2). If there's a way to do (3) (and especially if it can be done without introducing too much complexity into the code), that might be ideal. Otherwise, (1) seems like it could be a good option, but I'd like to hear thoughts from others about whether the benefits outweigh the extra complexity.

An example might help illustrate what I'm thinking about here. We don't yet have any examples in the code for water isotopes, so here's an example for carbon isotopes (on which we're basing the water isotope implementation):

Here's the code that sets the isotope flux for leafc_to_litter_patch:

https://github.com/ESCOMP/ctsm/blob/a0ca27c1f67c2d544cfc490a23986cf35247d393/src/biogeochem/CNCIsoFluxMod.F90#L129-L132

Notice that the source state, leafc_patch, is hard-coded in that line; I'm pretty sure that this has to agree with the following for things to work correctly:

https://github.com/ESCOMP/ctsm/blob/a0ca27c1f67c2d544cfc490a23986cf35247d393/src/biogeochem/CNCStateUpdate1Mod.F90#L276

So if the line in CNCStateUpdate1Mod were changed to use a different source state without a corresponding change in CNCIsoFluxMod, the isotope version of leafc_to_litter_patch would be incorrect.

From today's CTSM discussion:

People agreed that this could truly be an issue.

Dave thinks that the best option could be coming up with a system test for this, if we can think of one.

Dave also thought that organizing the code so that tracer updates happen alongside their respective state updates might be a good option, though also agrees that it could muddy the code for someone trying to understand just the state updates and not caring about tracer stuff.

We thought of a 4th option:

Renaming fluxes to be explicit about their source (and maybe destination), as is done for the biogeochem code.

This wouldn't guarantee that things are done right, but it should make it more obvious when things are done wrong, and it should prevent someone from changing the state/flux structure, but reusing a flux variable to have a different source state. People felt that this might actually be the best solution. Nobody could think of a case where it fundamentally would NOT work to have a single source for a given flux - though there may be cases like two-way fluxes (e.g., for Richards equation) where we need to split what's currently one two-way flux into two one-way fluxes. And Dave thought this could help clarify the code, and so might be good for other reasons anyway.

Martyn: Long-term, we might want something like a data structure like:

eqn%var(ixLookState1)%state eqn%var(ixLookState1)%flux(:)

Then you can loop over this to do the state updates.

Bill: will think about moving incrementally towards Martyn's idea; for now this would involve something like option (1) above - having some extra pointers. The upside is that it would force code to remain self-consistent; the downside is that it could make the code harder to understand.

Overall conclusions: We want to continue to think about this. Can think more about a possible system test. Also think more about variable renames to support this: Bill could work with @swensosc on that.

Thinking more about option (2):

Put code for tracer updates near the code for removal of a flux from a state. i.e., the code to update the foo_flux tracer flux would be right next to the code for removing foo_flux from its source state. This wouldn't be as robust as (1), but should make it more likely that someone changing the source state for a flux would also remember to update the tracer-update code. Downsides are that this might not be how we'd ideally want to organize the code, and this won't be as clean when there are multiple fluxes drawing from the same state.

I don't think this will work, at least not well: We need to first have a loop over tracers that computes all the tracer fluxes, then have a loop over bulk and all tracers that updates states. I think this would be awkward, or likely impossible, if we need to keep the two side-by-side.

My current preferences are for (3) (a system test, if we can come up with one) or (4) (variable renames, since it sounds like that could be wanted anyway, and would help a lot with this problem).

I was thinking a little more about what option (3) could look like:

(3) Add a system test that checks this somehow

One option is: We could initialize each state with different tracer ratios. Then, if a particular namelist flag (which is labeled clearly as for-testing-only!) is set: In all water state updates, we would ignore incoming fluxes, only keeping the outgoing fluxes. Then I think each state should maintain its initial ratio, and this would be violated if any of the sinks from that state used the wrong state in the tracer flux calculation.

Problems with this are:

I can see how this would work for fluxes that have a source state, but it's less clear to me how this would work for fluxes that are portions of other fluxes (e.g., some portion of the incoming rain flux). That's okay, I guess: at least it would provide some checking.
This would require complicating the production code with some test-specific code.
(Update 2021-10-08) It seems like you can't run for very long this way before states start getting reduced to 0 mass. But I suppose we could just run the test for one time step.

(Another option would be to put some checking code in the state update routines that check fluxes as they come out of the state: confirm that the flux's ratio matches the state's. This would require the bulk state / flux being available in the state updates of the tracer states. But I think this is messier than the above idea.)

Moving to a discussion.

ESCOMP / CTSM

How should we ensure that the source state for each water tracer is set correctly? #511