Expected adaptive inflation behavior

braczka commented 1 week ago

What's the issue?

The documentation on the expected behavior of adaptive inflation needs to be improved. It should state the scientific recommendation for applying inflation to prognostic versus diagnostic variables within the DART state, and the guidance should reflect that inflation is applied only to 'UPDATED' DART variables. In addition, more explicit guidance is needed on how to apply inflation settings (i.e. prior, posterior, and damping settings), along with more discussion of how to diagnose inflation from its time-varying behavior. This continues the discussion from a previous GitHub issue, #276, and is also based on recent standup discussions held within DAReS.
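
To make the damping setting concrete, here is a minimal Python sketch. It assumes the usual reading of damping, where the deviation of the inflation value from 1 is multiplied by the damping factor each cycle. The function name `damp_inflation` is hypothetical and is not part of DART.

```python
def damp_inflation(inflation, damping):
    """Relax an inflation value toward 1.0 using a damping factor in [0, 1]."""
    if not 0.0 <= damping <= 1.0:
        raise ValueError("damping must be between 0 and 1")
    # Only the deviation from 1.0 is scaled; damping = 1.0 means no damping.
    return 1.0 + damping * (inflation - 1.0)


# Repeated damping with no other update decays the value toward 1.
value = 2.0
history = []
for _ in range(3):
    value = damp_inflation(value, 0.9)
    history.append(round(value, 4))
print(history)
```

With a damping factor of 0.9, an inflation of 2.0 decays to about 1.9, then 1.81, then 1.729 if nothing else changes it.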

Where did you find the issue?

The core inflation documentation is located here. This section also refers to the fill_inflation_restart documentation.

What needs to be fixed?

There are a number of issues, but to name a few: the introduction is outdated, since it refers to the outdated Manhattan naming system and to the deprecated 'observation' space inflation. The documentation should immediately address the purpose of inflation, which is to handle both systematic biases (e.g. structural model errors) and sampling errors (e.g. low ensemble member count). It should directly state that inflation is applied only to 'UPDATED' variables in the DART state, whereas 'NO_COPY_BACK' variables are left alone (inflation = 1). It also needs better guidance for evaluating whether time-varying inflation is behaving properly. See next section for more details.
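
As an illustration of the 'UPDATED' versus 'NO_COPY_BACK' distinction, the following Python sketch inflates only the variables flagged 'UPDATED'. It assumes the common convention that the inflation value scales the ensemble variance, so deviations from the mean are multiplied by its square root. The variable names (`soil_moisture`, `leaf_area`) and both helper functions are hypothetical and do not come from the DART source.

```python
import math


def inflate_ensemble(members, inflation):
    """Spread members about their mean so the variance scales by `inflation`."""
    mean = sum(members) / len(members)
    factor = math.sqrt(inflation)
    return [mean + factor * (m - mean) for m in members]


def inflate_state(state, update_flags, inflation):
    """Inflate only variables flagged 'UPDATED'; leave all others untouched."""
    result = {}
    for name, members in state.items():
        if update_flags[name] == "UPDATED":
            result[name] = inflate_ensemble(members, inflation[name])
        else:
            # Equivalent to an inflation value of 1: members are unchanged.
            result[name] = list(members)
    return result


# Hypothetical two-variable state with three ensemble members each.
state = {"soil_moisture": [0.1, 0.2, 0.3], "leaf_area": [1.0, 2.0, 3.0]}
flags = {"soil_moisture": "UPDATED", "leaf_area": "NO_COPY_BACK"}
inflation = {"soil_moisture": 4.0, "leaf_area": 4.0}
inflated = inflate_state(state, flags, inflation)
```

In this sketch the `leaf_area` members come back unchanged even though an inflation value was supplied for them, which is the behavior the documentation should describe.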

Suggestions for improvement

1) The documentation should state that inflation is only applied to 'UPDATED' variables in the DART state, and not 'NO_COPY_BACK' variables. 'NO_COPY_BACK' variables by definition have no impact on the assimilation forecast; however, they are required for the calculation of the forward operator.

2) The section that refers to adaptive inflation recommendations based on Gharamti et al., 2019 should be expanded. For example, this could include the Lorenz 63 figure where various inflation approaches are used to address model bias and sampling error. (AI-b: prior inflation only; AI-a: posterior inflation only; AI-ab: both)

This is subjective, but could be then followed by a recommendation table based on assimilation applications:

3) The current documentation recommends applying inflation in sequential steps: a) start without inflation, b) try inflation flavors 2 and 5 with prior inflation only, c) add damping if necessary. Is this still accurate? At least for CLM-DART applications, my understanding is that inflation flavor 5 (inverse gamma) is recommended, and that prior inflation and damping should be applied as a default. The Gharamti et al., 2019 figure can also be used for guidance.

4) The 'fill_inflation_restart' scripting is described, but the documentation makes it seem like this can be used as an alternative to the dynamic adaptive inflation. It never mentions that 'fill_inflation_restart' is commonly used within DART model shell scripting, where it runs during the first timestep only to create a template (with inflation = 1); the adaptive inflation is subsequently generated internally.

5) The current documentation is very vague on how to diagnose proper inflation performance. It only states that inflation values should remain between 1 and 30. I think we could provide much more guidance to warn users when inflation is not behaving properly. For example, we could tell them to look out for edge-hitting behavior in the mean inflation value, and describe expected temporal patterns in inflation, such as that it will covary with observation density (to counteract the collapse in ensemble spread). I feel like we could also give guidance for when the inflation is not responding fast enough, perhaps by having the user look at the difference between the ensemble RMSE and total_spread statistics to see if the ensemble is over- or under-dispersed.

6) Get rid of the 'ncap2' description of how to manually generate inflation files. I believe this same information is given in the fill_inflation_restart documentation, so it is redundant.
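
The diagnostics in suggestion 5 could be sketched roughly as follows. This Python example assumes the user has already extracted time series of the mean inflation, RMSE, and total spread into plain lists. The function names, the tolerance, and the sample numbers are all hypothetical; the bounds of 1 and 30 are the range quoted from the current documentation.

```python
def edge_hitting_fraction(inflation_series, lower, upper, tol=1e-6):
    """Fraction of time steps where the mean inflation sits on a bound."""
    if not inflation_series:
        raise ValueError("inflation_series must not be empty")
    hits = sum(
        1 for v in inflation_series if v <= lower + tol or v >= upper - tol
    )
    return hits / len(inflation_series)


def dispersion_ratio(rmse_series, total_spread_series):
    """Mean total spread divided by mean RMSE over the same period."""
    mean_rmse = sum(rmse_series) / len(rmse_series)
    mean_spread = sum(total_spread_series) / len(total_spread_series)
    return mean_spread / mean_rmse


# Made-up series for illustration only.
inflation = [1.1, 1.2, 1.8, 30.0, 30.0]
rmse = [2.0, 2.0, 2.0, 2.0]
spread = [1.0, 1.0, 1.0, 1.0]
print(edge_hitting_fraction(inflation, lower=1.0, upper=30.0))
print(dispersion_ratio(rmse, spread))
```

A large edge-hitting fraction would flag inflation that is pinned at a bound. A dispersion ratio well below 1 would suggest an under-dispersed ensemble and a ratio well above 1 an over-dispersed one, although the thresholds to act on are a judgment call that this sketch does not settle.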

Anything else we should know?

This is beyond the scope of this documentation issue, however, given the scientific recommendation is to not apply inflation to NO_COPY_BACK variables, it seems unnecessary to include these variables in the priorinf_mean and priorinf_sd files if they are just placeholders for fixed values of mean and sd inflation values with no application. There is an opportunity to make the code more efficient --- as long as their removal does not have detrimental side effects on functionality.

hkershaw-brown commented 1 week ago

As per the standup Nov 12th 2024 I'm asking for clarification on requirements for state vs non-state per ensemble member data on Brett's inflation documentation issue.

A couple of clarifying questions on NO_COPY_BACK. These variables are updated by the assimilation (i.e. probit transformed, counted as "close state", increments applied). Is this required behavior?

When using subroutine callable models, the inflation will propagate for NO_COPY_BACK variables. The inflation==1 will apply only on the initial read of the file. Please advise what the required behavior is for subroutine callable models.

When using subroutine callable models, the NO_COPY_BACK variables are updated and passed back to the model as input to "advance_state". Please advise what the required behavior is.

The three preceding questions are, I believe, essentially the same question: are the NO_COPY_BACK variables state (whether it is joint state or model state)? Or are they per-ensemble static data? Or are they something else?

Thank you for your time, looking forward to your response.

mgharamti commented 1 week ago

@braczka Thanks for adding this discussion. I think the points you raised are really good and I plan to enhance the documentation accordingly. I'm swamped this week and probably next week. But it's on my to-do list and I'll prioritize it.

@hkershaw-brown I thought I'd answer some of your questions now before I update the documentation:

The NO_COPY_BACK variables are diagnostic variables that are needed to compute the forward operator only. They should not be updated by the DA scheme and thus, they are not state variables. When we estimate parameters on top of the state, we extend the state vector to form what we call a joint state estimation. The NO_COPY_BACK variables are only needed to compute $h(x)$. Roughly speaking, they are embedded in $h$ (obs function) and not part of $x$ (state). Because the DART state $z = [x, h(x)]$, we update both $x$ and $h(x)$. One needs to distinguish between $h(x)$ and the NO_COPY_BACK variables.
The NO_COPY_BACK variables should not be inflated whether it's a subroutine callable model or an outside model.
For a subroutine callable model, the NO_COPY_BACK variables should not be updated before they are passed back to the model.

hkershaw-brown commented 1 week ago

Seconding Moha, great articulation of the issue and discussion - as always @braczka @mgharamti Thanks for the answers, super helpful requirements.

jlaucar commented 3 days ago

Just wanted to note that Brett raised related issues in yesterday's stand-up. He suggested that when the obs_impact tool is used to limit the impact of an observation on state variables, this does not correspondingly limit the impact of the observation on the inflation associated with those state variables. Scientific discussion with Brett, Moha, and me concluded that this behavior would be incorrect. All methods that localize the impact of an observation on a state variable should similarly limit its impact on the inflation associated with that state variable. In addition to standard localization and obs_impact, sampling_error_correction should be treated the same way. Jeff
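
One way to express this requirement is to compute a single combined weight and use it for both the state update and the inflation update. The Python sketch below is only an illustration of that principle: the function names are hypothetical, the factors are treated as plain non-negative multipliers, and a scalar "increment" stands in for the actual inflation update, which is more involved.

```python
def combined_impact(localization, obs_impact, sampling_error_correction):
    """Multiply the factors that limit an observation's influence."""
    for factor in (localization, obs_impact, sampling_error_correction):
        if factor < 0.0:
            raise ValueError("factors must be non-negative")
    return localization * obs_impact * sampling_error_correction


def scaled_updates(state_increment, inflation_increment, weight):
    """Apply the same weight to the state and the inflation updates."""
    return weight * state_increment, weight * inflation_increment


# If obs_impact removes the observation's influence on a state variable,
# the inflation for that variable should be left unchanged as well.
weight = combined_impact(localization=0.5, obs_impact=0.0,
                         sampling_error_correction=1.0)
print(scaled_updates(2.0, 0.3, weight))
```

Routing both updates through one weight means a new limiting method only has to be added in one place to affect the state and its inflation consistently.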

braczka commented 3 days ago

Thanks @jlaucar. A new bug report will be issued by myself or @XueliHuo regarding this issue today. I will reference this inflation issue in the new bug report.
