NCAR / DART

Data Assimilation Research Testbed
https://dart.ucar.edu/
Apache License 2.0

Stages correction needed in https://dart.ucar.edu/pages/Diagnostics.html #90

Closed · kdraeder closed this issue 2 years ago

kdraeder commented 3 years ago

The page recommends using the "preassim" stage for evaluating an assimilation. This is often not the best choice, because it contains the inflated fields, which are non-physical and cannot be compared with the "analysis" stage. The "forecast" stage is comparable to the analysis.

kdraeder commented 3 years ago

Another topic that often confuses me:

The Observations are the Key. ...the first thing to check is to see how many observations are getting rejected by the assimilation in addition to the RMSE and spread of the ensemble. A natural part of the DART framework is that these metrics are calculated automatically and are readily available in the obs_seq.final files.

My understanding is that only the number of obs used/rejected and the spread (and mean) are in the obs_seq.final files. The RMSE and other statistics are calculated in obs_diag.

jlaucar commented 3 years ago

Kevin raises a pretty complex issue here that we have definitely not discussed enough in the documentation.

The preassim stage is the one that is most straightforward to interpret in a DA context. It is the actual prior for the Bayesian update done by the data assimilation, and it is the one that should have an appropriate spread/RMSE relationship. It is also the one that is consistent with the priors in the observation-space diagnostics, which are the forward operators applied to the inflated prior ensemble.
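As an editor's gloss on the spread/RMSE relationship mentioned here: for a statistically consistent ensemble verified against observations, and ignoring finite-ensemble-size corrections, one expects roughly

```latex
\langle \mathrm{RMSE}^2 \rangle \;\approx\; \sigma_{\mathrm{ens}}^2 + \sigma_{\mathrm{obs}}^2 ,
```

where \sigma_ens is the prior ensemble spread and \sigma_obs is the observation error standard deviation; the right-hand side is the square of the "total spread" that comes up later in this thread.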

Kevin's comment about 'non-physical' is relevant, although primarily for a subset of the state. An obvious example is when inflation drives a quantity that the model guarantees to be positive definite to a negative value.

The question of comparing to the analysis stage is also subtle. If one is using only prior inflation, as is most common for us, the assumption is that the posterior spread is mostly okay (or we just don't care). We could use both prior and posterior inflation and then things would be more comparable in some sense. As we know, I am pretty strongly opposed to validating posteriors, even in OSSEs, so I am not sure if I'm that worried about this.

Does Moha want to weigh in on this? I'm thinking that the use of the 'preassim' as the default is the best choice, but...

kdraeder commented 3 years ago

I think that my comment about "non-physical" includes more than "unallowed values". Neighboring points may have different inflation values, which create "preassim" model states that are highly unbalanced. They are brought back into balance by the assimilation, so it's not an issue for the model.

I hadn't thought of comparing the inflated preassim with an inflated "output" stage. I'd be worried about comparing 2 states that seem to me to be distorted away from what the atmosphere/model looks like. I'm much more comfortable comparing the model output ("forecast"), which looks very much like the atmosphere, and the analysis ("output"), which looks even more like the atmosphere.

If we compare the (inflated) preassim and output stages, we'll be misled. If the assimilation does nothing, so that the states produced by the model are the same as those passed to the next hindcast, then innov = output - forecast = 0. But if we use the (inflated) preassim instead of forecast, then the difference is no longer 0, even though the fundamental model states are identical. At least in this limited application (checking to see what the assimilation did) we should use the forecast.
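A minimal sketch of this bookkeeping argument (the numbers, the inflation factor, and the do-nothing assimilation are all hypothetical; the inflation form assumed is DART's usual multiplicative inflation about the ensemble mean):

```matlab
% Kevin's premise: the assimilation does nothing, so the state handed to
% the next hindcast ("output") equals the model state ("forecast").
forecast = [-3 -2 -1 0 1 2 3];                       % ensemble straight from the model
lambda   = 2.0;                                      % hypothetical prior inflation factor
xbar     = mean(forecast);
preassim = xbar + sqrt(lambda) * (forecast - xbar);  % forecast after prior inflation
output   = forecast;                                 % do-nothing assimilation

disp(output - forecast)   % all zeros: correctly reports "the assimilation did nothing"
disp(output - preassim)   % nonzero wherever forecast differs from its mean:
                          % a spurious apparent increment, purely from inflation
```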

mgharamti commented 3 years ago

In the case of no prior inflation, the files from the preassim and forecast stages should be exactly the same. As far as I'm concerned, inflation is the only difference separating these two stages. Are there any differences?

If the answer is no, then I agree with Jeff that the preassim stage files are the ones we should look at. In principle, these are the ensembles that describe the prior distribution of the state before the update.

Keep in mind, the ensemble mean will always be the same before and after inflation. This also applies to the RMSE. If users want to assess the individual ensemble members, they should be aware that forecast is pure model and preassim includes inflation. We can stress this point on the webpage if it's not clear.

With that said, comparing the analysis and preassim means and spreads (to get an idea of the DA increment) is totally fine. Again, the only place where one should be careful is when looking at the individual ensemble members.
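A quick sketch of the mean-preservation point (hypothetical values; inflation again assumed to be multiplicative about the ensemble mean, with lambda inflating the variance):

```matlab
ens    = [-3 -2 -1 0 1 2 3];                      % forecast ensemble
lambda = 1.5;                                     % hypothetical inflation factor
infl   = mean(ens) + sqrt(lambda) * (ens - mean(ens));

[mean(ens) mean(infl)]    % identical: inflation preserves the ensemble mean
[std(ens)  std(infl)]     % the spread grows by sqrt(lambda)
```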

Remark: just a clarification on comparing the RMSE with the prior spread: what we would like to match is the RMSE and the total spread (rather than only the spread of the ensemble).
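For reference, a one-line sketch of the total spread as (I believe) obs_diag defines it, with hypothetical numbers:

```matlab
spread        = 1.2;                              % prior ensemble spread (std dev)
obs_error_var = 0.5;                              % observation error variance
totalspread   = sqrt(spread^2 + obs_error_var)    % the quantity the RMSE should match
```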

kdraeder commented 3 years ago

Ah yes, I'd forgotten about the means being preserved. Thanks for that. But the spreads are not preserved by the inflation, by definition. And I don't see how the RMSE can be preserved. The bias might be.

jlaucar commented 3 years ago

Means are preserved and RMSE is just a function of the mean. The original 1999 paper was predicated on the idea that multiplicative inflation preserved means and correlations, but not covariances. That goes flying out the window as soon as there's any localization or if the inflation is 'spatially-varying'. The bounded quantity issue is still relevant here, however.

kdraeder commented 3 years ago

Here's a possible misunderstanding. Moha, when you wrote that the RMSE should be the same before and after inflation, were you referring to the RMSE of the state ensemble relative to a single observation (which is not preserved), or to the obs-space RMSE of all of the ensemble means in a region relative to their respective observations (which would be preserved, because the means are preserved)?

timhoar commented 3 years ago

I was also confused by the statement that the RMSE is just a function of the mean. The 'squaring' part makes the actual values important.

```matlab
forecast = [-3 -2 -1 0 1 2 3];
preassim = [-4 -3 -2 0 2 3 4];
[mean(forecast) mean(preassim)]   % ans = 0       0
[rms(forecast)  rms(preassim)]    % ans = 2.0000  2.8785
```

jlaucar commented 3 years ago

The 'error' is the vector difference between the prior ensemble mean and the observation. The prior ensemble mean is unchanged by the inflation. Therefore, the error is unchanged by the inflation, and so is the RMSE, which is simply a function of the error.
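A sketch reconciling this with the counterexample above: computed the way obs_diag does it, from the ensemble-mean error at each observation, the RMSE is untouched by inflation (the observations and the three-member ensembles here are hypothetical):

```matlab
obs   = [ 0.9  -1.2   2.1 ];                      % hypothetical observations
prior = [ 1.0  -1.0   2.0 ;                       % prior forward-operator values,
          0.6  -1.4   1.7 ;                       % one row per ensemble member
          1.4  -0.6   2.3 ];

lambda = 2.0;                                     % hypothetical inflation factor
m      = mean(prior);                             % ensemble mean for each observation
infl   = m + sqrt(lambda) * (prior - m);          % inflate each column; means unchanged

rmse_before = sqrt(mean((mean(prior) - obs).^2))
rmse_after  = sqrt(mean((mean(infl)  - obs).^2))  % identical: only the means enter
```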

timhoar commented 3 years ago

Got it ... that is consistent with the way obs_diag.f90 works, since it computes the stated quantities from the ensemble mean and the observation values, not from the individual values of the forward operators. I'm not sure this is clear to anyone creating their own post-processing routines (which means writing their own observation-sequence file reader and always outputting the full ensemble - something very few people actually do).

kdraeder commented 3 years ago

Another possible misunderstanding: Jeff's description, "difference between the prior ensemble mean and the observation", implies to me that this discussion is now about obs-space priors, not state space, which is where this issue arose, so that's where my brain was. And "vector difference" implies a group of prior somethings, which I'm guessing is the prior obs estimates of all the obs in a region (since we seem to be in obs space), rather than a single ob.

jlaucar commented 3 years ago

Could also be in state space. Error is the difference between the prior ensemble mean and the truth. Again, it has nothing to do with the prior ensemble spread. "Vector" here refers to the fact that there are one or more prior quantities being evaluated. Those could be prior estimates of state variables, or any function of state variables (including estimates of observations). The 'M' in RMSE is computing a mean over the elements of the squared errors.
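In symbols, an editor's summary of that definition: with prior ensemble means \bar{x}_i and verifying values x_i^t (truth, or observations) over N evaluated quantities,

```latex
\mathrm{RMSE} \;=\; \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\bar{x}_i - x_i^{t}\right)^{2}},
```

which involves only the ensemble means, so multiplicative inflation about the mean leaves it unchanged.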