stcorp / harp

Data harmonization toolset for scientific earth observation data
http://stcorp.github.io/harp/doc/html/index.html
BSD 3-Clause "New" or "Revised" License

Variable naming: covariance and random vs. systematic #14

Closed: svniemeijer closed this issue 8 years ago

svniemeijer commented 8 years ago

Covariance information is directly linked to uncertainties. The diagonal of the covariance matrix holds the variances, and their square roots are the standard deviations, which should match the uncertainty values.
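To make the relation concrete, here is a minimal sketch in plain Python. The function name and the matrix values are made up for illustration and are not part of HARP.

```python
import math

def uncertainty_from_covariance(cov):
    """Return the standard deviations implied by the diagonal of a square covariance matrix."""
    return [math.sqrt(cov[i][i]) for i in range(len(cov))]

# Hypothetical covariance matrix for three vertical levels (illustrative values only)
cov = [
    [4.0, 1.0, 0.0],
    [1.0, 9.0, 2.0],
    [0.0, 2.0, 16.0],
]

sigma = uncertainty_from_covariance(cov)
print(sigma)  # [2.0, 3.0, 4.0]
```

If a product provides both a covariance matrix and an uncertainty variable, the uncertainty values would be expected to equal `sigma` above.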

The off-diagonal parts of the covariance matrix actually indicate systematic aspects of the uncertainty (as they capture the (auto-)correlation of the measurements in the dimensions that are included in the covariance matrix).

HARP currently only uses covariance variables that include a vertical dimension. The off-diagonal parts therefore capture the systematic effects in the vertical dimension.

Systematic effects in the horizontal/time dimension are currently captured by a random/systematic split.

This leads to the odd situation that there are currently both random and systematic covariance variables (cov_random and cov_systematic), where the random covariance is only 'random' for the horizontal/time dimension but is still inherently systematic for the vertical dimension.

This different treatment of random vs. systematic between vertical and horizontal/time dimensions is problematic. It requires that algorithms use a different approach to uncertainty propagation for vertical regridding/averaging compared to horizontal/temporal regridding/averaging.
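As a rough illustration of the two approaches, the sketch below uses the textbook propagation rules for a weighted average. This is an assumption about what "different approach" means in practice, not HARP's actual implementation, and all function names are hypothetical. It also shows that the random/systematic split corresponds to two special cases of the covariance-based rule: a diagonal matrix (no correlation) and a fully correlated matrix.

```python
import math

def propagate_with_covariance(weights, cov):
    """Uncertainty of sum(w_i * x_i) given the full covariance: sqrt(w^T S w)."""
    n = len(weights)
    variance = sum(weights[i] * cov[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    return math.sqrt(variance)

def average_random(sigmas):
    """Uncertainty of the plain mean when errors are uncorrelated."""
    return math.sqrt(sum(s * s for s in sigmas)) / len(sigmas)

def average_systematic(sigmas):
    """Uncertainty of the plain mean when errors are fully correlated."""
    return sum(sigmas) / len(sigmas)

# Illustrative values only
sigmas = [3.0, 4.0]
weights = [0.5, 0.5]                         # plain mean of two values
diag_cov = [[9.0, 0.0], [0.0, 16.0]]         # no correlation
full_cov = [[9.0, 12.0], [12.0, 16.0]]       # full correlation: sigma_i * sigma_j

print(average_random(sigmas), propagate_with_covariance(weights, diag_cov))
print(average_systematic(sigmas), propagate_with_covariance(weights, full_cov))
```

In this toy case the covariance-based rule reproduces both results, which hints at how a single treatment could cover the different dimensions.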

It is an open point whether this aspect can be improved and whether handling of uncertainty correlation can be harmonized for the different dimensions.

This issue is related to #13.

svniemeijer commented 8 years ago

The concept of systematic and random covariances came from a GAIA-CLIM presentation on Measurement Uncertainty.

That presentation also mentions that the systematic covariance is not a real covariance matrix (it is just a convenient mathematical representation). It also seems to be a matrix that can simply be derived from the systematic uncertainty vector.
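One plausible way such a derivation could look is the outer product of the systematic uncertainty vector with itself, which corresponds to assuming full correlation between all elements. The presentation's exact construction is not given here, so treat this as an assumption; the function name is hypothetical.

```python
def systematic_covariance(sigma_sys):
    """Build a matrix with entries sigma_i * sigma_j from a systematic uncertainty vector.

    Assumption: full correlation between all elements. This is an illustrative
    choice, not a confirmed definition from the presentation or from HARP.
    """
    return [[si * sj for sj in sigma_sys] for si in sigma_sys]

# Illustrative values only
sigma_sys = [3.0, 4.0]
cov_sys = systematic_covariance(sigma_sys)
print(cov_sys)  # [[9.0, 12.0], [12.0, 16.0]]
```

Under this assumption the matrix carries no information beyond the vector itself, so storing only the systematic uncertainty vector would be sufficient.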

Based on this, the approach in HARP will be as follows:

This approach eliminates the need to distinguish between random and systematic covariances, a distinction that was encountered in NDACC GEOMS data.