stcorp / harp

Data harmonization toolset for scientific earth observation data
http://stcorp.github.io/harp/doc/html/index.html
BSD 3-Clause "New" or "Revised" License

Create harpdiff tool #136

Open svniemeijer opened 7 years ago

svniemeijer commented 7 years ago

This tool should create a new dataset that contains the differences between two input datasets. Both input datasets need to have a collocation_index variable (which is then used to match up pairs) or need to already be temporally aligned.
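As a rough illustration of the pairing step, here is a minimal sketch in plain Python. The function names `match_by_collocation_index` and `baseline_difference` are hypothetical and not part of HARP, and the sketch assumes each collocation_index value occurs at most once per dataset.

```python
def match_by_collocation_index(index_x, index_y):
    """Return (i, j) position pairs for which index_x[i] == index_y[j].

    Assumption: each collocation_index value occurs at most once per dataset.
    Samples without a partner in the other dataset are dropped.
    """
    position_in_y = {value: j for j, value in enumerate(index_y)}
    return [
        (i, position_in_y[value])
        for i, value in enumerate(index_x)
        if value in position_in_y
    ]


def baseline_difference(values_x, values_y, pairs):
    """Compute the baseline difference x - y for each matched pair."""
    return [values_x[i] - values_y[j] for i, j in pairs]


# Hypothetical example data: index 7 and index 5 have no partner.
pairs = match_by_collocation_index([3, 1, 7], [1, 3, 5])
differences = baseline_difference([10.0, 20.0, 30.0], [1.0, 2.0, 3.0], pairs)
```

If the datasets are already temporally aligned, the pairing step would reduce to matching positions one to one.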

There are different ways to calculate differences, and we should think about supporting the following types (using 'x' and 'y' as names for the datasets, and 'x-y' as the baseline difference). These are the postfixes that should be added to the variable names:

Calculating differences will only be supported for variables that have a unit attribute (which may be empty for unitless quantities, but should not be omitted).

Also make sure to add a 'point distance' difference if both datasets have (center) lat/lon values. How do we name the lat/lon point distance?
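One possible way to compute such a point distance is the great-circle (haversine) distance between the two center points. The sketch below is only an illustration: the function name `point_distance`, the spherical Earth model, and the radius constant are assumptions, not a decision on how HARP should implement or name this.

```python
import math

# Assumption: spherical Earth with a mean radius of 6371 km.
EARTH_RADIUS_M = 6371000.0


def point_distance(lat_x, lon_x, lat_y, lon_y):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi_x = math.radians(lat_x)
    phi_y = math.radians(lat_y)
    dphi = phi_y - phi_x
    dlam = math.radians(lon_y - lon_x)
    # Haversine formula; clamp to 1.0 to guard against rounding above 1.
    a = (
        math.sin(dphi / 2.0) ** 2
        + math.cos(phi_x) * math.cos(phi_y) * math.sin(dlam / 2.0) ** 2
    )
    return 2.0 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))
```

Unlike 'x-y', this quantity is symmetric in the two datasets and always non-negative.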

Do we want surface overlap fraction, area_distance, area_overlap_fraction/area_intersection_fraction?

For all types of differences we should also add uncertainty propagation:

Some quantities may require special treatment for the calculation of the difference:

We may also want to add differences of intervals (in terms of intersection length):

svniemeijer commented 6 years ago

We should create this in such a way that the core function is a C library function that returns a HARP product. We can then also introduce a harp.diff Python function that returns the difference of two products/datasets (which uses the same underlying code).

svniemeijer commented 6 years ago

See also Wikipedia. It is probably better to use _diffabsrelavg: 2|x-y|/(|x|+|y|). We might also want to distinguish absolute/signed differences from absolute/signed scaling for relative differences. For instance, we might want to use absolute scaling for a signed difference: 2(x-y)/(|x|+|y|).
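To make the two formulas concrete, here is a minimal sketch in plain Python. The function name `relative_difference` and the `signed` flag are hypothetical, and returning NaN when both values are zero is an assumption about how the undefined case could be handled.

```python
import math


def relative_difference(x, y, signed=False):
    """Relative difference scaled by the average of the absolute values.

    signed=False: 2|x-y| / (|x|+|y|)
    signed=True:  2(x-y) / (|x|+|y|)  (signed difference, absolute scaling)
    """
    scale = (abs(x) + abs(y)) / 2.0
    if scale == 0.0:
        # Assumption: the result is undefined when x and y are both zero.
        return math.nan
    diff = x - y
    return (diff if signed else abs(diff)) / scale
```

With absolute scaling the result stays within [-2, 2] and keeps the sign of x-y, even when x and y have opposite signs.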