EcosystemEcologyLab / SW_Biomass

Comparing data products for AGB
https://ecosystemecologylab.github.io/SW_Biomass/report.html
MIT License

Determine metric for "non-agreement" #5

Closed Aariq closed 9 months ago

Aariq commented 9 months ago

What metric should be used to represent how well the datasets agree or disagree with each other? Something like RMSE might get at it, but it would be large in a situation where 5 datasets all overestimate AGB by the same amount and the other 5 all underestimate AGB by the same amount, right? All the datasets would be far from the mean even though there would be a high(?) level of "agreement". I'm sure there is a better statistic for this.
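The scenario above can be checked with a few lines of base R. This is only a sketch with made-up numbers: the AGB values, the offset of 20, and the variable names are assumptions for illustration, and "RMSE" is taken here to mean the root-mean-square deviation of each product from the ensemble mean.

```r
# Hypothetical AGB values (Mg/ha) for one pixel from 10 data products:
# five sit 20 above a made-up reference of 100, five sit 20 below it.
agb <- c(rep(120, 5), rep(80, 5))

# Root-mean-square deviation of each product from the ensemble mean
ensemble_mean <- mean(agb)
rmsd <- sqrt(mean((agb - ensemble_mean)^2))

# Mean pairwise absolute difference within each group of five
within_high <- mean(as.vector(dist(agb[1:5])))
within_low  <- mean(as.vector(dist(agb[6:10])))

rmsd         # 20: every product is 20 away from the ensemble mean
within_high  # 0: the five high products are identical to each other
within_low   # 0
```

The deviation from the mean is large even though each group of five is in perfect internal agreement, which is the concern raised above.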

Aariq commented 9 months ago

https://www.r-bloggers.com/2021/06/intraclass-correlation-coefficient-in-r-quick-guide/

Aariq commented 9 months ago

2D cross-correlation? https://observablehq.com/@lemonnish/cross-correlation-of-2-matrices

Aariq commented 9 months ago

Normalize to z-scores before comparing?
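A rough sketch of what z-score normalization would do, using base R only. The two products and the helper `z_score()` are hypothetical; the second product is deliberately constructed as a rescaled, offset copy of the first.

```r
# Hypothetical AGB values for the same five pixels from two products;
# product_b is product_a rescaled and offset (2 * a + 10)
product_a <- c(10, 20, 30, 40, 50)
product_b <- 2 * product_a + 10

# z-score: subtract the product's mean, divide by its standard deviation
z_score <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

z_a <- z_score(product_a)
z_b <- z_score(product_b)

# Largest remaining difference between the two normalized products
max(abs(z_a - z_b))
```

After normalization the two products are indistinguishable, so this approach compares spatial pattern only and hides any systematic over- or underestimation, which may or may not be what we want.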

Aariq commented 9 months ago

I think maybe there are two separate things here that we might want to calculate. The first is a single number (like a correlation) describing how well two datasets "agree".

The Mantel statistic is essentially a correlation between two matrices. In R, a Mantel statistic can be calculated with vegan::mantel(), which also uses permutation to get a p-value. You can turn off the permutation test and just get the statistic with permutations = 0. A plain cor() is another option. We could calculate correlations to see how well each data product agrees with the ESA product, which would match the scatter plots (#18, #10).
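As a minimal sketch of the cor() route, assuming each data product has been read into a plain numeric matrix on a shared grid. The matrices `esa` and `other` below are simulated stand-ins, not real data. vegan::mantel() is not used here because it expects dissimilarity matrices rather than raw rasters.

```r
set.seed(1)

# Hypothetical 4 x 5 AGB rasters stored as plain matrices
esa   <- matrix(runif(20, min = 0, max = 100), nrow = 4)
other <- esa + rnorm(20, sd = 5)  # noisy copy standing in for another product
other[2, 3] <- NA                 # a pixel missing from one product

# Pearson correlation across pixels, using only pixels present in both
r <- cor(as.vector(esa), as.vector(other), use = "complete.obs")
r
```

Repeating this for each product against the ESA matrix would give one coefficient per scatter plot.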

The second thing is a statistic for each pixel across all the data products with the purpose of showing spatial variation in "agreement" across the datasets. Right now we are doing that with standard deviation, which shows where in the state there is more variation among the data products, but might not necessarily show "agreement".
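For reference, the per-pixel standard deviation can be sketched in base R as below. The 2 x 2 x 3 array `agb_stack` is a made-up example, with the third dimension indexing data products; the real pipeline may well use a raster package instead.

```r
# Hypothetical stack: 2 x 2 pixels, 3 data products (third dimension).
# Arrays fill column-major, so each row of numbers below is one product.
agb_stack <- array(
  c(10, 20, 30, 40,    # product 1
    12, 20, 36, 40,    # product 2
    14, 20, 42, 40),   # product 3
  dim = c(2, 2, 3)
)

# Standard deviation across products for every pixel
pixel_sd <- apply(agb_stack, c(1, 2), sd, na.rm = TRUE)
pixel_sd
#      [,1] [,2]
# [1,]    2    6
# [2,]    0    0
```

Pixels where all products report the same value get an SD of 0, and the SD grows with the spread among products, but it says nothing about whether the products fall into agreeing subgroups.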

Aariq commented 9 months ago

Closing for now, as I've added correlation coefficients to the scatter plots and I think SD is the best option for the map at this point.