Closed lbusett closed 4 years ago
Yes, I can add variance etc. soon.
So, I'm a bit unsure of how to handle coverage fraction when calculating variance.
I'm using this dummy data:
values <- c(3.4, 2.9, 1.7, 8.8, -12.7, 100.4, 8.4, 11.3)
weights <- c(1.0, 0.1, 1.0, 0.2, 0.44, 0.3, 0.3, 0.83)
I've implemented formula WV2 from West (1979) and am trying to check it against other implementations. I've seen this weighted.var
referenced a couple of times. There are also implementations in various R packages.
> weighted.var(values, weights)
[1] 817.1162
> Hmisc::wtd.var(values, weights)
[1] 882.4848
> Hmisc::wtd.var(values, weights, normwt=TRUE)
[1] 817.1162
> Weighted.Desc.Stat::w.var(values, weights)
[1] 670.8578
> SDMTools::wt.var(values, weights)
[1] 817.1162
I've tentatively gone with the version resulting in 817, though I'm not confident of this. The 817 comes from the weights being interpreted as "reliability weights", 882 comes from them being interpreted as "frequency weights", and 670 is a population variance.
Any thoughts on the correct choice here?
Looks like ArcGIS is using the population value: https://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/h-how-zonal-statistics-works.htm#GUID-CB4BBDF2-7A34-427D-A8EA-1BA1C0DB265E
@dbaston
I am no statistician, so take this for what is worth... From a quick look, the "frequency weights" option does not seem to be useful in this use case, since weights are interpreted as counts of the values of the distribution:
values <- c(1, 1, 10)
weights <- c(1,1, 10)
Hmisc::wtd.var(values, weights)
[1] 12.27273
var(c(1,1, rep(10,10)))
[1] 12.27273
About the other two alternatives, I really do not know which one would be better...
I've gone ahead and implemented population value for consistency with ArcGIS and QGIS. I'll wait for someone to request the others if there's an application for them.
Thanks!
Hi,
Since I am in the process of "switching" my "R" packages to usage of exact_extractr (abandoning velox, or "standard" raster::extract): would it be possible/useful to include additional statistic indicators, such as variance/standard deviation/variation coefficient ? I know I can extract all "pixel" values and then compute them myself, but I think having them ready-made could be useful.
HTH,
Lorenzo