Closed davharris closed 7 years ago
I changed the code to use stats:cov.wt for both methods. Thanks for the report.
I don't think this change is desirable. davharris, in your example, you are using the default settings, which is for an unbiased estimator in the presence of frequency weights. However, your weights add up to 1, which essentially means that you only have one effective observation, so the variance cannot be estimated in an unbiased fashion.
Moreover, I believe stats::cov.wt is geared toward normalization weights, not frequency weights (multiply your weights by any positive finite constant and observe that the return value of stats::cov.wt is unaffected).
However, if we really want to have stats::cov.wt do the heavy lifting in the case of unbiased estimation under normalization weights, then I would propose the following (which, apart from the added warning, should be backwards compatible to version 4.0-2),
wtd.var <- function(x, weights=NULL, normwt=FALSE, na.rm=TRUE,
method = c('unbiased', 'ML'))
{
method <- match.arg(method)
if(!length(weights)) {
if(na.rm) x <- x[!is.na(x)]
return(var(x))
}
if(na.rm) {
s <- !is.na(x + weights)
x <- x[s]
weights <- weights[s]
}
if(normwt)
weights <- weights * length(x) / sum(weights)
if(normwt || method == 'ML')
return(as.numeric(stats::cov.wt(cbind(x), weights, method = method)$cov))
# the remainder is for the special case of unbiased frequency weights
sw <- sum(weights)
if(sw <= 1)
warning("only one effective observation; variance estimate undefined")
xbar <- sum(weights * x) / sw
sum(weights*((x - xbar)^2)) / (sw - 1)
}
@harrelfe, what do you think about @tyner's point?
On Stack Overflow, it was pointed out that the following returns
Inf
as the weighted variance; I'm able to replicate this on my machine (package version: 4.0-0 with R 3.3.0).I haven't looked through all the conditionals in the function, but my best guess is a zero in the denominator with the
"unbiased"
method.