Closed GabrielHoffman closed 9 months ago
I wrote some code the address this:
# Compared weighted variance along *rows* of X
# where each element of X has its own positive weight
rowWeightedVarsMatrix = function(X, W){
stopifnot(dim(X) == dim(W))
if( ! is.matrix(X) ) X <- as.matrix(X)
if( ! is.matrix(W) ) W <- as.matrix(W)
# scale weights to have a mean of 1
Ws <- W / rowMeans2(W, useNames=FALSE)
# weighted mean
mu <- rowMeans2(X * Ws)
# weighted deviation
A <- sqrt(Ws)*(X - mu)
# sum of squares, divided by n-1
rowSums2(A ** 2, useNames=FALSE) / (ncol(X)-1)
}
# testing code
library(modi)
library(matrixStats)
set.seed(1)
n = 100
p = 200
X = matrix(rnorm(n*p), n, p)
W = matrix(rgamma(n*p, 14, 1), n, p)
res1 <- sapply(seq(n), function(i) weighted.var(X[i,], W[i,]))
res2 <- rowWeightedVarsMatrix(X, W)
# compare difference
range(res1 - res2)
Hi Gabriel,
Some background: MatrixGenerics was developed to provide an S4-based set of generics that mimics the matrixStats API. MatrixGenerics further provides some 'basic' methods (i.e.., those for matrix objects) but it does so simply by calling the corresponding matrixStats function. Other specialised methods (e.g., for DelayedMatrix and dgCMatrix objects) are provided in add-on packages (e.g., DelayedArray, DelayedMatrixStats, sparseMatrixStats).
So to address your feature request, I think you would need to first convince Henrik that your proposed extension to rowWeightedVars()
(and presumably also colWeightedVars()
?) would make a worthwhile addition to matrixStats.
Then, we could mimic that in MatrixGenerics by making any necessary changes to the generic (and documentation), defer to the corresponding matrixStats method for matrix objects, and notify the authors of add-on packages that they may want to consider a corresponding specialised method in their package.
Does that make sense? Thanks for your suggestion and I hope that my response doesn't seem discouraging.
Cheers, Pete
Thanks for the detailed reply. I'd always wondered about MatrixGenerics vs matrixStats, so it's good to have a better understanding.
The code I provided above is vectorized pretty well and uses matrixStats already, so I'm set for my purposes.
Best, Gabriel
rowWeightedVars(X,w)
currently handles a matrixX
, the weightsw
must be a vector. Is it possible to extend this function handle a weights matrixW
the same dimension asX
?I have an application in single cell genomics where
X
andW
are bothDelayedArray
objects.I posed a similar question under
matrixStats
, but I think your package is the right place for itBest, Gabriel