privefl / bigsparser

Sparse matrix format with data on disk
10 stars 3 forks source link

Feature request: add `prodMat` for `SFBM` objects #13

Closed fmorgante closed 7 months ago

fmorgante commented 7 months ago

Hi Florian, would it be hard for you to implement a sp_prodMat method for objects of class SFBM just like you have big_prodMat for objects of class FBM?

privefl commented 7 months ago

I usually simply use apply() on the second matrix, and use sp_prodVec() on each column.

fmorgante commented 7 months ago

Let me backtrack and give you my problem. This is what I am trying to do to obtain DRD as an SFBM, with R and D being SFBM themselves:

library(bigstatsr)
library(Matrix)

R <- matrix(c(1, 0, 0.3,
              0, 1, 0,
              0.3, 0, 1), 3, 3, byrow=T)
#R1 <- as_SFBM(as(R, 'dsCMatrix'))
R1 <- as_FBM(R)

d <- c(2, 3, 4)

# D <- .symDiagonal(x=sqrt(d))
# D1 <- as_SFBM(D)
D <- diag(x=sqrt(d))
D1 <- as_FBM(D)

DR1 <- FBM(3, 3)
big_apply(R1, function(X, ind, D1, res) {
  res[, ind] <- bigstatsr::big_prodMat(D1, X[, ind, drop = FALSE])
  NULL
}, a.combine  = "c", block.size = 1, ncores = nb_cores(), D = D1, res = DR1)

DRD1 <- FBM(3, 3)
big_apply(D1, function(X, ind, DR1, res) {
  res[, ind] <- bigstatsr::big_prodMat(DR1, X[, ind, drop = FALSE])
  NULL
}, a.combine  = "c", block.size = 1, ncores = nb_cores(), D = DR1, res = DRD1)

DRD <- D %*% R %*% D

Do you have any suggestion?

privefl commented 7 months ago

Is D a diagonal matrix?

fmorgante commented 7 months ago

Yes.

privefl commented 7 months ago

Multiplying by a diagonal matrix is so straightforward that you shouldn't explicitly form K = DRD. Instead, e.g. if you need to compute something like DRDx, you can easily do (D(R(Dx))) (cf. https://privefl.github.io/blog/(Linear-Algebra)-Do-not-scale-your-matrix/). If you need Kij, that is simply Rij Di Dj. Basically, try to avoid constructing large matrices when you can.

fmorgante commented 7 months ago

Absolutely. In fact, I would do something like DRD <- t(R * sqrt(d)) * sqrt(d). However, I do not know how to do that when R is an SFBM and the output DRD also be an SFBM. Any suggestions?

privefl commented 7 months ago

If you really want to explicitly make the new matrix DRD, you can probably do it by chunk with big_apply() and $add_columns(DRD_ind), where DRD_ind = Matrix::Diagonal(sqrt(d)) %*% R[, ind] %*% Matrix::Diagonal(sqrt(d[ind])).

fmorgante commented 7 months ago

I have tried your suggestion, but I have not gotten it to work.

library(bigstatsr)
library(bigsparser)
library(Matrix)

R <- matrix(c(1, 0, 0.3,
              0, 1, 0,
              0.3, 0, 1), 3, 3, byrow=T)
R <- as_SFBM(as(R, 'dsCMatrix'))

d <- c(2, 3, 4)

DRD <- as_SFBM(Matrix::Diagonal(x=sqrt(d)) %*% R[, 1] %*% Matrix::Diagonal(x=sqrt(d[1])))

funfun <- function(R, d, ind, DRD) {
  DRD_ind <- Matrix::Diagonal(x=sqrt(d)) %*% R[, ind] %*% Matrix::Diagonal(x=sqrt(d[ind]))
  DRD$add_columns(DRD_ind, nrow(DRD))
  NULL
}

big_apply(R, a.FUN = funfun, a.combine  = "cbind", block.size = 1, ncores = nb_cores(), DRD = DRD, d = d)

What am I doing wrong?

privefl commented 7 months ago

A few things:

Minor comments:

fmorgante commented 7 months ago

Thanks, Florian! I got it to work following your suggsetions.