RcppCore / RcppEigen

Rcpp integration for the Eigen templated linear algebra library

Map dgCmatrix as integer sparse matrix #132

Open yanzhaobiomath opened 10 months ago

yanzhaobiomath commented 10 months ago

Dear developers,

I have a dgCMatrix in R and want to process it by calling the C++ function below.

    Eigen::SparseMatrix<int> calc_overlap(Eigen::Map<Eigen::SparseMatrix<int>>& cc_adj,
                                          Eigen::Map<Eigen::SparseMatrix<int>>& cg_adj,
                                          double threshold) {
        Eigen::SparseMatrix<int> overlap_mat_all = cc_adj * cg_adj;
        return overlap_mat_all;
    }

The dgCMatrix objects 'cc_adj' and 'cg_adj' in R are large sparse matrices with integer values, and I want to compute their product in C++. I used 'Map' to avoid deep copies of these two large matrices, and the scalar type \<int> to save memory. No error is raised when mapping a dgCMatrix with \<int>, but I'm not sure exactly how it works. My guess is that the function makes deep copies instead, so the mapping does not actually avoid the copy. Is that right?

Also, since mapping is currently only allowed for sparse matrices in dgCMatrix format, would it be possible to add a function that handles 'ngCMatrix' as well?

Looking forward to your kind reply.

eddelbuettel commented 10 months ago

Hi @yanzhaobiomath and thanks for raising this. I have to ponder this for a bit, but yes, so far sparse matrices have mostly focussed on double values.

yanzhaobiomath commented 10 months ago

Thanks for replying! Could you help explain how this function works? What I don't understand is why passing a dgCMatrix to an Eigen::Map<Eigen::SparseMatrix\<int>>& argument in C++ doesn't raise an error. When a dgCMatrix is passed as Eigen::Map<Eigen::SparseMatrix\<int>>& 'matrix', does C++ make a copy of the dgCMatrix or just point to its address?

eddelbuettel commented 10 months ago

While I still look after RcppEigen, I personally work more with RcppArmadillo which also has decent sparse matrix support. If I were in your situation I'd try to construct a C++-only-no-R little example of passing the data around -- and then extend from it to interfacing with R.

Also in Eigen::Map<Eigen::SparseMatrix****>& it looks a little suspicious to have a pointer to pointers of pointers. Sure that is what you need/want ?
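Along the lines of the C++-only example suggested above, here is a minimal standard-library sketch of the question being asked: when can a view alias existing sparse storage, and when is a copy unavoidable? It assumes the usual compressed-sparse-column layout of a dgCMatrix (integer `p` and `i` slots, double `x` slot). The names `CscView`, `view_as_double`, `convert_values` and `view_as_int` are hypothetical and are not part of Eigen or RcppEigen; whether RcppEigen performs a comparable conversion of the `x` slot when asked for `<int>` is an assumption to check against its source.

```cpp
#include <cassert>
#include <vector>

// Non-owning view of a compressed-sparse-column (CSC) matrix. The three
// arrays mirror the p, i and x slots of a dgCMatrix; the view stores
// pointers only and never copies the data.
template <typename Scalar>
struct CscView {
    int rows;
    int cols;
    const int* p;     // column start offsets, length cols + 1
    const int* i;     // row index of each stored entry
    const Scalar* x;  // value of each stored entry

    int nnz() const { return p[cols]; }
};

// Zero-copy case: the view's scalar type matches the storage type, so the
// value pointer can alias the caller's buffer directly.
inline CscView<double> view_as_double(int rows, int cols,
                                      const std::vector<int>& p,
                                      const std::vector<int>& i,
                                      const std::vector<double>& x) {
    return CscView<double>{rows, cols, p.data(), i.data(), x.data()};
}

// An int view cannot alias a double buffer: element size and bit pattern
// differ, so the values have to be converted into new storage first.
inline std::vector<int> convert_values(const std::vector<double>& x) {
    std::vector<int> out;
    out.reserve(x.size());
    for (double v : x) out.push_back(static_cast<int>(v));
    return out;
}

// Only the values were copied; p and i are already int and stay shared.
inline CscView<int> view_as_int(int rows, int cols,
                                const std::vector<int>& p,
                                const std::vector<int>& i,
                                const std::vector<int>& converted_x) {
    return CscView<int>{rows, cols, p.data(), i.data(), converted_x.data()};
}
```

In this sketch, comparing `view.x` with the original buffer's `data()` pointer is a quick way to tell whether a given path shared memory or worked on a converted copy.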

yixuan commented 10 months ago

A side note: if you just need to compute the matrix product, the Matrix package already does this using low-level C code. Using Rcpp/RcppEigen would not be significantly faster.

yanzhaobiomath commented 10 months ago

> While I still look after RcppEigen, I personally work more with RcppArmadillo which also has decent sparse matrix support. If I were in your situation I'd try to construct a C++-only-no-R little example of passing the data around -- and then extend from it to interfacing with R.

Good idea, thanks!

> Also in Eigen::Map<Eigen::SparseMatrix****>& it looks a little suspicious to have a pointer to pointers of pointers. Sure that is what you need/want ?

No, only one pointer is enough. I removed the redundant pointer. Thanks!

yanzhaobiomath commented 10 months ago

> A side note: if you just need to compute the matrix product, the Matrix package already does this using low-level C code. Using Rcpp/RcppEigen would not be significantly faster.

Thanks for the suggestion. The function does more than matrix multiplication. I previously tried using 'Matrix' to compute the product and then passed the result to C++ for the remaining calculations, but doing everything in C++ turned out to be faster.
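To illustrate why keeping everything in C++ can pay off, here is a rough standard-library sketch of a sparse integer product that filters entries while it builds the result, so the full product is never materialised. The function posted at the top of the thread takes a `threshold` argument it never uses, and how it is meant to be applied is not stated, so the `>= min_value` filter here is purely an assumption. `CscInt` and `multiply_and_filter` are hypothetical names, and the column-by-column accumulation is a generic textbook approach, not the code used by Eigen or the Matrix package.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Owning CSC matrix with int values (hypothetical helper type).
struct CscInt {
    int rows;
    int cols;
    std::vector<int> p;  // column start offsets, size cols + 1
    std::vector<int> i;  // row indices of stored entries
    std::vector<int> x;  // values of stored entries
};

// Computes C = A * B one column at a time and keeps only the entries with
// value >= min_value (the filter rule is an assumption, see the lead-in).
CscInt multiply_and_filter(const CscInt& a, const CscInt& b, int min_value) {
    assert(a.cols == b.rows);
    CscInt c{};
    c.rows = a.rows;
    c.cols = b.cols;
    c.p.assign(b.cols + 1, 0);

    std::vector<int> acc(a.rows, 0);        // dense accumulator for one column
    std::vector<int> last_col(a.rows, -1);  // column in which a row was last touched
    std::vector<int> touched;               // rows touched in the current column

    for (int j = 0; j < b.cols; ++j) {
        touched.clear();
        for (int kb = b.p[j]; kb < b.p[j + 1]; ++kb) {
            const int k = b.i[kb];
            const int bv = b.x[kb];
            // Add bv * (column k of A) into the accumulator.
            for (int ka = a.p[k]; ka < a.p[k + 1]; ++ka) {
                const int r = a.i[ka];
                if (last_col[r] != j) {  // first touch in this column: reset
                    last_col[r] = j;
                    acc[r] = 0;
                    touched.push_back(r);
                }
                acc[r] += a.x[ka] * bv;
            }
        }
        std::sort(touched.begin(), touched.end());  // keep row indices ordered
        for (int r : touched) {
            if (acc[r] >= min_value) {
                c.i.push_back(r);
                c.x.push_back(acc[r]);
            }
        }
        c.p[j + 1] = static_cast<int>(c.i.size());
    }
    return c;
}
```

Whether a hand-written loop like this beats Eigen's or Matrix's product on real data would need to be measured; the point of the sketch is only that the filter can run inside the product loop.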