RcppCore / RcppEigen

Rcpp integration for the Eigen templated linear algebra library

setFromTriplets function gets segfault error #136

Open yanzhaobiomath opened 5 months ago

yanzhaobiomath commented 5 months ago

Hello, I'm using the setFromTriplets function to build a large sparse matrix from a list of triplets. I initialized the sparse matrix with its dimensions and reserved memory for it, but I still get a segfault with the message 'memory not mapped'.

The matrix is 80000*80000 with 50% non-zero elements. Do you have any hint about the reason?

Best, Yan
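For scale, the figures in this report can be turned into a quick estimate. The sketch below is only arithmetic: `expected_nnz`, `triplet_bytes` and `kMaxInt32` are hypothetical names introduced for illustration, and the 16 bytes per triplet (two 32-bit indices plus one double) is an assumption that should be checked with `sizeof` on the target platform.

```cpp
#include <cstdint>
#include <limits>

// Expected number of stored entries for an nrow x ncol matrix at a given density.
std::int64_t expected_nnz(std::int64_t nrow, std::int64_t ncol, double density) {
    return static_cast<std::int64_t>(
        static_cast<double>(nrow) * static_cast<double>(ncol) * density);
}

// Bytes needed to hold that many triplets in a std::vector.
// ASSUMPTION: 16 bytes per triplet (two 32-bit indices + one double).
std::int64_t triplet_bytes(std::int64_t nnz, std::int64_t bytes_per_triplet = 16) {
    return nnz * bytes_per_triplet;
}

// Largest value a 32-bit signed index can hold: 2^31 - 1.
constexpr std::int64_t kMaxInt32 = std::numeric_limits<std::int32_t>::max();
```

With the numbers above, `expected_nnz(80000, 80000, 0.5)` is 3,200,000,000, which is larger than `kMaxInt32` (2,147,483,647), and under the 16-byte assumption the triplet list alone would need about 51.2 GB.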

eddelbuettel commented 5 months ago

We do not have a hint, and cannot do anything without a reproducible example.

Also keep in mind that R has limits on vector length. That does affect the Matrix package and its sparse matrix representation.

yanzhaobiomath commented 5 months ago

Sorry, the file of triplets is too large to upload here, so here is an example that generates a triplet list and runs into the same problem:

#include <RcppEigen.h>
#include <cstdlib>
#include <vector>
using namespace Rcpp;

// [[Rcpp::depends(RcppEigen)]]

//' @export
// [[Rcpp::export]]
Eigen::SparseMatrix<double> ComputeSNNasym(int n) {

    typedef Eigen::Triplet<double> Trip;
    std::vector<Trip> trp;

    long long idx = 0;           // total number of non-zeros (64-bit so the counter itself cannot overflow)
    Eigen::VectorXi xcol(n);     // number of non-zeros in each column

    double overlapping = 0.35;   // value stored in every non-zero entry

    int a = n / 3 * 2;           // band half-width: keep entries with |i - j| < a

    for (int i = 0; i < n; ++i) {        // iterate over columns

        int id = 0;
        for (int j = 0; j < n; ++j) {    // iterate over rows

            int k = i - j;

            if (std::abs(k) < a) {
                trp.push_back(Trip(j, i, overlapping));
                idx++;
                if (idx == 2147483647) {   // reached 2^31 - 1, the largest 32-bit int
                    Rcpp::Rcout << "overflowing..." << std::endl;
                }
                id++;
            }
        }
        xcol[i] = id;
    }
    Rcpp::Rcout << "overlapping is done..." << std::endl;

    double sp = static_cast<double>(idx) / n / n;
    Rcpp::Rcout << "number of non-zeros is " << idx << std::endl;
    Rcpp::Rcout << "sparsity is " << sp << std::endl;

    Eigen::SparseMatrix<double> res(n, n);
    Rcpp::Rcout << "initialization is done..." << std::endl;
    res.reserve(xcol);
    Rcpp::Rcout << "reservation is done..." << std::endl;

    res.setFromTriplets(trp.begin(), trp.end());   // the segfault is reported at this step
    Rcpp::Rcout << "sparse is done..." << std::endl;

    return res;
}

You should get the same error when you call this function from R with n = 80500.

I checked the number of non-zero elements in the matrix: it is more than 2^31, so it overflows a 32-bit integer.

But the segfault occurs in the setFromTriplets step, which runs entirely within the C++ code. I'm trying to figure out the reason. Is the RcppEigen library also limited to vector lengths below 2^31?
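The non-zero count for this reproducer can be checked without allocating any triplets. The following is a minimal sketch, assuming the same band rule as the code above (`|i - j| < a` with `a = n / 3 * 2`); `band_nnz` and `fits_int_index` are hypothetical helper names, not part of Eigen or RcppEigen.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Number of entries the reproducer would push for a given n, counted column
// by column in 64-bit arithmetic. Column i keeps rows j with |i - j| < a.
std::int64_t band_nnz(std::int64_t n) {
    const std::int64_t a = n / 3 * 2;
    std::int64_t total = 0;
    for (std::int64_t i = 0; i < n; ++i) {
        const std::int64_t lo = std::max<std::int64_t>(0, i - a + 1);      // first kept row
        const std::int64_t hi = std::min<std::int64_t>(n - 1, i + a - 1);  // last kept row
        total += std::max<std::int64_t>(0, hi - lo + 1);                   // empty when a == 0
    }
    return total;
}

// True if a count of nnz entries can be addressed with a 32-bit signed index.
bool fits_int_index(std::int64_t nnz) {
    return nnz <= std::numeric_limits<std::int32_t>::max();
}
```

For n = 80500 this gives 5,760,159,610 entries, well above 2^31 - 1 = 2,147,483,647, so `fits_int_index(band_nnz(80500))` is false. A check like this could run before any triplets are allocated, so the function can stop with an error message instead of crashing.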

eddelbuettel commented 5 months ago

I am not sure whether Eigen is limited, but I can assure you that R is. We are having the exact same issue in another project -- the problem is, as best I can tell, due to the <i,j,x> vectors in standard COO notation. When forming a sparseMatrix object with the Matrix package, the integer indexing of the vectors i, j and x is the constraint: the 2^31-1 you are aware of. A first guess would be that Eigen has the same problem.

So I am afraid I have no real fix to offer here.

eddelbuettel commented 5 months ago

I know the spam64 package on CRAN explicitly chooses a different (64-bit) index type to have larger vectors, but that of course is not integrated with Eigen.
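To illustrate why the index type bounds the number of stored entries, here is a small column-compressed container whose index type is a template parameter. It is a sketch only: `Entry`, `Csc` and `to_csc` are hypothetical names and do not reproduce the internals of Eigen, Matrix or spam64. The point is that the last column pointer equals the number of non-zeros, so that count must fit in the index type. Eigen's `SparseMatrix` likewise takes its index type as a third template parameter (`StorageIndex`, `int` by default); whether a wider type avoids this particular crash is not confirmed in this thread, and a result returned to R would still face the integer limits described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// One COO entry. Hypothetical type for illustration only.
struct Entry { std::int64_t row, col; double value; };

// Column-compressed storage with a configurable index type.
template <typename Index>
struct Csc {
    std::vector<Index> p;    // column pointers, size ncol + 1; p.back() == nnz
    std::vector<Index> i;    // row index of each stored value
    std::vector<double> x;   // stored values
};

// Convert COO entries to CSC. Assumes 0 <= row < nrow and 0 <= col < ncol.
// Duplicates are kept as separate entries (not summed).
template <typename Index>
Csc<Index> to_csc(const std::vector<Entry>& coo, std::int64_t nrow, std::int64_t ncol) {
    const std::uint64_t max_index =
        static_cast<std::uint64_t>(std::numeric_limits<Index>::max());
    if (coo.size() > max_index)
        throw std::overflow_error("number of non-zeros does not fit in the index type");
    if (nrow > 0 && static_cast<std::uint64_t>(nrow - 1) > max_index)
        throw std::overflow_error("row indices do not fit in the index type");

    Csc<Index> m;
    m.p.assign(static_cast<std::size_t>(ncol) + 1, Index(0));
    for (const Entry& e : coo)                       // count entries per column
        ++m.p[static_cast<std::size_t>(e.col) + 1];
    for (std::size_t c = 0; c < static_cast<std::size_t>(ncol); ++c)
        m.p[c + 1] += m.p[c];                        // prefix sum gives column starts

    m.i.resize(coo.size());
    m.x.resize(coo.size());
    std::vector<Index> next(m.p.begin(), m.p.end() - 1);  // next free slot per column
    for (const Entry& e : coo) {
        const Index dst = next[static_cast<std::size_t>(e.col)]++;
        m.i[static_cast<std::size_t>(dst)] = static_cast<Index>(e.row);
        m.x[static_cast<std::size_t>(dst)] = e.value;
    }
    return m;
}
```

Calling `to_csc<std::int32_t>` with more than 2^31 - 1 entries throws instead of writing past the addressable range, while `to_csc<std::int64_t>` accepts the same input given enough memory. The guard is the design choice worth copying: it fails loudly before any index arithmetic can wrap.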

yanzhaobiomath commented 5 months ago

I see, thanks for sharing your experience. I will try to find a way to work around this issue.