DrTimothyAldenDavis / SuiteSparse

The official SuiteSparse library: a suite of sparse matrix algorithms authored or co-authored by Tim Davis, Texas A&M University.
https://people.engr.tamu.edu/davis/suitesparse.html
Other
1.15k stars 259 forks source link

Out of Memory on SparseQR #776

Closed cTatu closed 7 months ago

cTatu commented 7 months ago

Describe the bug Hi, I installed SuiteSparse in our cluster to perform QR decomposition on big sparse matrices. I'm using the python wrapper PySPQR. For small matrices everything works fine but when I try with our matrices it's throwing out of memory ERROR. The node in which I performed the test has 90GB of memory and the OOM error was thrown when the process was using only 7GB.

To Reproduce

>>> import numpy
>>> import scipy
>>> import sparseqr

>>> M = scipy.sparse.rand( 10, 10, density = 0.1 )
>>> Q, R, E, rank = sparseqr.qr( M )
>>> Q
<10x10 sparse matrix of type '<class 'numpy.float64'>'
        with 12 stored elements in COOrdinate format>

>>> M = scipy.sparse.rand( 4_000_000, 16, density = 0.1 )
>>> Q, R, E, rank = sparseqr.qr( M, economy=True )
CHOLMOD error: out of memory. file: SuiteSparse-7.6.1/CHOLMOD/Utility/t_cholmod_malloc.c line: 59
Segmentation fault

Desktop (please complete the following information):

Additional context

This is the cmake command I used for configuring the project:

cmake -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DSUITESPARSE_USE_CUDA=OFF -DSUITESPARSE_ENABLE_PROJECTS="suitesparse_config;spqr" -DBLA_VENDOR="Intel10_64ilp" -DSUITESPARSE_USE_64BIT_BLAS=ON ..

here is the output cmake_output.txt

DrTimothyAldenDavis commented 7 months ago

You’re trying to factorize an Erdos Renyi random sparse matrix. At a certain threshold a random sparse matrix takes O(n^2) memory and O(n^3) time.

Still worse you are trying to Q in its matrix form, not as its compact Houselholder representation. Your resulting Q matrix will be n by n with about O(n^2) memory. That’s pretty big since n is 4 million. It’s impossibly large and the wrong thing to ask for. A householder representation would take no more the n times m memory with m being 16.

DrTimothyAldenDavis commented 7 months ago

So this isn't a bug. If you need Q, you should ask pySPQR to return it in its compact Householder form.

The reason you run out with just 7GB in use at the time is that I do two mallocs: one to hold all the nonzero values of Q (in double) and one to hold the pattern (int64 or int32). The malloc of the double values would probably be a single malloc of nn8 bytes, or about 128 petabytes.