SciML / LinearSolve.jl

LinearSolve.jl: High-Performance Unified Interface for Linear Solvers in Julia. Easily switch between factorization and Krylov methods and add preconditioners, all within one interface.
https://docs.sciml.ai/LinearSolve/stable/

Solving a system of linear equations with `LinearSolve.solve` uses up more memory when `A` is sparse than when `A` is dense #342

Closed · lampretl closed this issue 1 year ago

lampretl commented 1 year ago

In Julia 1.9.1, I ran the code below with LinearSolve.jl v2.3.0:

```julia
using SparseArrays, LinearSolve
n, s = 5*10^4, 20
A, b = sprand(n, n, s/n), sprand(n, s/n)

# solving after converting to a dense matrix
@time (x = Array(A) \ Array(b));   println(maximum(abs.(A*x - b)))
# 696.465488 seconds (10 allocations: 37.254 GiB, 0.00% gc time)
# 9.847483939395829e-12

# solving with the sparse matrix directly
@time (prb = init(LinearSolve.LinearProblem(A, b)); x = LinearSolve.solve(prb).u);   println(maximum(abs.(A*x - b)))
# 614.745483 seconds (449 allocations: 54.893 GiB, 0.00% gc time)
# 2.503693363742343e-10
```

As we can see, the timings are comparable, but the sparse version requires more memory than converting `A` to a dense matrix and computing with that, which makes the sparse method not particularly useful here. The whole point of sparse arrays is to save memory by not storing the abundant zeros.

I've mentioned this behaviour in https://github.com/SciML/LinearSolve.jl/issues/340 and opened a similar issue in https://github.com/JuliaSparse/SparseArrays.jl/issues/412
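For reference, as far as I can tell the default LinearSolve algorithm for a `SparseMatrixCSC` is a sparse LU (UMFPACK). The factorization can also be requested explicitly, which makes it easier to attribute the time and allocations to a specific code path. A minimal sketch, with smaller sizes assumed here so it runs quickly:

```julia
using SparseArrays, LinearSolve

# smaller, assumed sizes so the comparison runs in seconds
n, s = 10^4, 20
A, b = sprand(n, n, s/n), rand(n)
prob = LinearProblem(A, b)

@time solve(prob, UMFPACKFactorization())  # sparse LU via UMFPACK
@time solve(prob, KLUFactorization())      # alternative sparse LU via KLU
```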

rayegun commented 1 year ago

One thing to note in general is that `sprand` is pretty terrible for benchmarking all sparse linear-algebra code. Do you observe similar results with matrices from the real world / the SuiteSparse collection?

I suspect, but haven't verified, that `sprand` matrices generate massive fill-in under LU, something you'll likely see far less of in real matrices.

rayegun commented 1 year ago

For instance, I just factored the torso3 matrix from the SuiteSparse collection: `nnz(F.L) / length(F.L) == 0.005`. For a random matrix, I get a ratio of something like 0.2, or even 0.5 in one run. Fill-in at that level is much too high in my experience.
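A minimal sketch of that measurement, assuming a diagonal shift (my addition, not part of the benchmark above) so the random LU doesn't break down:

```julia
using SparseArrays, LinearAlgebra

n = 5_000
A = sprand(n, n, 20/n) + I    # assumed shift so the matrix is comfortably non-singular
F = lu(A)                     # sparse LU (UMFPACK)
@show nnz(F.L) / length(F.L)  # fill ratio of the L factor
```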

If your matrices do indeed exhibit these random patterns (or they are very large), you should perhaps check out an iterative method, where fill-in is a non-issue.
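With LinearSolve that is just a different solver argument. A sketch, assuming a diagonally shifted matrix so that GMRES converges without a preconditioner:

```julia
using SparseArrays, LinearSolve

n = 10^4
A = sprand(n, n, 20/n) + 10.0I  # assumed shift: diagonal dominance makes GMRES easy
b = rand(n)

sol = solve(LinearProblem(A, b), KrylovJL_GMRES())  # no factorization, so no fill-in
sol.u  # solution vector
```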

j-fu commented 1 year ago

Well, yes, `sprand` generates sparse matrices that are not typical of any kind of problem I know about.

For matrices from elliptic/parabolic PDE discretizations (this is my field), things very much depend on the space dimension and the resulting levels of fill-in.

I am working on some examples; I guess I'll make a blog post out of this.
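As a rough illustration of the kind of matrix I mean (a minimal sketch, using the standard 5-point 2D Laplacian built with `kron`):

```julia
using SparseArrays, LinearAlgebra

# 1D Dirichlet Laplacian and its 2D tensor-product version
lap1d(n) = spdiagm(-1 => fill(-1.0, n - 1), 0 => fill(2.0, n), 1 => fill(-1.0, n - 1))
lap2d(n) = kron(lap1d(n), sparse(I, n, n)) + kron(sparse(I, n, n), lap1d(n))

A = lap2d(100)                # 10_000 x 10_000, five nonzero diagonals
F = lu(A)
@show nnz(F.L) / length(F.L)  # modest fill-in in 2D; it grows markedly in 3D
```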

j-fu commented 1 year ago

See https://j-fu.github.io/marginalia/julia/scalingtest/

lampretl commented 1 year ago

@j-fu Thank you for your blog post and the comparison of the efficiency of different solvers! I'm a bit skeptical about the notion of "real" matrices, though. In science, sparse matrices can be close to diagonal or upper/lower triangular, and of course solving such systems causes much less fill-in. But not all of them are like that.

In my research in homological algebra and algebraic topology, when generating chain complexes of simplicial complexes to compute (co)homology, the matrices were always very sparse and usually avoided fill-in. But in some particular cases, e.g. the chessboard simplicial complex, when I tried to compute the rank of a matrix, there was so much fill-in that the computation didn't finish after a month.

I think what matters most is the type of matrices that industry actually needs. Solving such problems would bring more funding to Julia :). @Wimmerer Does the SuiteSparse matrix collection contain sparse matrices directly from industry? I can't tell which ones originated outside of academia.

rayegun commented 1 year ago

@drtimothyaldendavis can give more info about the matrix collection, but there are often descriptions attached to each matrix.

DrTimothyAldenDavis commented 1 year ago

For use in direct factorization methods, matrices from `sprand` are not truly sparse in the sense of Wilkinson: they provably take O(n^3) time and O(n^2) space under modest assumptions. See https://www.ams.org/journals/mcom/1974-28-125/S0025-5718-1974-0331756-7/ .

Essentially no matrices arising in practice have this characteristic, even those that do not have a small bandwidth. There could be exceptions to this rule, like the matrices you're seeing from the chessboard simplicial complex.

So "sprand" is not a good way to test any sparse direct method.

Many of the matrices (perhaps most) in my collection come from industry or from practical problems in academia. Very few of them are fabricated (exceptions include this set: https://sparse.tamu.edu/Gset, but it is marked as such).

lampretl commented 1 year ago

Thank you, I appreciate your insight! @DrTimothyAldenDavis