JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.68k stars 5.48k forks source link

Sparse hvcat with adjoints is very slow (orders of magnitude) #38209

Closed eschnett closed 3 years ago

eschnett commented 4 years ago

I find that hvcat with sparse matrices is orders of magnitude slower than expected when adjoints of sparse matrices are involved. Here is an example:

julia> using LinearAlgebra

julia> using SparseArrays

julia> n0 = 100000;

julia> n1 = 200000;

julia> N0 = spzeros(Bool, n0, n0);

julia> d = Int8.(sprand(Bool, n1, n0, 1/(n0*n1)));

julia> # δ = Float64.(sprand(Bool, n0, n1, 1/(n0*n1)));
       δ = Float64.(d)';

julia> E1 = Diagonal(ones(Bool, n1));

julia> @time [N0 δ; d -E1];
 82.412894 seconds (176 allocations: 28.276 MiB)

When I replace the definition of δ by the one that does not involve an adjoint, the calculation is much faster:

julia> δ = Float64.(sprand(Bool, n0, n1, 1/(n0*n1)));

julia> @time [N0 δ; d -E1];
  0.068553 seconds (2.40 M allocations: 193.794 MiB)

I aborted one such very slow calculation, and found that typed_hvcat calls setindex!. If this is indeed the case, then this explains why this is very slow. Sparse matrices need to be constructed by calling sparse for efficiency.

eschnett commented 4 years ago

Another way to avoid this slowdown is to replace -E1 by sparse(-E1).

Maybe the type signature of the sparse array hvcat method isn't generic enough?

eschnett commented 3 years ago

The problem is a missing constructor for SparseMatrixCSC{Float}(::Diagonal{Int}), which then treats the diagonal matrix as AbstractMatrix, and loops over all its indices checking for zeros. I'm working on a pull request.