Closed. lruthotto closed this issue 9 years ago.
This optimization would be too special case to put in by itself, but perhaps we can do something slightly more general to improve the performance in such a case.
Cc @tanmaykm
It's a little slower than your version that modifies the data fields directly, but I like blkdiag(spzeros(N,0), A) for this kind of thing (part of why adding blkdiag was just about the first thing I did with Julia).
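For reference, a minimal sketch of the blkdiag trick next to plain vcat, using a small random test matrix. (In Julia 1.0+ the function was renamed and lives in SparseArrays as blockdiag.)

```julia
using SparseArrays

N = 1_000
A = sprandn(N, N, 1e-3)
Z = spzeros(N, N)

# Prepend N empty rows: an N×0 block on the diagonal shifts A down by N rows.
# (blkdiag was renamed blockdiag and moved to SparseArrays in Julia 1.0.)
B1 = blockdiag(spzeros(N, 0), A)
B2 = [Z; A]   # same result via vertical concatenation

@assert size(B1) == (2N, N)
@assert B1 == B2
```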
What's even more worrying is the fact that == on sparse matrices is going to the generic implementation in abstractarray.jl that does elementwise getindex. Something like a short-circuiting version of sparse subtraction would be much faster.
Good point, we should fix ==.
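A hedged sketch of what a structural == for CSC matrices could look like; sparse_eq is a hypothetical helper, not the fix that actually landed. After canonicalizing explicitly stored zeros, two CSC matrices are equal iff their shapes and internal arrays agree, and the array comparisons short-circuit at the first mismatch with no getindex calls.

```julia
using SparseArrays

# Hypothetical structural equality for CSC; illustrative, not Base's code.
function sparse_eq(A::SparseMatrixCSC, B::SparseMatrixCSC)
    size(A) == size(B) || return false
    A, B = dropzeros(A), dropzeros(B)  # canonicalize explicitly stored zeros
    # Vector == short-circuits at the first differing entry.
    return A.colptr == B.colptr && A.rowval == B.rowval && A.nzval == B.nzval
end
```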
Indeed, blkdiag seems to be more efficient. I will have a look at its implementation to learn how it differs from vcat.
Talking about comparison: element-wise operators such as .==, .<=, and .>= also use the generic implementations in abstractarray.jl and return dense matrices.
We should probably open a separate issue for all the operators that need better implementations. Probably just a matter of adding operators to the right code generation macro.
Dot-wise comparison operations between sparse matrices and scalars will probably have to return dense matrices anyway, but we need efficient implementations rather than the abstractarray.jl fallbacks.
Generally, elementwise comparisons that are true for equal arguments should return dense; keep in mind the implicit zeros, for which such comparisons always hold. (Or, as @StefanKarpinski would say, they would be a good use case for sparse-with-nonzero-default.)
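To see the point concretely, assuming a small random matrix: a comparison like .<= 1 is true at every implicit zero, so almost every entry of the result must be materialized regardless of how it is stored (a quick check, not tied to any particular implementation):

```julia
using SparseArrays

A = sprandn(100, 100, 0.01)
M = A .<= 1   # true wherever A has an implicit zero, since 0 <= 1

# At least every implicitly-zero position is true, so the result is
# dense in content even if it lives in a sparse container.
@assert count(M) >= 100 * 100 - nnz(A)
```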
blkdiag is almost identical to hcat. Since we're using compressed sparse column storage by default (there may soon be alternatives), it's very simple to concatenate column-wise but difficult to concatenate row-wise. It might be worth looking into improvements in vcat when a large number of successive columns of one input or the other are empty. Non-copying slices should also help significantly; profiling shows most of the time is spent on these 2 lines.
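A sketch of why column-wise concatenation is nearly free in CSC, assuming the standard colptr/rowval/nzval fields (csc_hcat is an illustrative helper, not the Base implementation): the second matrix's column pointers are offset by the first matrix's nnz, and the index and value arrays are simply appended.

```julia
using SparseArrays

# Illustrative CSC hcat; Base's version does roughly this plus type promotion.
function csc_hcat(A::SparseMatrixCSC, B::SparseMatrixCSC)
    size(A, 1) == size(B, 1) || throw(DimensionMismatch("row counts differ"))
    colptr = vcat(A.colptr, B.colptr[2:end] .+ nnz(A))  # shift B's pointers
    rowval = vcat(A.rowval, B.rowval)                   # row indices unchanged
    nzval  = vcat(A.nzval, B.nzval)
    return SparseMatrixCSC(size(A, 1), size(A, 2) + size(B, 2),
                           colptr, rowval, nzval)
end
```

Row-wise concatenation has no such shortcut: every stored entry's row index changes, and entries from both inputs must be interleaved within each column.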
+1 for a meta-issue that has a checklist for all operations you'd reasonably want to do on sparse matrices so we can get them all.
Especially if/when we add additional sparse types (CSR, COO, block versions thereof?), we'll start wanting convenient & efficient operations between different types of sparse matrices. Doing it all cleanly might require some higher-level pondering of how best to design the system to avoid a haphazard explosion in the number of different operations we need to write. (The same goes for the whole family of dense array types too, tbh; not saying I have a solution, but it's something to start thinking about in case anybody else does.)
Indeed. I strongly feel that work on those new sparse matrix types should start in a package before getting PRed against julia; it's going to take some tire-kicking.
I think we should seriously consider putting all sparse support in a package so that it can get the focused attention that it deserves (and needs).
The same could be said of Diagonal, Bidiagonal, Tridiagonal, Triangular, Banded, Symmetric, Hermitian, etc. Once default (and precompiled) packages exist, sure it should move out of Base along with FFTW, possibly GMP and MPFR, maybe even PCRE if that would be remotely possible.
What Base will eventually need to provide is a higher-level framework for making it easier to create new array types and have them interoperate with one another, in a nicer more performant way than falling back to naive generic operations or writing a combinatorial number of specialized routines. That will be hard, and is yet to be designed much less implemented.
Putting all sparse support in a package would lead to a huge amount of deprecation. It already gets the attention it needs. Also @tkelman is right that if we want to do such a thing, we should do it for many other things. New sparse formats should start out as packages, but perhaps not CSR, as that needs serious code sharing with CSC. However, packages get far less visibility than PRs to base, so serious tire kicking only happens after something is on master.
Let's move much of this discussion that is not related to this issue to the mailing list or separate issues. It is guaranteed to get lost here.
Putting even the most basic support for sparse matrices out in a package is going to be a big headache for virtually every package in JuliaOpt for example. JuMP etc already take long enough to load that it hurts the Julia pitch to people in my field, for whom those packages are the unique selling point of Julia (many kudos deserved to @mlubin @iainnz etc for making this the case).
Get package precompilation working and solid first, then worry about shrinking base when it'll be less painful to do so.
(with apologies to Viral - the closest thing to package precompilation issue would be what, #4373 ?)
+1 for first focusing on having packages load much faster then worrying about separating things from Base.
While I agree that it would be great to first have fast package loading and then shrink Base, these things are less coupled than one might think. One simple technical solution is to pull the default packages during make and precompile the code into the system image. The point here really is to make things more modular. If the sparse matrix functionality is developed in Base, nobody can simply test it without compiling Julia from source. Further, within a package one does not have to add deprecations, as users could simply rely on an older version if they do not want to update to a new interface.
Turns out this has more to do with type instability than special casing. Submitting a PR soon.
With the PR above, we are now significantly faster and more memory-efficient.
julia> N = 1000000;
julia> A = sprandn(N,N,1e-5);
julia> Z = spzeros(N,N);
julia> @time B = [Z;A];
elapsed time: 0.15032013 seconds (160 MB allocated, 11.98% gc time in 1 pauses with 1 full sweep)
Amazing! Thanks @ViralBShah!
I repeated the above example on an equivalent system and got an incredible (around 10x) speedup. Hope this PR gets merged soon.
elapsed time: 0.104847899 seconds (160 MB allocated, 34.00% gc time in 2 pauses with 1 full sweep)
This is merged.
I found a performance issue when vertically concatenating sparse matrices. Here is a minimal example:
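The original snippet is not preserved in this copy of the thread; a reconstruction from the benchmark quoted above, with a smaller N so it runs quickly:

```julia
using SparseArrays

N = 10^5
A = sprandn(N, N, 1e-5)
Z = spzeros(N, N)

# Vertical concatenation with an all-zero block: this was the slow path.
B = [Z; A]

@assert size(B) == (2N, N)
@assert nnz(B) == nnz(A)
```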
This should actually be a trivial operation. Note that it is equivalent to:
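The equivalent code is also missing from this copy of the thread; prepending N zero rows amounts to shifting the row indices of A's stored entries by N, which can be sketched as follows (a reconstruction, not necessarily the poster's original code):

```julia
using SparseArrays

N = 10^4
A = sprandn(N, N, 1e-4)

# Prepending N zero rows only shifts every stored row index down by N;
# the column pointers and values are untouched.
B = SparseMatrixCSC(2N, N, copy(A.colptr), A.rowval .+ N, copy(A.nzval))

@assert B == [spzeros(N, N); A]
```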
Does anyone have an elegant fix for this? I think a speedup here would be quite interesting for many people working with numerical PDEs etc.