Closed tkelman closed 5 years ago
Some thoughts:
Diagonal
or SymTridiagonal
- and "sparse matrices" which have more complicated/general sparsity patterns.Diagonal
, Bidiagonal
, SymTridiagonal
, Tridiagonal
, etc., which can be considered special cases of banded matrices, which we currently don't represent in Julia. (LAPACK type GB
for general banded; also the more exotic types PB
for symmetric positive definite banded and TB
for triangular banded.)The second broad category of structured matrices encapsulate matrices with known global symmetries like Symmetric
or Hermitian
. These special matrix types are really thin wrappers around dense matrices; the types are used primarily for dispatch rather than to really encapsulate a different kind of storage layout from Array{T,2}
. This category of special matrices also have multiple representations; in particular, LAPACK distinguishes between conventional storage and packed storage formats:
special matrix | conventional | packed |
---|---|---|
complex symmetric | SY |
SP |
symmetric/Hermitian positive definite | PO |
PP |
symmetric/Hermitian indefinite | SY /HE |
SP /HP |
triangular | TR |
TP |
In theory, packed matrices offer more compact memory usage. It's not clear, however, that this always translates to more performant routines over packed matrix formats relative to their conventional dense storage counterparts.
SB
symmetric banded, where LAPACK has only eigenproblems, no factorizations at all - Linda Kaufman has a relatively-recently-published (2007) algorithm for doing exactly this. This is the kind of active research that Julia can be faster about implementing than LAPACK can. And there's no reason to only talk about dense here, all of these structural properties apply equally in the sparse case.symmetry is not a dense-only property in any way. Let's not put it into Julia as if it is.
True. @andreasnoackjensen just merged #7992, which makes the symmetric matrix types parametric on other matrix types.
I didn't mean to imply that sparse matrix support is second-class or unimportant. It's just a messier problem and I don't know a good strategy going forward.
I've been thinking about this recently in terms of matrix multiplication (particularly for #6837): the key problem here is the quadratic explosion of methods (every additional type has to implement multiplication with every other type). Is there a generic way we could implement matrix multiplication so that only O(1) additional functions need to be defined?
At their core, all the multiplication methods basically differ in the way by which they iterate over pairs of elements from each matrix, (column-wise, row-wise, block-wise, etc.). Is there a way we could define these various ways to iterate for each matrix type, along with an associated "cost" (i.e. CSC is cheap by column, very expensive by row), and then the multiplication could just choose the most efficient iteration scheme?
@tkelman Thanks for opening this issue. Much of the development of the linear algebra functionality has been organic and hasn't followed a master plan. When Symmetric
was introduced, it was a way to dispatch to the symmetric eigen solver directly and only later on the broader usefulness of a Symmetric
type was recognized which eventually made the type develop into the version merged this morning where it is parametrized on the matrix type. I am working toward making the linear algebra more generic such that the types don't require a specific type. Now Triangular
and Symmetric/Hermitian
allow all kinds of AbstractMatrix
types and I plan to extend this to the other structure types such that you can have e.g. a Diagonal
with matrix elements as a way of storing a block diagonal matrix. Hopefully this is in line with some of what you are asking fore.
It is a problem that the potential number of methods grows so fast in the number of types. I don't think it is feasible to support all possible combinations. Right now some of the types try to cover more broadly by promoting explicitly to dense matrices. It is possible that this could be done in a more automatic way, but I haven't figured out how and actually I'm not sure if we really want it, because a no method error might be better than a huge sparse matrix getting promoted to a dense matrix implicitly. At least temporarily, we might want to force the user to be explicit about the order of evaluation of expression involving special matrices, e.g. something like A*(B*x)
to avoid the matrix-matrix multiplication between two very complicated matrices.
You are more experienced with sparse matrices than I am so please continue to share you thoughts on these issues. There is still plenty work to do.
I definitely appreciate Symmetric/Hermitian
and Triangular
being templated on the matrix type. I must admit that when I first saw the type of matrix as one of the template parameters I was a bit perplexed as to why this was done. I didn't think of sparse matrices although in retrospect it is a natural direction to extend these.
When solving sparse systems knowing that the system matrix is triangular is a big win and knowing that it is symmetric/Hermitian helps you choose the type of decomposition and representation. Some of the recent code from the Watson Sparse Matrix Package (WSMP) or HSL (what used to be called the Harwell Subroutine Library) allow for a Bunch-Kaufman decomposition in addition to a Cholesky decomposition. That is, it is not necessary to have a positive definite sparse symmetric matrix in order to take advantage of symmetry in solving sparse systems.
It may be necessary to enforce the underlying structure in a symmetric/Hermitian or Triangular sparse system so that unnecessary copying can be avoided. For example all the BLAS/LAPACK routines that apply to (non-packed) triangular matrices ignore the contents of the strict triangle not indicated by the uplo flag. For sparse matrices it may be necessary to ensure that the data
member of the type is a lower/upper triangular matrix or to pay the price of copying through tril or triu. This would mean creating methods for arithmetic, etc. applied to symmetric matrices stored in only one triangle. WSMP allows two formats for symmetric sparse matrices, as a triangle stored in CSC format or as an MSC format in which the diagonal is stored separately as a dense vector and only the off-diagonals from one triangle are stored in CSC format.
imply that sparse matrix support is second-class or unimportant
No, you didn't do that, the current state of Julia's standard library does that by itself :/
@simonbyrne I was thinking of something similar this morning, isrowiterable()
or iscolumniterable()
or isdiagiterable()
etc. Transpose as view in the dense case, or as reinterpreting the same data between CSR vs CSC in the sparse case, should help start towards cleaning up the Ax_mul_Bx[!]
system. It's not just matrix multiplication though, there are all the reductions and various other unary, binary, logical operators to worry about too.
Right now some of the types try to cover more broadly by promoting explicitly to dense matrices. It is possible that this could be done in a more automatic way, but I haven't figured out how and actually I'm not sure if we really want it, because a no method error might be better than a huge sparse matrix getting promoted to a dense matrix implicitly.
Right, I think the generics in base are not sufficiently generic for this reason. I shouldn't have to wait 2 minutes for mean(speye(10^5), 1)
, and min(speye(10^5), 1)
shouldn't give a memory error. Some of this can be solved with more code, but that only solves one missing method at a time for one sparse type at a time. Are there ways we can make the generic implementations smarter (hopefully without hurting dense performance), working over iterators of indexes perhaps? You've probably heard the Mike Vanier quote "If I had a nickel for every time I've written for (i = 0; i < N; i++)
in C"? I'm starting to feel the same way about for i = 1:A.n, j = A.colptr[i]:A.colptr[i+1]-1
in Julia.
it is not necessary to have a positive definite sparse symmetric matrix in order to take advantage of symmetry in solving sparse systems
In fact there are a number of applications in optimization where it is an absolute requirement to preserve symmetry with an indefinite matrix, in order to obtain the inertia of the matrix (number of positive, negative, and zero eigenvalues).
I just ran into one of these limitations today. If A
and B
are SymTridiagonal
, with B
positive-definite, I wanted to use eigvals(A,B)
to solve the generalized eigenproblem. There was no method. Then I tried eigs
, and it failed for lack of an isposdef!
method. Then I tried converting to sparse(A)
as a fallback, but there is no such method. Finally, I gave up and just used full(A)
. Quite a frustrating experience.
At the very least, we should support sparse(A)
for all of the special sparse (banded) formats.
[It would also be good to have them inherit from something like Banded
(ala @jiahao's cleanup proposal) so that things like \
could convert them all to sparse
as a fallback rather than operating on them as if they were dense matrices.]
@ivanslapnicar just pointed out that A*B*C
fails when all 3 are Tridiagonal
, since this evaluates to *((A*B)::Matrix, C::Tridiagonal)
which is not defined. In contrast A*B
works fine and returns a dense Matrix
. This is surprising behavior.
Update: Even worse, A*(B*C)
works just fine, because we define an A_mul_B!
method that handles *(::Tridiagonal, ::Matrix)
. So *
is not associative. Epic fail.
We don't have a type for quin(t?)diagonal matrices, so Tridiagonal*Tridiagonal returning dense doesn't surprise me. But it would probably surprise someone who wanted Julia to do things quickly for that operation (without specific knowledge of the internals).
I did propose support for arbitrary banded matrices in #8240, so the specific issue I just mentioned can be addressed as part of the broader cleanup. However it does speak to the combinatorial growth problem.
So, now that we have a parameterized Symmetric type, it's very quickly clear that Symmetric{SparseMatrixCSC}
has frustratingly few methods defined.
Leaving a breadcrumb to a bit of a discussion here https://github.com/JuliaLang/julia/pull/10902#issuecomment-94307668
I'm thinking that some of the CartesianIndex
/ eachindex
infrastructure could maybe be adapted and specialized for sparse and other structured matrix types. Could be worth some experimentation and prototyping.
These discussions have generally moved to discourse, and I am closing this issue since there isn't anything directly actionable here.
At present there are a variety of special structured matrix types - diagonal, bidiagonal, tridiagonal, symtridiagonal, banded, triangular, hessenberg, symmetric, hermitian, sparse CSC, and so on - in base Julia. The completeness and performance of these implementations varies widely, since they've been implemented piecemeal and as needed by @viralbshah, @andreasnoack, @jiahao, and many others.
Managing this diversity in an effective way is a hard problem, there are only a handful of well-engineered systems in this area. I hope Julia can improve here -
There's a lot to this so it's probably beyond the scope of 0.4. Many of these specialized array types do not need to be in base long-term, but they should be default and ideally precompiled packages. Moving things out to packages too early, especially before the right frameworks are in place to support good interoperability, could in the worst case lead to fragmentation and lower visibility. I also very badly want to narrow the divide between dense linear algebra and sparse linear algebra here, as the latter is a bit of a bastard stepchild in the current system.
I don't have any concrete proposals here yet, but I think we should open a discussion. The challenges are very different, but I look at how the JuliaOpt team has designed MathProgBase for example, the hourglass architecture has done an amazing job of tying together a diverse set of solvers to multiple modeling environments. Can we similarly identify the narrow waists and common protocols for structured linear algebra?
The near-future implementation of CSR sparse (for which there's already an initial PR, #7029) and arbitrary-dimensional COO sparse might make a good proving ground to start coming up with ideas here.
TL;DR: we have lots of array types, how do we make them work together better?