dimforge / nalgebra

Linear algebra library for Rust.
https://nalgebra.org
Apache License 2.0

Can we simplify the sparse matrix design? #592

Closed Andlon closed 2 years ago

Andlon commented 5 years ago

I'd like to implement support for CSR matrices (because they parallelize much better than CSC), and of course these should follow the same design as the current CsMatrix and friends. However, upon reviewing the current design, I find it perhaps too generic, and I'd like to discuss possibilities for simplification. This is not just for aesthetic reasons: I started to write up a separate issue, but then realized that the current design limits, or at least severely complicates, the changes I wanted to propose there. The current CSC matrix type looks like this:

```rust
pub struct CsMatrix<
    N: Scalar,
    R: Dim = Dynamic,
    C: Dim = Dynamic,
    S: CsStorage<N, R, C> = CsVecStorage<N, R, C>,
> {
    pub(crate) data: S,
    _phantoms: PhantomData<(N, R, C)>,
}
```

with CsVecStorage<N, R, C>:

```rust
pub struct CsVecStorage<N: Scalar, R: Dim, C: Dim>
where
    DefaultAllocator: Allocator<usize, C>,
{
    pub(crate) shape: (R, C),
    pub(crate) p: VectorN<usize, C>,
    pub(crate) i: Vec<usize>,
    pub(crate) vals: Vec<N>,
}
```

To summarize, the following parameters are currently generic for sparse matrices:

- the scalar type N,
- the row and column dimensions R and C,
- the storage type S.

My understanding is that the rows/cols are abstracted over so that we can treat a sparse vector as a special case of CsMatrix. This is of course nice, but I am not entirely sure that this alone is worth the generic complexity. Beyond that, I'd like to claim that sparse matrices do not otherwise benefit from compile-time dimensions: in my experience, dense matrices usually outperform sparse matrices at any dimension small enough for compile-time sizes to matter. Moreover, the use cases for very small sparse matrices are few in any case (with the possible exception of diagonal matrices, which would deserve their own specialized storage anyway).

As for storage, I am not quite sure what the motivation is for abstracting over the internal storage. One could perhaps envision having "views" like for the dense matrices, but views are of very limited utility for CSC matrices (they only really make sense along one axis), and the use cases that would benefit from them seem very niche to me.

Moreover, the current design does not abstract over something that it really should, namely the index type. Currently it enforces usize, but allowing users to choose e.g. u32 (which is enough for the vast majority of applications) can bring a significant speedup due to higher memory throughput.
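To make the index-type idea concrete, here is a minimal sketch. The trait name `IndexType` and its methods are hypothetical illustrations, not existing nalgebra API; the point is simply that u32 indices halve the memory traffic of the index arrays on a 64-bit target:

```rust
// Hypothetical abstraction over the sparse index type; names are
// illustrative only, not actual nalgebra API.
pub trait IndexType: Copy {
    fn as_usize(self) -> usize;
    fn from_usize(i: usize) -> Self;
}

impl IndexType for usize {
    fn as_usize(self) -> usize { self }
    fn from_usize(i: usize) -> Self { i }
}

impl IndexType for u32 {
    fn as_usize(self) -> usize { self as usize }
    fn from_usize(i: usize) -> Self { i as u32 }
}

fn main() {
    // On a 64-bit target, a u32 index array occupies half the memory of
    // a usize array, which translates into higher effective throughput.
    let as_usize: Vec<usize> = (0..1000).collect();
    let as_u32: Vec<u32> = (0..1000).collect();
    assert_eq!(
        std::mem::size_of_val(&as_usize[..]),
        2 * std::mem::size_of_val(&as_u32[..])
    );
}
```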

Now, there are probably additional reasons for the current design, but I think it would be nice to discuss those.

I'd like to propose a simpler, less generic design, along the lines of (pseudo-code-ish):

```rust
pub struct CsMatrix<N: Scalar, I: IndexType> {
    pattern: Cow<SparsityPattern<I>>,
    values: Vec<N>,
}

pub struct SparsityPattern<I: IndexType> {
    // Using the same names for variables as in the original design
    pub(crate) p: Vec<I>,
    pub(crate) i: Vec<I>,
}
```

A separate struct would be needed for sparse vectors. Here I've introduced the notion of a SparsityPattern, which contains only the information about the location of the nonzeros. This split is natural, as the sparsity pattern itself can be viewed as a graph, and indeed many graph theoretical operations can be performed directly on the sparsity pattern without the need for the values. Moreover, the exact same SparsityPattern struct can be used for CSR matrices (with p representing rows instead of columns and so on).
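The format-agnostic role of the pattern can be sketched in a few lines. This is a toy version of the proposed struct (not existing nalgebra API, and with the index type fixed to usize for brevity): the same offset/index arrays describe either a CSC or a CSR matrix, depending only on whether the major axis is taken to mean columns or rows:

```rust
// Toy sketch of the proposed SparsityPattern; `p` holds major-axis
// offsets, `i` the minor-axis indices, exactly as in the proposal.
struct SparsityPattern {
    p: Vec<usize>, // offsets: p[j]..p[j + 1] delimits major lane j
    i: Vec<usize>, // minor-axis indices within each lane
}

impl SparsityPattern {
    /// Minor-axis indices of the j-th major lane: a column for CSC,
    /// a row for CSR. The storage itself does not care which.
    fn lane(&self, j: usize) -> &[usize] {
        &self.i[self.p[j]..self.p[j + 1]]
    }
}

fn main() {
    // Pattern of a 3x3 matrix with nonzeros at (0,0), (2,0), (1,1), (0,2),
    // read as CSC: column 0 holds rows [0, 2], column 1 holds row [1], ...
    let pattern = SparsityPattern {
        p: vec![0, 2, 3, 4],
        i: vec![0, 2, 1, 0],
    };
    assert_eq!(pattern.lane(0), &[0, 2]);
    assert_eq!(pattern.lane(1), &[1]);
    // Read as CSR instead, the very same data describes the transpose:
    // row 0 holds columns [0, 2], row 1 holds column [1], and so on.
}
```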

The reason for the Cow is the observation that many problems require multiple matrices with the same sparsity pattern. A typical example is finite element simulations, in which the mass matrix, the stiffness matrix, and indeed any linear combination of the two (as is often used in time integrators) all have the same sparsity pattern. With the above design, you only need to store the pattern once. I also expect the overhead of Cow to be essentially zero in any practical application, as sparse matrices are almost invariably used in "bulk operations", such as sparse matrix-vector products, where the pattern can be borrowed once and reused throughout. Individual entry lookup is in any case somewhat slow (especially for dense columns), and I think the overhead of Cow here is likely lower than that of the lookup itself. See the deal.II FEM library for a somewhat similar design.

Sorry if my tone seems overly critical: I really like nalgebra and fully support its development, and I'd only like to see it become better!

Andlon commented 5 years ago

In hindsight, I had certainly not thought the Cow proposal through: of course, the Cow type needs a lifetime parameter, which would at best pollute the matrix type with a lifetime parameter of its own, and that we certainly do not want.

In any case, it is possible to build a similar solution without Cow and a lifetime parameter, but perhaps for simplicity we could just live with redundant clones of SparsityPattern for now.

Andlon commented 5 years ago

I just revisited this topic, and of course simply an Arc<SparsityPattern> would do. Forget about the Cow nonsense up above.
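The Arc variant of the shared-pattern idea can be sketched as follows. The struct definitions are toy stand-ins for the proposal above (not nalgebra API); the point is that the mass and stiffness matrices from the FEM example share one pattern allocation:

```rust
use std::sync::Arc;

// Toy stand-ins for the proposed types; field names follow the proposal.
#[allow(dead_code)]
struct SparsityPattern {
    p: Vec<usize>,
    i: Vec<usize>,
}

#[allow(dead_code)]
struct CsMatrix {
    pattern: Arc<SparsityPattern>,
    values: Vec<f64>,
}

fn main() {
    // Mass and stiffness matrices sharing one sparsity pattern:
    let pattern = Arc::new(SparsityPattern { p: vec![0, 1, 2], i: vec![0, 1] });
    let mass = CsMatrix { pattern: Arc::clone(&pattern), values: vec![1.0, 1.0] };
    let stiffness = CsMatrix { pattern: Arc::clone(&pattern), values: vec![2.0, 3.0] };
    // Both matrices point at the same allocation; the pattern is stored once,
    // and no lifetime parameter appears on the matrix type.
    assert!(Arc::ptr_eq(&mass.pattern, &stiffness.pattern));
}
```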

sebcrozet commented 5 years ago

Thanks for those suggestions! The current choice of type parameters is mostly meant to mirror that of dense matrices, though you make valid points regarding its flaws. Overall, I agree with all the remarks you make:

In both cases, the goal of having generics is to avoid duplicated code. For example if we have separate types for vectors, matrices, and views (say, CsVector, CsMatrix, CsMatrixView), then all the combinations of sum, multiplication, etc. will have to be implemented manually.
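The duplication concern can be illustrated with a tiny sketch. All names here are hypothetical (this is not nalgebra's actual CsStorage trait): one generic implementation over a shared storage trait serves both an owned type and a view, whereas separate concrete types would each need their own impl of every operation:

```rust
// Hypothetical sketch: a shared trait lets a single generic function
// (or operator impl) cover owned matrices and views alike.
trait CsStorageLike {
    fn values(&self) -> &[f64];
}

struct CsOwned {
    values: Vec<f64>,
}

struct CsView<'a> {
    values: &'a [f64],
}

impl CsStorageLike for CsOwned {
    fn values(&self) -> &[f64] { &self.values }
}

impl<'a> CsStorageLike for CsView<'a> {
    fn values(&self) -> &[f64] { self.values }
}

// One implementation instead of one per concrete type:
fn sum_values<S: CsStorageLike>(m: &S) -> f64 {
    m.values().iter().sum()
}

fn main() {
    let owned = CsOwned { values: vec![1.0, 2.0] };
    let view = CsView { values: &[3.0, 4.0] };
    assert_eq!(sum_values(&owned), 3.0);
    assert_eq!(sum_values(&view), 7.0);
}
```

Dropping views and dedicated sparse vectors, as discussed above, removes the need for this layer of genericity in the first place.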

So overall my only concern is code duplication among the various sparse entities, but perhaps there is no practical value in even supporting sparse matrix views or a dedicated sparse vector type. It appears that Eigen (which inspired several of nalgebra's design choices) made choices very close to what you propose here, so I suggest we go for it. It is also good to start with the simpler types you suggest; then we can iterate and see whether more type parameters are needed, depending on the issues you encounter or on feature requests.

Regarding the Cow/Arc comments, yeah, Arc is the right way of doing copy-on-write (using Arc::make_mut) without explicit lifetimes.
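For readers unfamiliar with it, Arc::make_mut gives exactly this copy-on-write behaviour. A minimal demonstration on a plain Vec (standing in for a SparsityPattern):

```rust
use std::sync::Arc;

fn main() {
    let shared = Arc::new(vec![0usize, 2, 3]);
    let mut other = Arc::clone(&shared);

    // While the data is shared, mutating through `make_mut` first clones
    // the underlying allocation, leaving the other handle untouched...
    Arc::make_mut(&mut other).push(5);
    assert_eq!(&*shared, &[0, 2, 3]);
    assert_eq!(&*other, &[0, 2, 3, 5]);

    // ...but once a handle is unique, `make_mut` mutates in place, so
    // copy-on-write costs nothing for exclusively owned matrices.
    drop(shared);
    Arc::make_mut(&mut other).push(7);
    assert_eq!(&*other, &[0, 2, 3, 5, 7]);
}
```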

Andlon commented 5 years ago

Thanks for the comments, @sebcrozet! I will try to get started on a suggested redesign in the coming weeks. Once I have something we can iterate a bit to see in which direction we want to take it.

Unrelated to this issue, I also started work on porting my assert_matrix_eq from rulinalg to a separate crate, which I thought we could re-use in nalgebra for convenient comparison functionality when writing tests and the like. I'll probably try to whip this up before starting on the sparse rework.

XVilka commented 4 years ago

Sorry for the distraction, but was there any progress towards implementing them?

Andlon commented 4 years ago

Hi @XVilka: not in any publicly available sense, unfortunately. I'm working on an internal project in which I have bits and pieces of this new sparse matrix design, but I won't have the time to move it into a separate contribution for nalgebra before February at the earliest, as we have a deadline in late January that is currently taking all of my time.

If anyone else is interested in contributing, I could perhaps provide limited mentoring, but my time will become increasingly scarce as we near our deadline.

Andlon commented 4 years ago

Update: I am actively working on porting, expanding and documenting our internal sparse matrix data structures to nalgebra. There's quite some work to be done, however, so it will take me some time.

My intention is to have complete and ergonomic implementations of COO, CSR and CSC matrices, with basic implementations of data access functionality, algebraic operations (with some restrictions) and conversion between formats. Then hopefully I'll find the time for a subsequent PR with performance improvements on top.

Andlon commented 4 years ago

Another update: I was unfortunately once again delayed by unforeseen events. However, I'm back on track and am making steady progress. Interested users can follow my progress in my sparse-rework branch (note that it might not always be 100% up to date with my local progress, but it should be close).

My plan for the initial contribution is to have a fairly polished and tidy API for working with COO, CSR and CSC matrices. This entails:

In later contributions, I hope to extend the above functionality with:

However, I hope that after the initial contribution, others might also be interested in extending the API and improving performance (I'd be happy to mentor such contributions).

I also have some code and API design for iterative Krylov solvers that I want to contribute further down the road.

Andlon commented 2 years ago

Closing this in view of #823 and the arrival of nalgebra-sparse.