Sparse matrix creation routines treat zeros differently. Unify behavior?

cfunky commented 2 years ago

I noticed that CooMatrix::push_matrix does not threshold or detect zero elements in the input. Thus the resulting sparse matrix is more dense than it needs to be. By comparison, constructing a matrix using, e.g., CscMatrix::from, does detect entries that are exactly zero and does not include them in the resulting sparse matrix. If there isn't a rationale for the difference in behavior, may I suggest to make all construction methods detect and discard zero entries? This would incur overhead during construction but could speed up subsequent (particularly repeated) computations performed with the resulting sparse matrix. What are your thoughts on this?

Andlon commented 2 years ago

I personally think not filtering out zero entries in push_matrix is the preferred behavior. If you have many zero entries in the dense matrix you're pushing then you should probably not use this method. I think the current behavior makes things more predictable: for example for the Finite Element Method (which is what I work on), you might rely on the block structure of the sparsity pattern.

CscMatrix::from(&dense) is a bit of a special case: IMO there are rather few circumstances in which you should use this conversion. If your matrix is truly sparse you should probably build it up as a sparse matrix from the beginning (otherwise you suffer massive unnecessary computational and storage complexity). Skipping zero entries makes it a very convenient piece of functionality for writing tests with small sparse matrices, however:

let matrix = CscMatrix::from(&dmatrix![
  1, 2, 0, 0, 0;
  0, 3, 0, 4, 0;
  0, 0, 0, 0, 5;
  0, 0, 6, 7, 8
];

In terms of consistency, I think skipping zero entries should always be explicit. The Sparse::from(&dense) conversions are an exception to this, however. I think when I first implemented these I didn't see any use case in which you would want to keep all of the zeros from a dense matrix. Perhaps we should have had a separate construction method that would filter out explicit zeros - but changing this behavior now is dangerous because it might really silently break some code out there (certainly we'd have to change many tests in nalgebra-sparse).

Andlon commented 2 years ago

If you have a use case for push_matrix where you explicitly want to filter out zeros, however, perhaps a separate method for this purpose would be appropriate?

cfunky commented 2 years ago

Thank you for providing some insight on the rationale for the difference in behavior. I understand that not thresholding zeros can be useful if a certain sparsity pattern is required and that changing the behavior now is probably not a good idea. For my particular use case (robust nonlinear least squares) getting rid of the additional zeros can also be achieved by other means, which avoid constructing them in the first place. As pointed out by you, this is probably the preferable solution.

I think it would be a beneficial to extend the documentation of Sparse::from() and/or Sparse::push_matrix to explicitly point out their behavior with respect to zeros. I was quite surprised when some of my unit tests failed because the sparse matrix constructed by pushing dense blocks (in the code being tested) and the known true result (constructed using sparse::from) had a different sparsity pattern.

Andlon commented 2 years ago

I agree that the docs should be updated to reflect this. Specifically, for From in the top-level docs for the convert module, and of course in the docs for push_matrix.

dimforge / nalgebra

Sparse matrix creation routines treat zeros differently. Unify behavior? #1084