LLNL / sundials

Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.
https://computing.llnl.gov/projects/sundials
BSD 3-Clause "New" or "Revised" License
522 stars 130 forks source link

[BUG] out-of-bounds write in SUNMatScaleAddI_Sparse #581

Closed dweindl closed 1 month ago

dweindl commented 2 months ago

Current Behavior:

For the example below, I encounter out-of-bounds writes in SUNMatScaleAddI_Sparse at

https://github.com/LLNL/sundials/blob/c28eaa3764a03705d61decb6025b409360e9d53f/src/sunmatrix/sparse/sunmatrix_sparse.c#L753

Expected Behavior:

No out-of-bounds writes.

Steps To Reproduce:

Environment:

Anything else:

Running the same example with SUNMatScaleAddI_Sparse from sundials 6.6.2 works as expected. It seems the problem was introduced in https://github.com/LLNL/sundials/pull/257.

Steven-Roberts commented 2 months ago

Hi @dweindl, thanks for raising the issue and the nice example to reproduce it. It seems the issue is the sparse matrix addition algorithm is assuming the row indices for a column are sorted which is not the case for your matrix. For example, the column 0 row indices are {0, 1, 3, 2, 4, 10, 7, 8}. I'm not sure if this is a documentation issue or a bug in the algorithm (probably the latter), but I'll discuss with the team tomorrow to plan out a fix. Either way, I see a related out of bounds bug here when N>M: https://github.com/LLNL/sundials/blob/c28eaa3764a03705d61decb6025b409360e9d53f/src/sunmatrix/sparse/sunmatrix_sparse.c#L721

As a temporary fix, you can reorder the row indices to be ascending:

sunindextype indexvals_data[] = {0, 1, 2, 3, 4, 7, 8, 10, 0, 1, 3, 7, 0, 2, 4, 8, 0, 1, 3, 6, 7, 0, 2, 4, 6, 8, 1, 3, 5, 9, 2, 4, 6, 9, 0, 10, 0, 0, 0};

This ran cleanly through valgrind.

dweindl commented 2 months ago

Thanks for looking into that, @Steven-Roberts. I can confirm that ordering the indexvals fixes the problem. So far, (while we were still using sundials 5.8.0,) everything seemed to work fine with unordered indexvals.

drreynolds commented 2 months ago

About 18 months ago, another user submitted PR #257 to adjust how this function was implemented. Essentially, our original implementation (that was in sundials 5.8.0) required O(M*N) loop iterations, instead of the O(NNZ) iterations that one would assume should be necessary. I'm guessing that their efficiency-based fix added an assumption that the indexvals for each row/column are sorted (I'll note that this assumption is satisfied by all of our existing documentation examples, code examples, and unit tests).

I see three options:

  1. We make explicit the assumption that the indexvals are sorted within each row/column. This would update the documentation, retain the current implementation, and potentially require some users to sort the values appropriately.
  2. We revert the current implementation, allowing non-sorted indexvals, but increasing the complexity of the implementation from O(NNZ) back to O(M*N).
  3. Someone devises a new implementation that retains the O(NNZ) complexity but supports non-sorted indexvals.
balos1 commented 1 month ago

Fixed in #584.