Implement sum - Githubissues

max-vassili3v commented 2 months ago

I struggled to find an elegant solution involving selecting relevant elements from B.data in the case where B is created by brand() with certain parameters (e.g very non square matrices, more bands than those that fit in the matrix). I decided to go with this solution that involves populating a data matrix only using relevant information accessed by B[band(i)]. Please let me know any improvements.

codecov[bot] commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 89.75%. Comparing base (47c15ab) to head (5ec4b3c).

Additional details and impacted files

```diff @@ Coverage Diff @@ ## master #446 +/- ## ========================================== + Coverage 89.61% 89.75% +0.14% ========================================== Files 25 25 Lines 3571 3622 +51 ========================================== + Hits 3200 3251 +51 Misses 371 371 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

dlfivefifty commented 2 months ago

Can you make sure your unit tests are being run? I think just add include("test_sum.jl") to runtests.jl

max-vassili3v commented 2 months ago

added changes

dlfivefifty commented 2 months ago

The test failure is very surprising. It's possible its calling sum and this PR has changed the order of operations.

We should make sure we are consistent with the order. Can you add some tests that sum(A) == sum(Matrix(A)) for A a banded matrix?

max-vassili3v commented 2 months ago

I noticed earlier with == that there is some floating point error and so the test fails.

max-vassili3v commented 2 months ago

I thought it was to be expected but this could be the problem

dlfivefifty commented 2 months ago

Right, this floating point error is because you are computing the sums in a different order. This is unnecessary so we can change the implementation to make sure we do things in the right order. Eg:

julia> A = randn(5,5)
5×5 Matrix{Float64}:
 -0.574303  -0.909723    0.589035   0.125461  -0.85839
  2.36645   -2.01842     0.305596   0.739664   0.281112
 -0.449434   1.65078     0.293241  -0.12409    0.535829
  0.388728   1.3232     -1.61161   -0.54598   -0.237829
 -0.570773  -0.0989053  -0.515742   0.116799  -2.14109

julia> sum(A) == sum(vec(A)) # sum should traverse column-by-coumn
true

julia> sum(A) ≠ sum(vec(A')) # it doesn't match row-by-row 
true

dlfivefifty commented 2 months ago

I thought it was to be expected but this could be the problem

It is expected when the order of the operations change. But all things being equal we should avoid it.

Also, traversing column-by-column will be much faster than row-by-row since it accesses memory in order.

max-vassili3v commented 2 months ago

I've changed the traversal order but I still get floating point error on the tests without dims and dims = 1. It's a different type, but I've also noticed that sum(vec(A)) == sum(Matrix(A)) for BandedMatrix A returns false

dlfivefifty commented 2 months ago

Can you push your changes? Note I made a suggestion that fixes the order for the no-dims case

DanielVandH commented 2 months ago

It's a different type, but I've also noticed that sum(vec(A)) == sum(Matrix(A)) for BandedMatrix A returns false

Wouldn't a better check for this probably be that sum(vec(A)) == foldl(+, Matrix(A))?

This is a similar problem in e.g. SparseArrays where A = sprand(1000, 1000, 0.001); sum(vec(A)) == sum(Matrix(A)) returns false but A = sprand(1000, 1000, 0.001); sum(vec(A)) == foldl(+, Matrix(A)) is true

dlfivefifty commented 2 months ago

Can you explain the difference?

I take it foldl forces a specific order. Do you know why sum might choose a different order?

DanielVandH commented 2 months ago

As far as I can tell, the difference is that IndexStyle(vec(A)) = IndexCartesian() (since, for sparse arrays and banded matrices, vec(A) becomes a reshape type unlike a normal matrix where it becomes a vector) which uses mapfoldl, while Matrix(A) is an IndexLinear() which uses some sort of block-based summation.

Another way to test would be to check sum(Vector(vec(a))) == sum(Matrix(A)). I think the implementation in this PR is equivalent to doing a foldl implementation, so the tests should probably look at sum(B) == foldl(+, Matrix(B)) if I've read it correctly

dlfivefifty commented 2 months ago

I think in this case just use ≈ since we don't care that much about preserving order

max-vassili3v commented 2 months ago

seems to be the same issue as before with the floating point error

dlfivefifty commented 2 months ago

That error is unrelated I believe, we have seen it other places.

JuliaLinearAlgebra / BandedMatrices.jl

Implement sum #446

Codecov Report