Closed max-vassili3v closed 2 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 89.75%. Comparing base (47c15ab) to head (5ec4b3c).
Can you make sure your unit tests are being run? I think you just need to add include("test_sum.jl") to runtests.jl.
added changes
The test failure is very surprising. It's possible it's calling sum and this PR has changed the order of operations.
We should make sure we are consistent with the order. Can you add some tests that sum(A) == sum(Matrix(A)) for A a banded matrix?
I noticed earlier with == that there is some floating point error and so the test fails.
I thought it was to be expected but this could be the problem
Right, this floating point error arises because you are computing the sums in a different order. This is unnecessary, so we can change the implementation to make sure we do things in the right order. E.g.:
julia> A = randn(5,5)
5×5 Matrix{Float64}:
-0.574303 -0.909723 0.589035 0.125461 -0.85839
2.36645 -2.01842 0.305596 0.739664 0.281112
-0.449434 1.65078 0.293241 -0.12409 0.535829
0.388728 1.3232 -1.61161 -0.54598 -0.237829
-0.570773 -0.0989053 -0.515742 0.116799 -2.14109
julia> sum(A) == sum(vec(A)) # sum should traverse column-by-column
true
julia> sum(A) ≠ sum(vec(A')) # it doesn't match row-by-row
true
I thought it was to be expected but this could be the problem
It is expected when the order of the operations changes. But all things being equal we should avoid it.
Also, traversing column-by-column will be much faster than row-by-row since it accesses memory in order.
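To make the two traversal orders concrete, here is a minimal sketch using only Base Julia. The helper names sum_colmajor and sum_rowmajor are made up for illustration and are not part of this PR or of BandedMatrices.jl.

```julia
# Hypothetical helper: accumulate column by column, the order in which
# Julia stores a dense matrix in memory.
function sum_colmajor(A::AbstractMatrix{T}) where {T}
    s = zero(T)
    for j in axes(A, 2), i in axes(A, 1)   # i is the inner (fastest) loop
        s += A[i, j]
    end
    return s
end

# Hypothetical helper: accumulate row by row, which strides through memory.
function sum_rowmajor(A::AbstractMatrix{T}) where {T}
    s = zero(T)
    for i in axes(A, 1), j in axes(A, 2)   # j is the inner (fastest) loop
        s += A[i, j]
    end
    return s
end

A = [sin(i + 7j) for i in 1:5, j in 1:5]

# Each helper reproduces a strict left-to-right fold in its own order.
sum_colmajor(A) == foldl(+, A)
sum_rowmajor(A) == foldl(+, permutedims(A))

# The two orders agree only up to rounding, so compare them with ≈.
sum_colmajor(A) ≈ sum_rowmajor(A)
```

The two results may or may not be bit-identical for a given matrix, which is why only ≈ is safe between different orders.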
I've changed the traversal order but I still get floating point error on the tests without dims and dims = 1. It's a different type, but I've also noticed that sum(vec(A)) == sum(Matrix(A)) for BandedMatrix A returns false
Can you push your changes? Note I made a suggestion that fixes the order for the no-dims case
It's a different type, but I've also noticed that sum(vec(A)) == sum(Matrix(A)) for BandedMatrix A returns false
Wouldn't a better check for this probably be that sum(vec(A)) == foldl(+, Matrix(A))?
This is a similar problem in e.g. SparseArrays, where A = sprand(1000, 1000, 0.001); sum(vec(A)) == sum(Matrix(A)) returns false but A = sprand(1000, 1000, 0.001); sum(vec(A)) == foldl(+, Matrix(A)) is true.
Can you explain the difference?
I take it foldl forces a specific order. Do you know why sum might choose a different order?
As far as I can tell, the difference is that IndexStyle(vec(A)) = IndexCartesian() (since, for sparse arrays and banded matrices, vec(A) becomes a reshape type, unlike a normal matrix where it becomes a vector), which uses mapfoldl, while Matrix(A) is an IndexLinear(), which uses some sort of block-based summation.
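As a rough illustration of why a left fold and a block-based sum can disagree, here is a simplified sketch of pairwise summation. The function pairwise_sum and its blocksize keyword are hypothetical, and the tiny block size is chosen only to keep the example small; this is not Base's actual implementation.

```julia
# Hypothetical, simplified pairwise summation: split the range in half,
# sum each half recursively, and fall back to a plain loop on small blocks.
function pairwise_sum(x::AbstractVector{T}, lo::Int=firstindex(x),
                      hi::Int=lastindex(x); blocksize::Int=2) where {T}
    if hi - lo + 1 <= blocksize
        s = zero(T)
        for i in lo:hi
            s += x[i]
        end
        return s
    end
    mid = (lo + hi) >>> 1
    return pairwise_sum(x, lo, mid; blocksize=blocksize) +
           pairwise_sum(x, mid + 1, hi; blocksize=blocksize)
end

# Values chosen so that the grouping of the additions matters.
x = [1e100, 1.0, -1e100, 1.0]

foldl(+, x)      # ((1e100 + 1.0) - 1e100) + 1.0 gives 1.0
pairwise_sum(x)  # (1e100 + 1.0) + (-1e100 + 1.0) gives 0.0
```

Both answers are legitimate floating point results for the same data; they differ only in how the additions are grouped, which is the effect being discussed in this thread.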
Another way to test would be to check sum(Vector(vec(A))) == sum(Matrix(A)). I think the implementation in this PR is equivalent to a foldl implementation, so the tests should probably look at sum(B) == foldl(+, Matrix(B)), if I've read it correctly.
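The proposed check can be sketched without BandedMatrices.jl by using a dense matrix that is zero outside its band. The helper banded_sum_colmajor below is hypothetical and is not the code in this PR; it only assumes a column-by-column traversal of the in-band entries.

```julia
# Hypothetical helper: sum only the in-band entries of M, column by column,
# where l is the number of sub-diagonals and u the number of super-diagonals.
function banded_sum_colmajor(M::AbstractMatrix{T}, l::Integer, u::Integer) where {T}
    m, n = size(M)
    s = zero(T)
    for j in 1:n
        for i in max(1, j - u):min(m, j + l)   # rows of column j inside the band
            s += M[i, j]
        end
    end
    return s
end

# A 6×8 dense matrix that is exactly zero outside the band l = 2, u = 1.
m, n, l, u = 6, 8, 2, 1
M = [(-l <= j - i <= u) ? sin(i + 10j) : 0.0 for i in 1:m, j in 1:n]

# Adding the out-of-band zeros does not change a running sum, so the
# band-only loop matches a left fold over the dense matrix exactly.
banded_sum_colmajor(M, l, u) == foldl(+, M)

# Against sum, which may group the additions differently, use ≈.
banded_sum_colmajor(M, l, u) ≈ sum(M)
```

Under that assumption, == is safe against foldl, while the comparison against sum should use ≈.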
I think in this case just use ≈ since we don't care that much about preserving order
seems to be the same issue as before with the floating point error
That error is unrelated, I believe; we have seen it in other places.
I struggled to find an elegant solution that selects the relevant elements from B.data when B is created by brand() with certain parameters (e.g. very non-square matrices, or more bands than fit in the matrix). I decided to go with this solution, which populates a data matrix using only the relevant entries accessed via B[band(i)]. Please let me know of any improvements.
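For readers following along, the idea of filling a data matrix band by band can be sketched with the standard library alone. This is not the PR's code: band_data is a hypothetical helper, diag(M, k) stands in for B[band(k)], and the layout assumed is the LAPACK band-storage convention in which entry (i, j) is stored at data[u + 1 + i - j, j].

```julia
using LinearAlgebra  # for diag

# Hypothetical helper: copy bands -l:u of a dense matrix M into band storage,
# leaving zeros wherever a requested band does not fit in the matrix.
function band_data(M::AbstractMatrix{T}, l::Integer, u::Integer) where {T}
    m, n = size(M)
    data = zeros(T, l + u + 1, n)
    for k in -l:u
        (-m < k < n) || continue       # skip bands entirely outside the matrix
        d = diag(M, k)                 # stand-in for B[band(k)]
        for t in eachindex(d)
            i = k >= 0 ? t : t - k     # row of the t-th entry on band k
            j = i + k                  # column of that entry
            data[u + 1 - k, j] = d[t]
        end
    end
    return data
end

# A very non-square matrix with more requested bands than fit.
M = [10.0 * i + j for i in 1:2, j in 1:5]
data = band_data(M, 3, 6)

# Every entry of M lies inside the requested band, so the totals agree.
sum(data) ≈ sum(M)
```

Because the padding entries are zero, summing data column by column visits the in-band values in the same order as a column-by-column pass over the matrix itself.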