JuliaSmoothOptimizers / SparseMatricesCOO.jl

Sparse matrices in coordinate format for Julia
Mozilla Public License 2.0

Interface Sparse BLAS routines #33

Closed amontoison closed 1 year ago

amontoison commented 1 year ago

@AntoninKns

codecov[bot] commented 1 year ago

Codecov Report

Base: 84.98% // Head: 79.26% // Decreases project coverage by 5.72% :warning:

Coverage data is based on head (14c0f41) compared to base (c81dc76). Patch coverage: 60.49% of modified lines in pull request are covered.

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main      #33      +/-   ##
==========================================
- Coverage   84.98%   79.26%    -5.73%
==========================================
  Files           3        6        +3
  Lines         353      434       +81
==========================================
+ Hits          300      344       +44
- Misses         53       90       +37
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| `src/coo_mkl_wrapper.jl` | `58.49% <58.49%> (ø)` | |
| `src/coo_mkl_interface.jl` | `64.00% <64.00%> (ø)` | |
| `src/SparseMatricesCOO.jl` | `66.66% <66.66%> (ø)` | |
| `src/coo_linalg.jl` | `97.05% <0.00%> (-2.11%)` | :arrow_down: |


geoffroyleconte commented 1 year ago

Hi, thanks for the PR. Could you indicate in the documentation how to use these routines, and add tests if possible? I think that the old mul! implementation is still tested.

amontoison commented 1 year ago

> Hi, thanks for the PR. Could you indicate in the documentation how to use these routines, and add tests if possible? I think that the old `mul!` implementation is still tested.

Hi @geoffroyleconte, I haven't finished the PR yet, but you don't need to change anything to use the new routines: `mul!` will dispatch to the multithreaded SparseBLAS MKL routines when it can. I will add new tests and benchmarks, but the current tests should already exercise the new routines.

It will be more efficient on Intel CPUs, but we could still see a significant speed-up on AMD / M1 processors because the routines are multithreaded.
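For context, the fallback that `mul!` performs for COO storage is essentially one loop over the stored triplets. A minimal base-Julia sketch (plain index/value vectors are used here instead of the package's `SparseMatrixCOO` type, so this is illustrative only):

```julia
# Minimal sketch of a COO matrix-vector product y = A * x,
# i.e. what a generic (non-MKL) mul! does for COO storage.
# rows, cols, vals are the stored (i, j, v) triplets of A.
function coo_mul!(y, rows, cols, vals, x)
    fill!(y, zero(eltype(y)))
    for k in eachindex(vals)
        y[rows[k]] += vals[k] * x[cols[k]]
    end
    return y
end

# Example: A = [1 0; 0 2] stored as triplets
rows = [1, 2]; cols = [1, 2]; vals = [1.0, 2.0]
y = zeros(2)
coo_mul!(y, rows, cols, vals, [3.0, 4.0])  # y == [3.0, 8.0]
```

With the MKL interface in this PR, calls of this shape would instead hit the vendor routine transparently.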

geoffroyleconte commented 1 year ago

>> Hi, thanks for the PR. Could you indicate in the documentation how to use these routines, and add tests if possible? I think that the old `mul!` implementation is still tested.
>
> Hi @geoffroyleconte, I haven't finished the PR yet, but you don't need to change anything to use the new routines: `mul!` will dispatch to the multithreaded SparseBLAS MKL routines when it can. I will add new tests and benchmarks, but the current tests should already exercise the new routines.
>
> It will be more efficient on Intel CPUs, but we could still see a significant speed-up on AMD / M1 processors because the routines are multithreaded.

Perfect, thanks! Yes, I think most of the tests use `mul!` instead of `*`, so that might be why there are coverage issues.

dpo commented 1 year ago

> I will add new tests and benchmarks but the current tests should use the new routines.

They should or they do?

amontoison commented 1 year ago

>> I will add new tests and benchmarks but the current tests should use the new routines.
>
> They should or they do?

They do; I checked with `@code_warntype`.

amontoison commented 1 year ago

Time in seconds to perform 1000 matrix-vector products with matrices from the SuiteSparseMatrixCollection. It seems that the MKL routine for COO matrices is not multithreaded :(

[screenshot: benchmark timings, 2022-10-25]
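A benchmark of this shape can be reproduced with a simple timing loop. A hedged sketch (the matrix below is random rather than one of the SuiteSparseMatrixCollection matrices used above, and `@elapsed` is a coarser tool than BenchmarkTools):

```julia
using SparseArrays, LinearAlgebra

# Rough sketch of the benchmark: time 1000 matrix-vector products.
# A is a random sparse matrix here, standing in for the
# SuiteSparseMatrixCollection matrices used in the actual benchmark.
n = 1000
A = sprand(n, n, 0.01)
x = rand(n)
y = zeros(n)

t = @elapsed for _ in 1:1000
    mul!(y, A, x)   # in-place product; no allocation per iteration
end
println("time for 1000 products: $t seconds")
```

Running the same loop once with a `SparseMatrixCOO` (with and without MKL) gives the per-format comparison shown in the screenshot.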

geoffroyleconte commented 1 year ago

It's not much faster. Maybe sparse COO matrix-vector products cannot be multithreaded? Even in Julia, I do not know how we could use the `@threads` macro, because we do not know the number of nonzeros in each column. Are the benchmarks performed on matrices with the `Symmetric` wrapper? I'm not sure that this function is the fastest possible with the `t` call: https://github.com/JuliaSmoothOptimizers/SparseMatricesCOO.jl/blob/c81dc762eb39f3a60b58163edd72eafc20558aaf/src/coo_linalg.jl#L66-L75
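As an aside, one way around the write conflicts that make a naive `@threads` loop incorrect for COO (several triplets may target the same row of `y`) is to give each thread its own accumulator and reduce at the end. A hypothetical sketch, not code from this PR:

```julia
using Base.Threads

# Hypothetical sketch: thread a COO product by giving each thread
# a private accumulator, then summing the buffers. This avoids the
# data races of a naive @threads loop over the triplets. The :static
# schedule keeps threadid() stable within each iteration.
function coo_mul_threaded!(y, rows, cols, vals, x)
    buf = [zeros(eltype(y), length(y)) for _ in 1:nthreads()]
    @threads :static for k in eachindex(vals)
        b = buf[threadid()]
        b[rows[k]] += vals[k] * x[cols[k]]
    end
    fill!(y, zero(eltype(y)))
    for b in buf
        y .+= b
    end
    return y
end
```

The per-thread buffers cost `nthreads() * length(y)` extra memory, so this only pays off for large products; it is a workaround, not an endorsement.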

amontoison commented 1 year ago

> It's not much faster. Maybe sparse COO matrix-vector products cannot be multithreaded? Even in Julia, I do not know how we could use the `@threads` macro, because we do not know the number of nonzeros in each column. Are the benchmarks performed on matrices with the `Symmetric` wrapper? I'm not sure that this function is the fastest possible with the `t` call:
>
> https://github.com/JuliaSmoothOptimizers/SparseMatricesCOO.jl/blob/c81dc762eb39f3a60b58163edd72eafc20558aaf/src/coo_linalg.jl#L66-L75

No, I just tested with generic COO matrices. I will compare with `Symmetric` matrices. I think it can be multithreaded, because Intel was able to implement multithreaded CSC-vector products; it's only "easy" for CSR-vector products.
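CSR-vector products thread "easily" because each row of `y` is written by exactly one iteration, so no synchronization is needed. An illustrative sketch in base Julia (plain CSR arrays, not a package type):

```julia
using Base.Threads

# Sketch of why CSR mat-vec threads trivially: iteration i writes
# only y[i], so rows can be processed fully in parallel.
# rowptr, colval, nzval follow the usual CSR layout.
function csr_mul_threaded!(y, rowptr, colval, nzval, x)
    @threads for i in eachindex(y)
        s = zero(eltype(y))
        for k in rowptr[i]:(rowptr[i+1] - 1)
            s += nzval[k] * x[colval[k]]
        end
        y[i] = s
    end
    return y
end

# Example: A = [1 2; 0 3] in CSR
rowptr = [1, 3, 4]; colval = [1, 2, 2]; nzval = [1.0, 2.0, 3.0]
y = zeros(2)
csr_mul_threaded!(y, rowptr, colval, nzval, [1.0, 1.0])  # y == [3.0, 3.0]
```

For CSC (and for COO triplets in arbitrary order) the same loop would race on `y`, which is why those cases need the extra machinery Intel apparently implemented.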

amontoison commented 1 year ago

We have the same performance for `Symmetric` matrices.

[screenshot: benchmarks, symmetric COO]

dpo commented 1 year ago

Not exactly in favor of MKL. What's going on here?

dpo commented 1 year ago

> I'm not sure that this function is the fastest possible with the `t` call:

A little meta-programming might settle that.

amontoison commented 1 year ago

> Not exactly in favor of MKL. What's going on here?

I interfaced deprecated routines (`coo_mv`). I should try the new, generic `mkl_sparse_mv`. I will need Clang.jl to generate the interface.

amontoison commented 1 year ago

I interfaced the new routines; the MKL version is slower than our Julia version.

[screenshots: benchmarks, COO and sparse matrices]

dpo commented 1 year ago

who needs the MKL? 😃

amontoison commented 1 year ago

I'm closing the PR: MKL is not relevant for the COO format. I just added a dummy copy of our COO sparse matrix in MKLSparse.jl to easily test whether the results improve in the future: https://github.com/JuliaSparse/MKLSparse.jl/pull/33/files#diff-6241561dad231b63a9c2e46e7ff91ecfb885c834a98b621364ab11d323086af7R1-R19