vincent-maillou / qttools

Quantum Transport Algorithms Toolbox
GNU General Public License v3.0
5 stars 1 forks source link

DSBSparse: Addressing some performance issues #45

Closed vetschn closed 3 weeks ago

vetschn commented 4 weeks ago

Several things are still done naively in the DSBSparse matrices.

codecov-commenter commented 4 weeks ago

:warning: Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 76.95312% with 59 lines in your changes missing coverage. Please review.

Project coverage is 84.79%. Comparing base (f431742) to head (869b3ee).

Files with missing lines Patch % Lines
src/qttools/__init__.py 36.66% 19 Missing :warning:
src/qttools/datastructures/dsbcoo.py 80.64% 12 Missing :warning:
src/qttools/datastructures/dsbcsr.py 82.08% 12 Missing :warning:
src/qttools/datastructures/dsbsparse.py 84.61% 10 Missing :warning:
src/qttools/utils/mpi_utils.py 55.55% 4 Missing :warning:
src/qttools/utils/stack_utils.py 0.00% 2 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## dev #45 +/- ## ========================================== + Coverage 83.65% 84.79% +1.13% ========================================== Files 29 29 Lines 1083 1118 +35 ========================================== + Hits 906 948 +42 + Misses 177 170 -7 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

vetschn commented 4 weeks ago

I don't know what format to return the sparse blocks as. I see a couple of options:

  1. We return an ndarray of COO/CSR matrices. This is kind of what is done in QuaTrEx currently. This unnecessarily duplicates the rows/cols and we would need to split up the data buffer.
  2. My preferred option would be to just return a tuple (cols, rowptr, data)/(rows, cols, data). Simple, however leaves the one calling the API to deal with doing something sensible with these three arrays.
  3. Return a new datastructure. Should be some sort of distributed stack of COO/CSR matrices (DSCOO/DSCSR?). Probably the cleanest solution but the most work.

Edit: Opted for the second option now.

vetschn commented 3 weeks ago

So i added these vectorized getters and setters now in the DSBCOO class (DSBCSR is next). The behavior is like this:

Note that everything here also works when accessing only a part of the stack, so random things like

inds = np.arange(10)
values = dsbcoo.stack[:3][inds, inds]
dsbcoo.stack[[2, 4, 6]][inds, inds] = 1.5 * values

should work as well in both distribution states.