CSCMatrix OpMulMatrix uses O(numRows) memory, which is too much for some applications

scalanlp / breeze

Breeze is/was a numerical processing library for Scala.

https://www.scalanlp.org

Apache License 2.0

3.45k stars 693 forks source link

CSCMatrix OpMulMatrix uses O(numRows) memory, which is too much for some applications #767

Open darkjh opened 4 years ago

darkjh commented 4 years ago

In the v1.0 release the OpMulMatirx impl for CSCMatrix has changed. https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/linalg/operators/CSCMatrixOps.scala#L725

In the multiplication a dense array is allocated. For large sparse matrix (which csc matrix is designed for) this would not work ...

dlwh commented 4 years ago

Hi @darkjh , so almost every sparse matrix multiply algorithm I know about uses O(numRows) (or equivalent) temporary memory for doing a matrix multiply. for example, here's scipy: https://github.com/scipy/scipy/blob/f2ec91c4908f9d67b5445fbfacce7f47518b35d1/scipy/sparse/sparsetools/csr.h#L533

And here's CSparse (which powers matlab sparse routines, IIRC): https://people.sc.fsu.edu/~jburkardt/c_src/csparse/csparse.c (cs_multiply).

Can you say a bit more about your use case? I can look into doing something with blocks or something

darkjh commented 4 years ago

@dlwh AFAIK we can go sparse only in one direction, for CSC is the row, not column. We use CSC for our ML algorithms and each CSC contains one partition of our dataset. Each column represents a feature vector which is very sparse as we use hashing to handle all the features. Basically our CSC is P x N, with N being the partition size and P a very large number, say Int.maxValue.

dlwh commented 3 years ago

sorry for being so slow on this. I did look into fixing this, but it got super tricky and I abandoned it

darkjh commented 3 years ago

@dlwh Hi, np! Can you share some insights of the potential fix? Some links? Maybe I can give some brain power into this issue.

dlwh commented 3 years ago

The right approach to doing this (i think) is to make the buffer be min(numRows, BLOCK_SIZE) and loop over columns multiple times if necessary. There was something that tripped me up (it's been a while and I forget exactly what), but it was some mix of painful and nonobvious and I put it down and never picked it back up

On Sun, Feb 14, 2021 at 4:18 AM Han Ju notifications@github.com wrote:

@dlwh https://github.com/dlwh Hi, np! Can you share some insights of the potential fix? Some links? Maybe I can give some brain power into this issue.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/scalanlp/breeze/issues/767#issuecomment-778769781, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAACLIPFHLGWF6XXI7CBATLS665QJANCNFSM4J3CVQDQ .