EMS-TU-Ilmenau / fastmat

A library to build up lazily evaluated expressions of linear transforms for efficient scientific computing.
https://fastmat.readthedocs.io
Apache License 2.0
24 stars 8 forks source link

returned vector shape of getCol() and getRow() inconsistent #32

Closed ChristophWWagner closed 6 years ago

ChristophWWagner commented 6 years ago

The vector orientation of the getCol() and getRow() return vector shapes ((N, 1) or (1, N)) is inconsistent and causes trouble when fed directly into .forward()

MWP

>>> from fastmat import *
>>> from numpy import *
>>> import scipy.sparse as sps
>>> matS = Sparse(sps.dia_matrix((arange(10), 0), (10, 10)))
>>> matP = matS * matS
>>> matP.getRow(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "fastmat/Matrix.pyx", line 338, in fastmat.Matrix.Matrix.getRow
  File "fastmat/Product.pyx", line 160, in fastmat.Product.Product._getRow
  File "fastmat/Matrix.pyx", line 1006, in fastmat.Matrix.Matrix.backward
  File "fastmat/Matrix.pyx", line 1032, in fastmat.Matrix.Matrix.backward
ValueError: Dimension mismatch (10, 10).H <-!-> (1, 10)
>>> matP.getCol(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "fastmat/Matrix.pyx", line 296, in fastmat.Matrix.Matrix.getCol
  File "fastmat/Product.pyx", line 140, in fastmat.Product.Product._getCol
  File "fastmat/Matrix.pyx", line 913, in fastmat.Matrix.Matrix.forward
  File "fastmat/Matrix.pyx", line 937, in fastmat.Matrix.Matrix.forward
ValueError: Dimension mismatch (10, 10) <-!-> (1, 10)

Known classes failing Sparse Product not tested for further violations in other classes

What happens and what goes wrong Product._getCol(idx) calls X.getCol(idx)where X is the rightmost product term in the product. It then repeatedly iterates X = Y.forward(X) for all the remaining terms in reverse order. This assumes X.getCol(idx) to return a (N, 1) or (N, ) shaped vector. In this instance the implementation of Sparse._getCol(idx) and Sparse._getRow(idx) yields a vector of shape (1, N) or (N, 1), depending on the type of the embedded scipy sparse matrix in `Sparse.

What should be different All Matrix._getCol(idx) and Matrix._getRow(idx) routines shall yield a vector of (N, ) shape and the class tests for getColand getRow need to enforce this constraint. This issue partly arises from the test scripts currently not being able to detect cases where the vector shape deflects from this.

ChristophWWagner commented 6 years ago

Seems like .todense() of any scipy sparse matrix returns a numpy.matrix object, of which .squeeze() always has two dimensions. Using spsparsematrix.toarray() resolves this by returning a ndarray.