pixelogik / NearPy

Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive hashes.
MIT License

Sparse matrix support #48

Closed viktorpi closed 6 years ago

viktorpi commented 8 years ago

Hi, the README says that sparse vectors are currently supported. Is that true? I've just tried the latest version from GitHub and I'm getting `ValueError: dimension mismatch` when it calls the hash_vector function. It may just be my misunderstanding of how to work with this library. I can provide code that reproduces the error.

amorgun commented 8 years ago

Yes, they should work. There is a test for it.

pixelogik commented 8 years ago

@isendel Yes, that should work. Please post the code here and we can investigate.

viktorpi commented 8 years ago

@amorgun, @pixelogik Thank you for the quick response. I found the root cause of my error: I was trying to store a vector whose shape the Engine does not accept. In my case I have vectors with 2^18 elements each, in CSR format. I specified the Engine dimension as 2^18 and was storing vectors shaped like (1, 2^18). That was my fault. It looks like the Engine only accepts column vectors. Now that I pass vectors shaped like (2^18, 1), it works without errors. I can't say this is really convenient for me, since it requires transforming the data before I pass it to the engine. Would you recommend a different approach for adding row vectors, or does that completely contradict the library's design?
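For anyone hitting the same error, here is a minimal sketch of the shape problem and the transpose workaround described above. It uses only SciPy and does not call NearPy itself; the `normals` matrix is a hypothetical stand-in for a hash's projection matrix, on the assumption that hashing multiplies a (num_bits, dim) matrix by the vector. `num_bits` and the sample values are made up for illustration.

```python
import numpy as np
from scipy import sparse

dim = 2 ** 18      # vector dimension from the report above
num_bits = 4       # hypothetical number of projections

# Hypothetical stand-in for a projection matrix of shape (num_bits, dim).
rng = np.random.default_rng(0)
normals = sparse.csr_matrix(rng.standard_normal((num_bits, dim)))

# A CSR row vector of shape (1, dim), as in the original report.
row_vec = sparse.csr_matrix(
    ([1.0, 2.0], ([0, 0], [3, 70000])), shape=(1, dim)
)

# (num_bits, dim) times (1, dim): the inner dimensions do not align.
try:
    normals.dot(row_vec)
    row_failed = False
except ValueError:
    row_failed = True

# Transposing gives a (dim, 1) column vector; .T returns CSC,
# so convert back to CSR if that format is preferred.
col_vec = row_vec.T.tocsr()

# (num_bits, dim) times (dim, 1) works and yields shape (num_bits, 1).
projection = normals.dot(col_vec)
```

The transpose of a sparse matrix is cheap compared with building the vectors, so converting each row just before storing it should not add much overhead. Rows of a larger CSR matrix `X` can be converted the same way with `X[i].T`.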

posix4e commented 6 years ago

Can we close this?