pixelogik / NearPy

Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive hashes.
MIT License

Struggle in sparse vector #36

Closed dangchienhsgs closed 8 years ago

dangchienhsgs commented 8 years ago

I used csr_matrix from the scipy library to represent a sparse vector, but it does not work with the store_vector function of our library: a vector in scipy is still a 1xn matrix, so its first dimension is 1, which does not match the dimension of our lshash. Could you give me an example, or let me know of a suitable library for computing with sparse vectors that can be used with our library?

pixelogik commented 8 years ago

Hi,

Sorry for the late reply, but I currently have no time to research that. There is a test in NearPy for storing sparse vectors. Maybe that helps:

def test_hash_deterministic_sparse(self):
    x = scipy.sparse.rand(100, 1, density=0.1)
    first_hash = self.rbp.hash_vector(x)[0]
    for k in range(100):
        self.assertEqual(first_hash, self.rbp.hash_vector(x)[0])
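
Note that the test above builds the sparse vector with shape (100, 1), i.e. as a column, not as a 1xn row. A minimal sketch of how a 1xn csr_matrix could be brought into that layout is shown below. It uses only scipy and numpy. The names `dim`, `normals_csr` and `sketch_hash` are made up for illustration, and `sketch_hash` is only a stand-in for a random binary projection, not NearPy's actual implementation.

```python
import numpy as np
import scipy.sparse

dim = 100  # assumed vector dimension, matching the test above

# A sparse "row vector" as described in the question: shape (1, dim).
values = [1.0, 2.0, 3.0]
rows = [0, 0, 0]
cols = [3, 17, 42]
row = scipy.sparse.csr_matrix((values, (rows, cols)), shape=(1, dim))

# Transpose to a column of shape (dim, 1), the layout used in the test.
column = row.T.tocsr()

# Hypothetical stand-in for a random binary projection hash:
# 10 random hyperplane normals, stored sparse so the product stays sparse.
rng = np.random.RandomState(0)
normals_csr = scipy.sparse.csr_matrix(rng.randn(10, dim))


def sketch_hash(v):
    """Project a (dim, 1) sparse column and return one bit per hyperplane."""
    projection = normals_csr.dot(v).toarray().ravel()
    return ''.join('1' if p > 0.0 else '0' for p in projection)


print(row.shape, column.shape)
print(sketch_hash(column))
```

With this layout the product of a (10, dim) matrix and a (dim, 1) column is well defined, whereas passing the original (1, dim) row raises a dimension mismatch. Whether the transposed column then works with store_vector in a given NearPy version is not verified here.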