pixelogik / NearPy

Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive hashes.
MIT License

Finish / polish support for sparse vectors in storage adapters. #1

Closed: pixelogik closed this issue 10 years ago

rudaoshi commented 10 years ago

I want to know whether sparse vector support has been finished yet.

pixelogik commented 10 years ago

I have not touched the branch feature/hashsaving for a while. It only has code to save vectors in sparse format in Redis. It works, but I did not like the performance back then.

I know this is no help right now, but I will work on NearPy again in the near future to add a few things, and this ticket will be part of that.

pixelogik commented 10 years ago

NearPy now supports sparse vectors as provided by scipy. It uses CSR format for computations and COO format for Redis storage.

The vectors have to be of shape (n, 1), where n is your dimension. If your vectors already have this shape, you can use the existing NearPy engine without any changes.
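For readers unfamiliar with the scipy formats mentioned above, here is a minimal sketch of building a sparse column vector of shape (n, 1) in CSR and converting it to COO and back. It uses only numpy and scipy; the dimension and values are made up for illustration, and it does not show how NearPy itself serializes vectors to Redis.

```python
import numpy as np
from scipy import sparse

dim = 8  # hypothetical dimension, chosen only for this example

# Dense column vector of shape (dim, 1) with two non-zero entries.
dense = np.zeros((dim, 1))
dense[1, 0] = 0.5
dense[6, 0] = 2.0

# CSR is the format the comment above names for computations.
v_csr = sparse.csr_matrix(dense)

# COO keeps explicit (row, col, value) triplets, which are easy to
# write out entry by entry (the format named for Redis storage).
v_coo = v_csr.tocoo()
triplets = list(zip(v_coo.row.tolist(), v_coo.col.tolist(), v_coo.data.tolist()))

# Rebuild a CSR vector from the COO triplets (round trip).
restored = sparse.coo_matrix(
    (v_coo.data, (v_coo.row, v_coo.col)), shape=v_coo.shape
).tocsr()
```

A vector like `v_csr` has the (n, 1) shape described above, so it should be usable wherever the engine accepts a vector of that dimension.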