pixelogik / NearPy

Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive hashes.
MIT License

Option to normalize or not #90

Open samlobel opened 4 years ago

samlobel commented 4 years ago

For certain projections at least (RandomBinary, for example), normalizing a vector doesn't seem to change which bucket it hashes to. But if you care about the actual distance in the original space, that distance is hard to recover, since the vectors are stored normalized.

I admit I don't understand LSH at a deep level. Would it break the math to make the unitvec step optional?
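
To make the observation concrete, here is a minimal sketch of a sign-based random projection hash written in plain Python. It does not use NearPy's classes; `binary_hash` and `unit_vector` are hypothetical helpers, and the dimensions and seed are arbitrary. Under those assumptions, it shows that scaling a vector to unit length leaves its hash key unchanged, while the Euclidean distance between two vectors is not the same before and after normalization.

```python
import math
import random


def binary_hash(vector, planes):
    """Hypothetical helper: one bit per hyperplane, set by the sign of the dot product."""
    bits = []
    for plane in planes:
        dot = sum(p * x for p, x in zip(plane, vector))
        bits.append("1" if dot > 0 else "0")
    return "".join(bits)


def unit_vector(vector):
    """Hypothetical helper: scale a vector to length 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dim, n_bits = 8, 10     # arbitrary sizes chosen for illustration

# Random hyperplane normals, as in a binary random projection scheme.
planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

v = [rng.uniform(-5, 5) for _ in range(dim)]
w = [rng.uniform(-5, 5) for _ in range(dim)]

# Dividing by a positive norm does not flip the sign of any dot product,
# so the raw and normalized vectors land in the same bucket.
same_bucket = binary_hash(v, planes) == binary_hash(unit_vector(v), planes)

# The distance between raw vectors differs from the distance between
# their normalized versions, which can never exceed 2.
dist_raw = math.dist(v, w)
dist_unit = math.dist(unit_vector(v), unit_vector(w))

print(same_bucket, dist_raw, dist_unit)
```

This only covers hashes that depend on the sign of a projection. Hashes that bin the projected value itself depend on vector length, so whether normalization is safe to skip there is a separate question.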