kayzhu / LSHash

A fast Python implementation of locality sensitive hashing.
MIT License
660 stars, 158 forks

using on word vectors #12

Open armintabari opened 8 years ago

armintabari commented 8 years ago

I want to use this on a bunch of word vectors to find the ones that are similar to each other.

Should I first index all of the vectors, and then query each one again to find its bucket number?

phdowling commented 8 years ago

You index all the vectors, and then use 'query' to retrieve close matches. If you need to track where your vectors come from (for example, which word each one belongs to), pass that information via the 'extra_data' argument of the 'index' method.
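To make the index-then-query workflow concrete, here is a minimal, self-contained sketch of random-hyperplane hashing in plain Python. It is not the LSHash library's code: the `TinyLSH` class, the `cosine_distance` helper, and the toy word vectors are all hypothetical, and only the method names `index` and `query` and the `extra_data` argument are taken from the reply above. It assumes that a label stored through `extra_data` is kept next to the vector and handed back with each match.

```python
import math
import random
from collections import defaultdict


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


class TinyLSH:
    """Illustrative single-table LSH (hypothetical, not the LSHash library)."""

    def __init__(self, hash_size, input_dim, seed=0):
        rng = random.Random(seed)
        # One random hyperplane per hash bit.
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(input_dim)]
            for _ in range(hash_size)
        ]
        self.buckets = defaultdict(list)

    def _hash(self, vec):
        # Each bit records which side of a hyperplane the vector lies on.
        return "".join(
            "1" if sum(p * v for p, v in zip(plane, vec)) > 0 else "0"
            for plane in self.planes
        )

    def index(self, vec, extra_data=None):
        # Store the vector together with its label in the matching bucket.
        self.buckets[self._hash(vec)].append((tuple(vec), extra_data))

    def query(self, vec, num_results=None):
        # Only vectors in the query's own bucket are candidates.
        candidates = self.buckets.get(self._hash(vec), [])
        ranked = sorted(
            ((item, cosine_distance(vec, item[0])) for item in candidates),
            key=lambda pair: pair[1],
        )
        return ranked[:num_results]


# Toy 3-dimensional "word vectors" (made-up values for illustration).
vectors = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.95],
}

lsh = TinyLSH(hash_size=4, input_dim=3)
for word, vec in vectors.items():
    lsh.index(vec, extra_data=word)  # the label travels with the vector

for (vec, label), dist in lsh.query(vectors["king"]):
    print(label, round(dist, 4))
```

There is no need to look up bucket numbers by hand in this sketch: `query` hashes the query vector itself and compares it only against the vectors stored in the same bucket. Because only one bucket is searched, a genuinely close vector can be missed when it falls on the other side of a hyperplane, which is why implementations commonly use several hash tables.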

Dicksonchin93 commented 6 years ago

What format does the 'extra_data' parameter need to be in? Does it store a label alongside the vector, so that a query can return the label rather than the vector itself?