kayzhu / LSHash

A fast Python implementation of locality sensitive hashing.
MIT License

Is it possible to query results based on a threshold? #26

Open tom-riddle opened 5 years ago

tom-riddle commented 5 years ago

Many LSH implementations use Jaccard similarity to return only matches above a certain threshold, say an 80% match. Is it possible to implement the same in this library?

Riroaki commented 5 years ago

Same thoughts. I think zipping arrays into lower dimensions and using a smaller input_dim may help, since fewer dimensions increase the probability of collision, so similar vectors are more likely to fall into the same slot.

Riroaki commented 5 years ago

Sorry, in my comment above I meant hash_size, not input_dim.
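
To make the hash_size point concrete, here is a rough sketch of the collision probability. It assumes each hash bit is the sign of a projection onto an independent random Gaussian hyperplane, which may not match the library's hashing in every detail. The function name and the num_hashtables argument as used here are only for illustration and are not part of the library's API.

```python
import math


def collision_probability(cosine_similarity, hash_size, num_hashtables=1):
    """Chance that two vectors share a bucket in at least one hash table.

    Assumption: each hash bit is the sign of a projection onto an
    independent random Gaussian hyperplane.
    """
    # Angle between the two vectors in radians (input clamped for safety).
    theta = math.acos(max(-1.0, min(1.0, cosine_similarity)))
    # One random hyperplane leaves both vectors on the same side
    # with probability 1 - theta / pi.
    p_bit = 1.0 - theta / math.pi
    # All hash_size bits must agree for a collision within one table.
    p_table = p_bit ** hash_size
    # A match is found if at least one table produces a collision.
    return 1.0 - (1.0 - p_table) ** num_hashtables


if __name__ == "__main__":
    # Smaller hash_size gives a higher collision probability at the same similarity.
    for hash_size in (4, 8, 16):
        print(hash_size, round(collision_probability(0.8, hash_size), 4))
```

Under this assumption a smaller hash_size raises recall for similar vectors, but it also lets more dissimilar vectors into the same slot, so the candidates still need to be checked afterwards.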

AmeerahAlshahrani commented 1 year ago

How can I use a threshold with this LSH implementation? Could you help me? I have the same problem.
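
One possible workaround, sketched below, is to take whatever candidates a query returns and keep only those whose similarity to the query is at or above the threshold. This is not a feature of the library. The helper names are hypothetical, and cosine similarity is used in place of Jaccard because the vectors here are real-valued. The candidate list stands in for the vectors a query would return.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as similar to nothing.
    return dot / (norm_a * norm_b)


def filter_by_threshold(query, candidates, threshold=0.8):
    """Return (candidate, similarity) pairs with similarity >= threshold, best first."""
    scored = [(cand, cosine_similarity(query, cand)) for cand in candidates]
    kept = [(cand, sim) for cand, sim in scored if sim >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidates, standing in for the vectors returned by an LSH query.
candidates = [(1, 2, 3, 4), (4, 3, 2, 1), (1, 2, 3, 5)]
matches = filter_by_threshold((1, 2, 3, 4), candidates, threshold=0.8)
for vector, similarity in matches:
    print(vector, round(similarity, 3))
```

The filter only sees vectors that already collided with the query, so true matches that landed in other slots are still missed. Lowering hash_size or adding hash tables trades speed for fewer such misses.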