ekzhu / datasketch

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
https://ekzhu.github.io/datasketch
MIT License
2.53k stars 294 forks

I am a beginner. Do you have a case of querying picture data #169

Open With-the-sun opened 2 years ago

With-the-sun commented 2 years ago

I am a beginner and want to use LSH to quickly query images or other types of data. Does this project only support text data?

With-the-sun commented 2 years ago

Hi. The examples use MinHash with Jaccard distance. Does that mean feature vectors extracted from images cannot replace the datasets set1, set2 and set3 (in "datasketch-master/examples/lsh_examples.py")?

ekzhu commented 2 years ago

> I am a beginner and want to use LSH to quickly query images or other types of data. Does this project only support text data?

WeightedMinHash can be used for images, but I think feature engineering is required to prepare the input vectors.

ekzhu commented 2 years ago

> Hi. The examples use MinHash with Jaccard distance. Does that mean feature vectors extracted from images cannot replace the datasets set1, set2 and set3 (in "datasketch-master/examples/lsh_examples.py")?

MinHash in this package is designed to work with textual data, so using it on image data is not recommended.

You can try WeightedMinHash instead.

branaway commented 3 weeks ago

Is there any skeleton code, perhaps with OpenCV SIFT descriptors?
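
One possible skeleton for the feature engineering step, offered as an untested-on-real-images sketch rather than anything the maintainers have endorsed. It swaps in a bag-of-visual-words histogram: SIFT descriptors are assigned to their nearest entry in a codebook, and the counts become the non-negative weight vector. The helper names `sift_descriptors` and `bovw_histogram` are hypothetical, and the codebook is assumed to come from clustering descriptors of some training images (random values stand in for it here).

```python
import numpy as np


def sift_descriptors(image_path):
    """Return an (N, 128) array of SIFT descriptors for one image."""
    import cv2  # lazy import: only this helper needs opencv-python (>= 4.4)

    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(image_path)
    _keypoints, descriptors = cv2.SIFT_create().detectAndCompute(gray, None)
    if descriptors is None:  # no keypoints were detected
        return np.empty((0, 128), dtype=np.float32)
    return descriptors


def bovw_histogram(descriptors, codebook):
    """Count how many descriptors are nearest to each codebook row."""
    if len(descriptors) == 0:
        return np.zeros(len(codebook), dtype=np.float64)
    # Squared Euclidean distance from every descriptor to every visual word.
    diffs = descriptors[:, None, :].astype(np.float64) - codebook[None, :, :]
    nearest = np.argmin((diffs ** 2).sum(axis=2), axis=1)
    return np.bincount(nearest, minlength=len(codebook)).astype(np.float64)


# Synthetic stand-ins so the sketch runs without any image files.
rng = np.random.default_rng(0)
codebook = rng.uniform(0, 255, size=(16, 128))  # hypothetical visual words
fake_descriptors = rng.uniform(0, 255, size=(200, 128)).astype(np.float32)
hist = bovw_histogram(fake_descriptors, codebook)
print(hist)
```

With real images, `bovw_histogram(sift_descriptors(path), codebook)` would replace the synthetic data. The resulting histogram can then be passed to `WeightedMinHashGenerator(len(codebook)).minhash(hist)`; note that `minhash` rejects an all-zero vector, so images with no keypoints need to be skipped.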