rom1504 / clip-retrieval

Easily compute clip embeddings and build a clip retrieval system with them
https://rom1504.github.io/clip-retrieval/
MIT License

Find certain image in the dataset #165

Closed D0miH closed 2 years ago

D0miH commented 2 years ago

Hi @rom1504,

first of all thanks for this awesome project. Comes in really handy :)

The problem I have is the following: I have an image and I want to check whether it occurs in the LAION-400M dataset. Using the frontend of your project, I see that only similar images are returned, not exact matches. It would be cool to get the exact match instead of only the k nearest neighbours. I figure that in this case the cosine similarity would be 1.

From reading the docs I couldn't find an option to do this using ClipClient. Do you maybe have any idea how I could perform exact image search?

Thanks in advance and best, Dominik

rom1504 commented 2 years ago

Hello,

This is based on embeddings and approximate kNN, so the similarity won't be exactly 1. You may disable aesthetic scoring and other options, such as safety and violence filtering, to increase the chance of finding an exact match.
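
A minimal sketch of that suggestion, not a confirmed recipe. It assumes that `ClipClient` accepts `aesthetic_weight`, `use_safety_model` and `use_violence_detector`, that setting `aesthetic_weight=0` is what turns aesthetic scoring off, and that each result carries a `similarity` field. The backend URL, the index name, the helper names and the `0.98` threshold are placeholders chosen for illustration. Since the search is approximate, a high threshold stands in for "similarity equals 1".

```python
from typing import Dict, List


def query_without_rerankers(image_path: str, num_images: int = 40) -> List[Dict]:
    """Query a clip-retrieval backend with scoring and filtering turned off.

    Assumptions: the URL and index name are placeholders, and
    aesthetic_weight=0 is assumed to disable aesthetic scoring.
    """
    # Imported lazily so the helper below works without clip-retrieval installed.
    from clip_retrieval.clip_client import ClipClient, Modality

    client = ClipClient(
        url="https://knn.laion.ai/knn-service",  # placeholder backend URL
        indice_name="laion_400m",  # placeholder index name
        modality=Modality.IMAGE,
        num_images=num_images,
        aesthetic_weight=0,  # assumed to switch off aesthetic scoring
        use_safety_model=False,  # no safety filtering
        use_violence_detector=False,  # no violence filtering
    )
    return client.query(image=image_path)


def likely_exact_matches(results: List[Dict], threshold: float = 0.98) -> List[Dict]:
    """Keep results whose similarity is at or above a hypothetical threshold.

    Results are returned with the highest similarity first.
    """
    hits = [r for r in results if r.get("similarity", 0.0) >= threshold]
    return sorted(hits, key=lambda r: r["similarity"], reverse=True)
```

Calling `likely_exact_matches(query_without_rerankers("my_image.jpg"))` would then return the candidates worth checking by eye. An empty list only means that no close match was retrieved, not that the image is absent from the dataset.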

D0miH commented 2 years ago

Thanks for the fast answer. Disabling aesthetic scoring did the trick. I am now able to find the same image (if it exists).

Thanks for your help 👍🏼