mlfoundations / datacomp

DataComp: In search of the next generation of multimodal datasets
http://datacomp.ai/

Text search over CommonPool #46

Closed sedol1339 closed 1 year ago

sedol1339 commented 1 year ago

Is there any web UI for performing text search over CommonPool? I want to use it to collect my own dataset for few-shot vision tasks.

(I know there is already a search engine over LAION-5B, but I need the search not to be model-assisted, to avoid model bias. That is not the case for LAION-5B: it searches by CLIP image embeddings, and LAION-5B itself is model-filtered. So searching over CommonPool seems like a better solution.)

To implement this myself, I guess I need some search engine. Would you recommend Apache Solr or something else? Are there any common practices here?
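
For readers following along, the model-free keyword search asked about here can be sketched as a small inverted index over caption text. This is only an illustration under stated assumptions: the `captions` list is toy data, the function names are hypothetical, and at CommonPool scale a dedicated engine (such as the Apache Solr mentioned above) would take the place of the in-memory dictionary.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(captions):
    """Map each token to the set of caption ids that contain it."""
    index = defaultdict(set)
    for doc_id, caption in enumerate(captions):
        for token in tokenize(caption):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return sorted ids of captions containing every query token (AND semantics)."""
    tokens = tokenize(query)
    if not tokens:
        return []
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return sorted(result)


# Toy captions standing in for the caption column of a dataset (assumption).
captions = [
    "A red bicycle leaning against a wall",
    "Red sports car on a track",
    "Child riding a bicycle in the park",
]
index = build_index(captions)
hits = search(index, "red bicycle")
```

Because matching relies only on the caption tokens, no model is involved in ranking or filtering, which is the property the question asks for.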

rom1504 commented 1 year ago

Depending on your goals, maybe the best way is to build an index with clip-retrieval over the provided CLIP embeddings.
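
As a rough illustration of what searching an embedding index amounts to, the sketch below does a brute-force cosine-similarity lookup in plain Python. It is not clip-retrieval's API: the function names and the three-dimensional toy vectors are assumptions, and in practice the query vector would come from a CLIP text encoder and the lookup from an approximate nearest-neighbor index. Note that this kind of search is model-assisted, unlike keyword matching on captions.

```python
import math


def normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query, embeddings, k=2):
    """Return indices of the k embeddings most cosine-similar to the query."""
    q = normalize(query)
    scored = []
    for idx, emb in enumerate(embeddings):
        e = normalize(emb)
        score = sum(a * b for a, b in zip(q, e))  # cosine similarity
        scored.append((score, idx))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy vectors standing in for precomputed image embeddings (assumption).
embeddings = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 1.0, 0.0],
]
# Toy vector standing in for an embedded text query (assumption).
query = [1.0, 0.05, 0.0]
nearest = top_k(query, embeddings, k=2)
```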

On Sun, Aug 13, 2023, 11:30 Oleg Sedukhin @.***> wrote:

Is there any web UI for performing text search over CommonPool? I want to use it to collect my own dataset for few-shot vision tasks.

I guess I need some search engine to implement this by myself. Would you recommend Apache Solr or other? Any common practices here?
