rom1504 / clip-retrieval

Easily compute clip embeddings and build a clip retrieval system with them
https://rom1504.github.io/clip-retrieval/
MIT License

How to map LAION-5B's images to pre-computed CLIP embeddings? #268

Closed · LeoXing1996 closed this issue 1 year ago

LeoXing1996 commented 1 year ago

I have already downloaded the LAION 5B dataset from https://huggingface.co/datasets/laion/laion2B-en, and now I want to use the pre-computed CLIP embedding in https://huggingface.co/datasets/laion/laion2b-en-vit-l-14-embeddings/tree/main.

However, I found that the metadata (or image order?) in those two repos is mismatched. How can I map the CLIP embeddings to the downloaded laion2B-en dataset?

rom1504 commented 1 year ago

It's in the same order as the metadata here: https://huggingface.co/datasets/laion/laion2b-en-vit-l-14-embeddings/tree/main/metadata
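That is, row i of each img_emb_*.npy file corresponds to row i of the matching metadata parquet file. A minimal sketch of pairing them, assuming the shard naming used in that repo and a "url" column in the parquet (check the actual schema):

```python
# Minimal sketch: pair one embedding shard with its metadata shard by row
# position. File names follow the HF repo layout; the "url" column name
# is an assumption -- check the parquet schema before relying on it.
import numpy as np
import pandas as pd

embeddings = np.load("img_emb/img_emb_0000.npy")              # (n, 768) for ViT-L/14
metadata = pd.read_parquet("metadata/metadata_0000.parquet")  # n rows, same order
assert len(embeddings) == len(metadata)

# Row i of the metadata describes embedding i.
for i in range(3):
    print(metadata.iloc[i]["url"], embeddings[i, :4])
```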

LeoXing1996 commented 1 year ago

@rom1504, thanks for your answer!

I apologize if my question was unclear. I am looking for guidance on how to align the metadata for the CLIP embeddings with the metadata for laion2B-en, e.g.:

[image: example of the mismatched metadata in the two repos]

Can you provide any assistance with this matter?

rom1504 commented 1 year ago

What is the purpose of doing that?

The metadata next to the embeddings is also the laion2B metadata, just in a different order.
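So one way to build the mapping is to join the two metadata sets on a shared key, e.g. a hash of url + caption. A sketch, assuming both parquet sets expose url/caption columns (the real column names may differ, e.g. uppercase URL/TEXT in the laion2B-en parquets) and that the hash is unique enough:

```python
# Sketch: map laion2B-en rows to rows of the embedding shards by joining
# the two metadata sets on a hash of (url, caption). File paths and column
# names are illustrative assumptions, not the official layout.
import hashlib
import pandas as pd

def sample_key(url, caption):
    return hashlib.md5((str(url) + str(caption)).encode()).hexdigest()

emb_meta = pd.read_parquet("embeddings/metadata_0000.parquet")
emb_meta["key"] = [sample_key(u, c) for u, c in zip(emb_meta["url"], emb_meta["caption"])]
emb_meta["emb_index"] = range(len(emb_meta))  # row index == embedding row

ds_meta = pd.read_parquet("laion2B-en/part-00000.parquet")
ds_meta["key"] = [sample_key(u, c) for u, c in zip(ds_meta["URL"], ds_meta["TEXT"])]

# For each laion2B-en row, find the matching row inside the embedding shard.
joined = ds_meta.merge(emb_meta[["key", "emb_index"]], on="key", how="left")
print(joined["emb_index"].notna().mean(), "fraction of rows matched")
```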

LeoXing1996 commented 1 year ago

@rom1504, I have already downloaded LAION-2B-en and converted it to WebDataset format.

If I want to load the pre-computed CLIP embeddings together with the images during training, one good practice is to re-sort the embedding files so that they align with the images in the tar files.
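The kind of re-sorting I mean, as a rough sketch (assuming img2dataset-style tars with a .json per sample containing its "url", and a url-to-row mapping over the embedding shards like the join above; all names illustrative):

```python
# Rough sketch of re-sorting embeddings to tar order: walk a tar, look up
# each sample's embedding row by url, and write one aligned .npy per tar.
# Assumes a url_to_row dict built from the metadata join; illustrative only.
import json
import tarfile
import numpy as np

def reorder_shard(tar_path, embeddings, url_to_row, out_path):
    rows = []
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if member.name.endswith(".json"):
                meta = json.load(tar.extractfile(member))
                rows.append(embeddings[url_to_row[meta["url"]]])
    np.save(out_path, np.stack(rows))  # row i now matches sample i of the tar

# reorder_shard("00000.tar", embeddings, url_to_row, "img_emb_00000.npy")
```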

rom1504 commented 1 year ago

Ok see https://github.com/lucidrains/DALLE2-pytorch#decoder-image-embedding-dataset

And in particular https://github.com/Veldrovive/embedding-dataset-reordering
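The second repo reorders the embedding files so that each tar gets one .npy with rows in sample order, which is roughly the layout the DALLE2-pytorch decoder loader consumes. A hedged sketch of reading that layout with webdataset (all paths illustrative):

```python
# Hedged sketch: iterate a tar alongside its aligned per-shard embedding
# file. Valid only if the .npy rows follow the tar's sample order (the
# point of the reordering step). Paths and extensions are illustrative.
import numpy as np
import webdataset as wds

def samples_with_embeddings(tar_path, npy_path):
    emb = np.load(npy_path, mmap_mode="r")
    dataset = wds.WebDataset(tar_path).decode("pil").to_tuple("jpg", "json")
    for i, (image, meta) in enumerate(dataset):
        yield image, emb[i], meta

# for image, embedding, meta in samples_with_embeddings("00000.tar", "img_emb_00000.npy"):
#     ...
```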


DJLee68 commented 7 months ago

Did you solve this? @LeoXing1996

LeoXing1996 commented 7 months ago

> Did you solve this? @LeoXing1996

@DJLee68, no, I recomputed all the embeddings myself 😢
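For anyone else landing here: recomputing can be done with this repo's own inference entry point. A minimal sketch, with argument names as documented in the README; double-check against the current version:

```python
# Minimal sketch of recomputing embeddings for existing tars with
# clip-retrieval's inference entry point. Argument names follow the README
# at the time of writing; if the top-level export differs, use the
# `clip-retrieval inference` CLI instead.
from clip_retrieval import clip_inference

clip_inference(
    input_dataset="laion2B-en/{00000..00099}.tar",  # illustrative shard range
    output_folder="recomputed_embeddings",
    input_format="webdataset",
    clip_model="ViT-L/14",
    enable_metadata=True,
)
```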

DJLee68 commented 7 months ago

> Did you solve this? @LeoXing1996
>
> @DJLee68, no, I recomputed all the embeddings myself 😢

I'm having the same problem.

> Ok see https://github.com/lucidrains/DALLE2-pytorch#decoder-image-embedding-dataset
>
> And in particular https://github.com/Veldrovive/embedding-dataset-reordering

I tried the repos above that @rom1504 mentioned, but they didn't work for us.

So did you match the whole set of LAION-2B embeddings to the LAION-2B image WebDataset using the LAION-2B metadata?