Hello,
The paper states that 700k images from Farfetch were used for FashionCLIP 1.0, but your Hugging Face page says:
"We postulate that the performance gains afforded by laion/CLIP-ViT-B-32-laion2B-s34B-b79K are due to the increased training data (5x OpenAI CLIP data)."
What kind of data was used? Was it general-purpose data, or only fashion-related data like in 1.0? Was it still from Farfetch or from somewhere else? It would be great if you could explain a bit more about how you got the 5x dataset.
The pre-trained model we use is laion/CLIP-ViT-B-32-laion2B-s34B-b79K; you can find more information about it in the open_clip repository. We believe that the 2B images seen by this model contain more fashion-related content than the original OpenAI CLIP training set.
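For reference, here is a minimal sketch of loading that checkpoint through open_clip's Hugging Face hub integration and scoring a few candidate captions against an image. The image path and captions are placeholders for illustration, not anything from the FashionCLIP training setup:

```python
import torch
from PIL import Image
import open_clip

# Load the LAION-2B pretrained ViT-B/32 checkpoint via open_clip's hf-hub support.
model, _, preprocess = open_clip.create_model_and_transforms(
    "hf-hub:laion/CLIP-ViT-B-32-laion2B-s34B-b79K"
)
tokenizer = open_clip.get_tokenizer("hf-hub:laion/CLIP-ViT-B-32-laion2B-s34B-b79K")
model.eval()

# "dress.jpg" is a placeholder path; substitute any product image.
image = preprocess(Image.open("dress.jpg")).unsqueeze(0)
text = tokenizer(["a red dress", "a pair of sneakers", "a leather handbag"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize embeddings so the dot product is cosine similarity.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # probability of each caption matching the image
```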