Hello,
The paper states that 700k images from Farfetch were used for FashionCLIP 1.0, but your Hugging Face page says:
"We postulate that the performance gains afforded by laion/CLIP-ViT-B-32-laion2B-s34B-b79K are due to the increased training data (5x OpenAI CLIP data)."
What kind of data was used? Was it general-purpose data, or only fashion-related data like in 1.0? Was it still from Farfetch or from somewhere else? It would be great if you could explain a bit more about how you got the 5x dataset.
The pre-trained model we use is laion/CLIP-ViT-B-32-laion2B-s34B-b79K; you can find more information about it in the open_clip repository. We believe that the 2B images seen by this model contain more fashion-related content than the original OpenAI CLIP training set.
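For reference, here is a minimal sketch of loading that checkpoint through open_clip's Hugging Face hub integration and scoring a few candidate captions against an image. The image path and captions are placeholders for illustration, not anything from the FashionCLIP training setup:

```python
import torch
from PIL import Image
import open_clip

# Load the LAION-2B pretrained ViT-B/32 checkpoint via open_clip's hf-hub support.
model, _, preprocess = open_clip.create_model_and_transforms(
    "hf-hub:laion/CLIP-ViT-B-32-laion2B-s34B-b79K"
)
tokenizer = open_clip.get_tokenizer("hf-hub:laion/CLIP-ViT-B-32-laion2B-s34B-b79K")
model.eval()

# "dress.jpg" is a placeholder path; substitute any product image.
image = preprocess(Image.open("dress.jpg")).unsqueeze(0)
text = tokenizer(["a red dress", "a pair of sneakers", "a leather handbag"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize embeddings so the dot product is cosine similarity.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # probability of each caption matching the image
```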