Open geekchen007 opened 10 months ago
LAION-2B English
There may be a small amount of Chinese mixed into that.
But are you sure it really performs well in Chinese?
You can run the multilingual benchmark with some multilingual models to compare.
I did not use a particularly strict dataset for comparison, but I searched for common Chinese words such as "cat", "dog", "dance", and "red clothes", and the top-k results were correct.
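The informal check described above can be sketched as a small top-1 retrieval evaluation. This is a minimal sketch with toy vectors standing in for real embeddings; in practice the query and document vectors would come from the model's text and image encoders:

```python
import math

def cosine(a, b):
    # cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=1):
    # rank documents by cosine similarity to the query, best first
    ranked = sorted(doc_vecs.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# Toy embeddings standing in for real CLIP text/image features (hypothetical values).
docs = {
    "cat_photo": [0.9, 0.1, 0.0],
    "dog_photo": [0.1, 0.9, 0.0],
    "red_dress_photo": [0.0, 0.2, 0.9],
}
queries = {
    "猫": [0.8, 0.2, 0.1],      # "cat"
    "狗": [0.2, 0.8, 0.0],      # "dog"
    "红色衣服": [0.1, 0.1, 0.8],  # "red clothes"
}
expected = {"猫": "cat_photo", "狗": "dog_photo", "红色衣服": "red_dress_photo"}

hits = sum(top_k(q_vec, docs)[0] == expected[q] for q, q_vec in queries.items())
print(f"top-1 accuracy: {hits}/{len(queries)}")  # prints "top-1 accuracy: 3/3"
```

A stricter comparison would run the same loop over a labeled multilingual benchmark set with two different text encoders and compare the accuracy numbers.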
Why does laion2b_e16 (ViT-B-32::laion2b_e16) perform well in Chinese/English search (e.g. "猫/cat", "狗/dog") but poorly in Japanese or French? What is the composition of the dataset the model was trained on? Is it a CLIP ViT-B/32 model trained on the LAION-2B English subset of LAION-5B, on the laion2B-multi-chinese-subset, or on something else?