Siegi96 opened 2 years ago
I only evaluated the English CLIP model. It does not perform that well on general embedding tasks.
Evaluated on 14 tasks, it achieves an average score of 57.5, while the paraphrase-v2 models reach 65-67 on the same tasks.
Even though the model is able to handle images, a performance/benchmark comparison against other text2vec models on text-only datasets would be interesting.
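For reference, such a text-only comparison typically scores each model by the Spearman correlation between its cosine similarities and human gold labels on STS-style sentence pairs. A minimal sketch of that evaluation loop, using a dummy stand-in encoder (in practice you would plug in CLIP's text encoder and a paraphrase-v2 model, e.g. via sentence-transformers):

```python
import numpy as np

def cosine_sim(a, b):
    # Row-wise cosine similarity between two embedding matrices.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return (a * b).sum(axis=1)

def spearman(x, y):
    # Spearman rank correlation (no tie handling, fine for illustration).
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

def sts_score(encode, pairs, gold):
    # Standard STS-style metric: Spearman correlation between model
    # similarities and the human-annotated gold similarity labels.
    emb1 = encode([p[0] for p in pairs])
    emb2 = encode([p[1] for p in pairs])
    return spearman(cosine_sim(emb1, emb2), np.asarray(gold))

# Hypothetical stand-in encoder; a real comparison would call e.g.
# SentenceTransformer('clip-ViT-B-32').encode versus a paraphrase-v2 model.
rng = np.random.default_rng(0)
def dummy_encode(sentences):
    return rng.normal(size=(len(sentences), 512))

pairs = [("a cat sits", "a cat is sitting"),
         ("a dog runs", "stock prices fell"),
         ("it is raining", "rain is falling")]
gold = [4.8, 0.2, 4.5]
print(sts_score(dummy_encode, pairs, gold))
```

Averaging this score over the 14 task datasets would reproduce the kind of aggregate numbers quoted above.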