Open aretius opened 5 days ago
Hi @aretius, there are a few differences. The main one is the dataset: FashionSigLIP was trained on a smaller but richer dataset. This also changed how the loss was computed between the two models. We used 7 text fields and 1 image field, so the loss covered all combinations of image and text, as well as the mean of the text vectors (i.e. the fused text embedding).
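To make the loss description above concrete, here is a rough sketch of what such a multi-field loss could look like. It is not the actual training code. It assumes a SigLIP-style pairwise sigmoid loss, assumes "all combinations" means the image paired with each of the text fields plus the image paired with the fused (mean) text vector, and weights every term equally. The function names, the `temperature` and `bias` values, and the plain-Python vectors are all hypothetical and chosen only to keep the example self-contained.

```python
import math


def normalize(v):
    """Scale a vector to unit length (assumes v is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors (the 'fused' text vector)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sigmoid_pair_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    """SigLIP-style loss: every (image i, text j) pair in the batch is a
    binary classification, positive when i == j and negative otherwise."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    total = 0.0
    for i, img in enumerate(imgs):
        for j, txt in enumerate(txts):
            logit = temperature * sum(a * b for a, b in zip(img, txt)) + bias
            label = 1.0 if i == j else -1.0
            # -log(sigmoid(label * logit)) written as a stable softplus.
            z = -label * logit
            total += max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return total / len(imgs)


def multi_field_loss(image_embs, text_field_embs):
    """Average the image-text loss over each text field and the fused vector.

    text_field_embs[f][i] is the embedding of text field f for product i.
    Equal weighting of the terms is an assumption, not taken from the thread.
    """
    losses = [sigmoid_pair_loss(image_embs, field) for field in text_field_embs]
    # Fuse the fields of each product by averaging their text vectors.
    fused = [mean_vector(per_product) for per_product in zip(*text_field_embs)]
    losses.append(sigmoid_pair_loss(image_embs, fused))
    return sum(losses) / len(losses)


# Toy batch: 2 products, 7 text fields, 2-dimensional embeddings.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_fields = [[[1.0, 0.0], [0.0, 1.0]] for _ in range(7)]
swapped_fields = [[[0.0, 1.0], [1.0, 0.0]] for _ in range(7)]

aligned_loss = multi_field_loss(images, aligned_fields)
swapped_loss = multi_field_loss(images, swapped_fields)
```

On this toy batch the loss is lower when each product's text fields point in the same direction as its image than when they are swapped between products, which is the behaviour one would expect from any loss of this shape.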
Got it, so the dataset and loss were different. Did you also evaluate both models on text- and image-based retrieval? If so, what differences did you notice?
Hello, thanks for sharing the e-commerce embedding model that's beating SOTA by a nice margin. I wanted to understand how the e-commerce embeddings differ from the Marqo SigLIP model trained earlier, from a metrics standpoint. Did you ever compare the two on the same data for retrieval/search?