Open Muennighoff opened 5 months ago
I know some of these are already covered, and some of them I can't seem to find (dan_news_summ_test). Do we have more references on these?
For convenience, here is a list (I have not checked every entry in it):
(@imenelydiaker can you have a look at these?)
Lots of multilingual datasets are listed here: https://docs.google.com/spreadsheets/d/1qf0iYejG-9RgEEi13qB_SK_178-eNaeJDmSDNSj260A/edit?gid=1875159366#gid=1875159366 (from https://blog.voyageai.com/2024/06/10/voyage-multilingual-2-multilingual-embedding-model/). I imagine some of them are not in MTEB yet; it would be great to have them 🙌