This is the dataset in question. I think it's helpful because it covers 7 very low-resource languages (Ido, Breton, Western Frisian, Walloon, Volapük, Norwegian Nynorsk, Aragonese) that have not been included in any other dataset.
@KennethEnevoldsen @isaac-chung @imenelydiaker
@Art3mis0707 I don't believe senti-lex is a reasonable fit for MTEB. The benchmark concerns itself with document representations, and I would argue that single-word representation does not fall into that category.
Will sentiment analysis of words be considered under the "s2s" category?
https://huggingface.co/datasets/senti_lex