embeddings-benchmark / mteb

MTEB: Massive Text Embedding Benchmark
https://arxiv.org/abs/2210.07316
Apache License 2.0

Leaderboard: Fix code benchmarks #1439

Closed: x-tabdeveloping closed this issue 1 week ago

x-tabdeveloping commented 2 weeks ago

Currently, CoIR and MTEB(Code) do not work with the new leaderboard. They cover a completely different set of languages from the other benchmarks, which confuses the Gradio elements: those were configured with the languages present in the Multilingual benchmark as their only list of choices.

This issue also raises the question of whether we want these benchmarks on the main leaderboard. @KennethEnevoldsen @Muennighoff @isaac-chung

isaac-chung commented 2 weeks ago

I think we should support them as well. Can we aggregate all the languages across all benchmarks, or is it more complicated than that?
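
For illustration, the aggregation idea could look roughly like the sketch below. It is not taken from the mteb codebase: the `BENCHMARK_LANGUAGES` mapping, the language codes in it, and the helpers `aggregate_languages` and `selection_is_valid` are all hypothetical, and the real leaderboard may store and validate languages differently.

```python
from typing import Dict, Iterable, List

# Hypothetical data: benchmark name -> languages its tasks cover.
# The codes below are illustrative, not the real mteb metadata.
BENCHMARK_LANGUAGES: Dict[str, List[str]] = {
    "MTEB(Multilingual)": ["eng", "deu", "fra"],
    "MTEB(Code)": ["python", "java", "eng"],
    "CoIR": ["python", "go"],
}


def aggregate_languages(benchmarks: Dict[str, Iterable[str]]) -> List[str]:
    """Return the sorted union of languages across all benchmarks."""
    all_languages = set()
    for languages in benchmarks.values():
        all_languages.update(languages)
    return sorted(all_languages)


def selection_is_valid(selected: Iterable[str], choices: Iterable[str]) -> bool:
    """Check that every selected language is among the allowed choices."""
    return set(selected) <= set(choices)


# Choices built from every benchmark, not only the multilingual one.
choices = aggregate_languages(BENCHMARK_LANGUAGES)
```

With choices built this way, the languages of any single benchmark are always a subset of the list offered by the language selector, so switching to CoIR or MTEB(Code) no longer produces values outside the allowed choices.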

x-tabdeveloping commented 1 week ago

I'll look into it.

x-tabdeveloping commented 1 week ago

Turns out it was pretty easy to fix: #1441