Open stephantul opened 2 weeks ago
The intended use is:
import mteb
from mteb.benchmarks import MTEB_MAIN_EN
evaluation = mteb.MTEB(tasks=MTEB_MAIN_EN)
evaluation.run(embedder, eval_splits=["test"], output_folder=f"results/{name}")
And yes, this results in some deprecation warnings, as some of the datasets have now been updated (e.g. to run faster). All the tasks are still runnable. We will soon release v2 of the English benchmark, a zero-shot and notably faster version that uses the updated tasks. However, we will maintain support for MTEB v1.
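Since the v1 tasks still run correctly, the deprecation warnings can simply be silenced with the standard library while evaluating. A minimal sketch; the lambda here is a hypothetical stand-in for an evaluation.run(...) call that emits a DeprecationWarning:

```python
import warnings

def run_quietly(fn):
    """Call fn with DeprecationWarning suppressed (the v1 tasks still run fine)."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return fn()

# Stand-in for evaluation.run(...); warnings.warn returns None, so "ok" is returned.
result = run_quietly(lambda: warnings.warn("dataset updated", DeprecationWarning) or "ok")
```

The suppression is scoped to the with-block, so warnings elsewhere in the program are unaffected.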
To make this slightly easier I have updated the interface in #1208; once merged, it should be (though the old approach will still work):
import mteb
benchmark = mteb.get_benchmark("MTEB(eng)")
evaluation = mteb.MTEB(tasks=benchmark)
evaluation.run(embedder, eval_splits=["test"], output_folder=f"results/{name}")
Edit: this PR also ensures that only the English subsets are used, which means you don't need to pass task_langs=["en"].
Hello,
We are interested in evaluating MTEB only on English (our model is neither multi- nor cross-lingual). That is, for multilingual datasets we would only like to select the "eng" subset, and for cross-lingual datasets we would only like to evaluate on "en-en". This used to be possible, in a roundabout way, like so:
Removing the cross-lingual and multilingual tasks doesn't work, as this removes entire tasks, not subsets of tasks.
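The distinction matters because the language filtering has to happen per subset, not per task. A plain-Python sketch of the desired selection, using hypothetical task metadata (invented names, not the actual mteb API):

```python
# Hypothetical task metadata: task name -> available language subsets.
TASKS = {
    "SomeMultilingualSTS": ["en", "de", "pl"],      # multilingual: keep only "en"
    "SomeBitextMining": ["de-en", "en-en", "fr-en"],  # cross-lingual: keep only "en-en"
    "SomeEnglishClassification": ["en"],            # monolingual English: keep as-is
    "SomeNonEnglishTask": ["de", "fr"],             # no English subset: drop entirely
}

def english_subsets(tasks, keep=("en", "eng", "en-en")):
    """Select only the English subsets of each task, dropping tasks with none."""
    out = {}
    for name, langs in tasks.items():
        selected = [lang for lang in langs if lang in keep]
        if selected:
            out[name] = selected
    return out
```

Dropping whole tasks would lose the English subsets of the multilingual ones; filtering per subset keeps them.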
But this now gets me a bunch of DeprecationWarnings. Is there a nice way to replicate the behavior of this piece of code? I think it would be nice to still have a good way of running just the English MTEB tasks, without having to run on a lot of languages that won't work anyway.