Hi, I'm proposing to integrate the Tatoeba machine translation dataset into sotabench-eval. I have included code for running the tests, modeled after WMT, and for downloading and configuring the data. I'm not yet 100% sure how the caching is supposed to work; I'll come back to that.
Currently you can:
import sotabencheval
from sotabencheval.machine_translation import TatoebaEvaluator, TatoebaDataset
# The test data will be downloaded and unpacked under the directory "tatoeba".
# This only needs to be done if the data isn't already present.
sotabencheval.machine_translation.tatoeba.fetch_and_configure_data("tatoeba")
evaluator = TatoebaEvaluator(dataset=TatoebaDataset.v1, source_lang="eng", target_lang="deu", local_root="tatoeba", model_name="Some model", paper_arxiv_id="Some id")
evaluator.add({1: "Tom mag die italienische Küche.", 2: "Hier wirst du viel lernen."})
print(evaluator.get_results(ignore_missing=True))
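Since the download only needs to happen when the data isn't already present, the fetch call could be wrapped in a small guard. The following is a minimal sketch and not part of this PR: `ensure_data` is a hypothetical helper, and it assumes the fetch function takes the target directory as its only argument, as in the call above.

```python
import os
from typing import Callable


def ensure_data(local_root: str, fetch: Callable[[str], None]) -> bool:
    """Call fetch(local_root) only if local_root is missing or empty.

    Returns True if fetch was called, False if existing data was reused.
    Hypothetical helper for illustration; not part of sotabench-eval.
    """
    # A non-empty directory is treated as "data already present".
    if os.path.isdir(local_root) and os.listdir(local_root):
        return False
    fetch(local_root)
    return True


# Possible usage with the function from this PR (assumed signature):
# ensure_data("tatoeba", sotabencheval.machine_translation.tatoeba.fetch_and_configure_data)
```

Note that this only checks whether the directory is non-empty, so a partially downloaded archive would be treated as complete; a real check would probably look for specific files.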
You should be able to merge this without breaking anything, but please point me towards what else needs to be done...