embeddings-benchmark / mteb

MTEB: Massive Text Embedding Benchmark
https://arxiv.org/abs/2210.07316
Apache License 2.0

BrazilianToxicTweetsClassification broken #1324

Open Samoed opened 2 hours ago

Samoed commented 2 hours ago

When running tests, I encountered an error with BrazilianToxicTweetsClassification.

From the runner logs in #1300:

FAILED tests/test_benchmark/test_benchmark_integration_with_datasets.py::test_benchmark_sentence_transformer[model0-BrazilianToxicTweetsClassification] - KeyError: "Column homophobia not in the dataset. Current columns in the dataset: ['text', 'label']"

And from the runner logs in #1323:

FAILED tests/test_benchmark/test_benchmark_integration_with_datasets.py::test_benchmark_sentence_transformer[model0-task1] - FileNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

I suggest switching the "integration" tasks in the tests to tasks whose datasets are hosted under the MTEB organization, as that would give us better control over them.
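A rough sketch of how the integration tasks could be limited to datasets hosted under the MTEB organization. It is plain Python and does not call the mteb API. The helper names, task names, and dataset paths are all hypothetical, and the `mteb` namespace prefix is an assumption about how the organization's datasets are named on the Hub.

```python
def is_org_hosted(dataset_path: str, org: str = "mteb") -> bool:
    """Return True if a Hub dataset path of the form 'namespace/name' belongs to `org`."""
    namespace, sep, _name = dataset_path.partition("/")
    # A path without a "/" has no namespace, so it cannot belong to the organization.
    return bool(sep) and namespace == org


def select_integration_tasks(task_to_dataset: dict[str, str], org: str = "mteb") -> list[str]:
    """Keep only the task names whose dataset lives under the given organization."""
    return sorted(
        name for name, path in task_to_dataset.items() if is_org_hosted(path, org)
    )


# Hypothetical task names and dataset paths, for illustration only.
candidates = {
    "ExampleTaskA": "mteb/example-dataset-a",
    "ExampleTaskB": "some-user/example-dataset-b",
    "ExampleTaskC": "mteb/example-dataset-c",
}

print(select_integration_tasks(candidates))
```

With a filter like this, a dataset changed or removed by an outside owner would not break the integration tests, since only datasets we maintain would be selected.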

Samoed commented 2 hours ago

I think GitHub itself is having some issues: my main page is not loading, although githubstatus shows everything is fine.