Open Muennighoff opened 3 weeks ago
This error seems to be associated with an older version of datasets. My setup has datasets==2.20.0 and I'm not able to reproduce this error. What version of datasets are you running?
I had 3.0.2; you're right that it works now!
Hmm, that means we'd either have to pin a lower version in reqs (https://github.com/embeddings-benchmark/mteb/blob/3a18fbdafa25696080fc4fa18c1875f64d6a4010/pyproject.toml#L28) or fix the dataset somehow. It would probably be better to do the latter (and maybe reupload the fixed one to mteb on the hub & point to that instead) 🤔
Related to #1363
I would probably fix the dataset since pinning datasets seems like a bad long-term solution.
A short-term solution is to add a warning specific to the dataset.
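For illustration, a dataset-specific warning could look roughly like the sketch below. This is not mteb code: the function names, the AFFECTED_DATASETS set, and the hub ID (guessed from the cache path in the traceback posted in this thread) are all hypothetical. It also assumes the breakage starts at the 3.x major release, while the thread only confirms that 2.20.0 works and 3.0.2 does not.

```python
import warnings
from importlib import metadata

# Hypothetical list; only kor_sarcasm is named in this thread.
AFFECTED_DATASETS = {"SpellOnYou/kor_sarcasm"}

# Assumption: loading breaks from the 3.x major release onwards.
FIRST_BROKEN_MAJOR = 3


def installed_datasets_version():
    """Return the installed `datasets` version string, or None if it is absent."""
    try:
        return metadata.version("datasets")
    except metadata.PackageNotFoundError:
        return None


def warn_if_incompatible(dataset_name, version=None):
    """Warn and return True if the dataset is likely to fail to load."""
    if version is None:
        version = installed_datasets_version()
    if version is None or dataset_name not in AFFECTED_DATASETS:
        return False

    major = int(version.split(".")[0])
    if major >= FIRST_BROKEN_MAJOR:
        warnings.warn(
            f"{dataset_name} may fail to load with datasets=={version}; "
            f"consider datasets<{FIRST_BROKEN_MAJOR} until the dataset is fixed.",
            UserWarning,
        )
        return True
    return False
```

Calling warn_if_incompatible("SpellOnYou/kor_sarcasm") before loading the task would surface the problem early instead of failing inside the dataset script.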
Seems like the same issue exists with this dataset (part of MMTEB):
File "/data/huggingface/modules/datasets_modules/datasets/SpellOnYou--kor_sarcasm/00d38c200d4d563ed94efb9ff4ca119ded94fe3cdf1e381ed95274de0a9d59f0/kor_sarcasm.py", line 21, in <module>
from datasets.tasks import TextClassification
ModuleNotFoundError: No module named 'datasets.tasks'
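To check whether an environment is affected without triggering the traceback, something like the following sketch might help. The has_module helper is a hypothetical name, not part of datasets or mteb; it only relies on the standard library's importlib.util.find_spec.

```python
import importlib.util


def has_module(dotted_name):
    """Return True if the module can be located, False otherwise."""
    try:
        # find_spec imports the parent package first, so a missing parent
        # raises ModuleNotFoundError instead of returning None.
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        return False


if __name__ == "__main__":
    # False here means dataset scripts that import from datasets.tasks,
    # like the one in the traceback, cannot run in this environment.
    print("datasets.tasks available:", has_module("datasets.tasks"))
```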
@Muennighoff I believe that was fixed in #1363