anjor closed this issue 3 months ago
There seems to be a PR related to the load_dataset path that went into 2.21.0 -- https://github.com/huggingface/datasets/pull/6862/files
Taking a look at it now
+1
Downgrading to 2.20.0 fixed my issue, hopefully helpful for others.
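For anyone applying the same workaround, here is a minimal sketch of a guard that checks whether a version string is at or below the last release reported as working in this thread (2.20.0). The helper names `parse_version` and `last_known_good` are hypothetical, not part of `datasets`, and the parser only handles plain dotted numeric versions.

```python
def parse_version(text):
    """Turn '2.21.0' into (2, 21, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # e.g. the 'dev0' in '2.21.0.dev0'
        parts.append(int(piece))
    return tuple(parts)


def last_known_good(installed, good="2.20.0"):
    """True if `installed` is no newer than the version reported to work.

    The default of 2.20.0 comes from this thread, not from any official
    compatibility statement.
    """
    return parse_version(installed) <= parse_version(good)
```

In practice you could pass `datasets.__version__` to `last_known_good` before calling `load_dataset` and print a warning when it returns False.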
I tried adding a simple test to test_load.py with the alpaca eval dataset but the test didn't fail :(.
So looks like this might have something to do with the environment?
There was an issue with the script of the "tatsu-lab/alpaca_eval" dataset.
It was fixed with this PR:
It should work now if you retry to load the dataset.
Describe the bug
eval_set = datasets.load_dataset("tatsu-lab/alpaca_eval", "alpaca_eval_gpt4_baseline", trust_remote_code=True)
used to work up to 2.20.0 but doesn't work in 2.21.0.
In 2.20.0:
In 2.21.0:
Steps to reproduce the bug
pip install datasets==2.21.0
import datasets
eval_set = datasets.load_dataset("tatsu-lab/alpaca_eval", "alpaca_eval_gpt4_baseline", trust_remote_code=True)
Expected behavior
Repeat the steps above with datasets version 2.20.0 instead of 2.21.0 and the dataset loads without error.
Environment info
datasets version: 2.21.0
huggingface_hub version: 0.23.5
fsspec version: 2024.5.0
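A small sketch for collecting the same environment info programmatically, which may help others report comparable versions. It uses only the standard library's `importlib.metadata`; the function name `collect_versions` is hypothetical, and the package list simply mirrors the three packages listed above.

```python
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages=("datasets", "huggingface_hub", "fsspec")):
    """Return {package name: installed version, or 'not installed'}."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    # Print one line per package in the same layout as the report above.
    for name, ver in collect_versions().items():
        print(f"{name} version: {ver}")
```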