McGill-NLP / feedbackqa

FeedbackQA: Improving Question Answering Post-Deployment with Interactive Feedback
https://mcgill-nlp.github.io/feedbackqa/

error from load_dataset() #8

Open jinz2014 opened 2 months ago

jinz2014 commented 2 months ago

Hi, can you reproduce this error?

from datasets import load_dataset
r = load_dataset("McGill-NLP/feedbackQA")["train"]

/home/user/miniconda3/envs/trustllm/lib/python3.9/site-packages/datasets/load.py:1486: FutureWarning: The repository for McGill-NLP/feedbackQA contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/McGill-NLP/feedbackQA
You can avoid this message in future by passing the argument trust_remote_code=True.
Passing trust_remote_code=True will be mandatory to load this dataset from the next major release of datasets.
  warnings.warn(
/home/user/.cache/huggingface/datasets/downloads/8c4cae661a2aae10dee7ff50c9a5c5cd83aff73715f7520ecaeab3cf555dd2f2/feedback_test.json
Generating train split: 0%| | 0/5660 [00:00<?, ? examples/s]
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/trustllm/lib/python3.9/site-packages/datasets/builder.py", line 1748, in _prepare_split_single
    for key, record in generator:
  File "/home/user/.cache/huggingface/modules/datasets_modules/datasets/McGill-NLP--feedbackQA/20c8f938f417c88303bb7041cea9554c1d14667686d7d7c5dda83dd4f39e5dc4/feedbackQA.py", line 109, in _generate_examples
    with open(filepath, encoding="utf-8") as f:
  File "/home/user/miniconda3/envs/trustllm/lib/python3.9/site-packages/datasets/streaming.py", line 75, in wrapper
    return function(*args, download_config=download_config, **kwargs)
  File "/home/user/miniconda3/envs/trustllm/lib/python3.9/site-packages/datasets/utils/file_utils.py", line 1219, in xopen
    return open(main_hop, mode, *args, **kwargs)
NotADirectoryError: [Errno 20] Not a directory: '/home/user/.cache/huggingface/datasets/downloads/8c4cae661a2aae10dee7ff50c9a5c5cd83aff73715f7520ecaeab3cf555dd2f2/feedback_train.json'

cslizc commented 1 month ago

@jinz2014 Hi, could you please try r = load_dataset("McGill-NLP/feedbackQA", trust_remote_code=True)? It works on my end.
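
For reference, the suggested call can be written out as a small helper that also selects the "train" split, as in the original report. This is only a sketch: the function name load_feedbackqa_train and its load_fn parameter are hypothetical and were added for illustration. Whether passing trust_remote_code=True also resolves the NotADirectoryError is not confirmed in this thread.

```python
def load_feedbackqa_train(load_fn=None):
    """Return the "train" split of McGill-NLP/feedbackQA.

    load_fn is a hypothetical hook (not part of the datasets library) that
    lets a stand-in loader be passed for testing; by default the real
    datasets.load_dataset is used.
    """
    if load_fn is None:
        # Imported lazily so the helper can be defined without the library installed.
        from datasets import load_dataset as load_fn

    # trust_remote_code=True opts in to running the repository's loading
    # script, which the FutureWarning in the report asks for.
    dataset = load_fn("McGill-NLP/feedbackQA", trust_remote_code=True)
    return dataset["train"]


# Usage (requires the datasets package and network access):
# r = load_feedbackqa_train()
```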