Generating train split: 0 examples [00:00, ? examples/s]
Traceback (most recent call last):
File "/home/yibop/miniconda3/envs/unlimiformer/lib/python3.10/site-packages/datasets/builder.py", line 1750, in _prepare_split_single
for key, record in generator:
File "/home/yibop/.cache/huggingface/modules/datasets_modules/datasets/tau--sled/b5fab54723c8a515071f8b983dcb93519ae71beced5ad96f722cd22d91047229/sled.py", line 609, in _generate_examples
for key, row in gen:
File "/home/yibop/.cache/huggingface/modules/datasets_modules/datasets/tau--sled/b5fab54723c8a515071f8b983dcb93519ae71beced5ad96f722cd22d91047229/sled.py", line 520, in _scrolls_gen
with open(data_file, encoding="utf-8") as f:
File "/home/yibop/miniconda3/envs/unlimiformer/lib/python3.10/site-packages/datasets/streaming.py", line 75, in wrapper
return function(*args, download_config=download_config, **kwargs)
File "/home/yibop/miniconda3/envs/unlimiformer/lib/python3.10/site-packages/datasets/utils/file_utils.py", line 1222, in xopen
return open(main_hop, mode, *args, **kwargs)
NotADirectoryError: [Errno 20] Not a directory: '/home/yibop/.cache/huggingface/datasets/downloads/73eca96a974f65c46cdf67acc0d23b976b9c57ce310d35ad7cfda8b6dc67001d/gov_report/train.jsonl'
Traceback (most recent call last):
File "/home/yibop/yibop/unlimiformer/src/run.py", line 1180, in <module>
main()
File "/home/yibop/yibop/unlimiformer/src/run.py", line 437, in main
seq2seq_dataset = _get_dataset(data_args, model_args, training_args)
File "/home/yibop/yibop/unlimiformer/src/run.py", line 943, in _get_dataset
seq2seq_dataset = load_dataset(
File "/home/yibop/miniconda3/envs/unlimiformer/lib/python3.10/site-packages/datasets/load.py", line 2616, in load_dataset
builder_instance.download_and_prepare(
File "/home/yibop/miniconda3/envs/unlimiformer/lib/python3.10/site-packages/datasets/builder.py", line 1029, in download_and_prepare
self._download_and_prepare(
File "/home/yibop/miniconda3/envs/unlimiformer/lib/python3.10/site-packages/datasets/builder.py", line 1791, in _download_and_prepare
super()._download_and_prepare(
File "/home/yibop/miniconda3/envs/unlimiformer/lib/python3.10/site-packages/datasets/builder.py", line 1124, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "/home/yibop/miniconda3/envs/unlimiformer/lib/python3.10/site-packages/datasets/builder.py", line 1629, in _prepare_split
for job_id, done, content in self._prepare_split_single(
File "/home/yibop/miniconda3/envs/unlimiformer/lib/python3.10/site-packages/datasets/builder.py", line 1786, in _prepare_split_single
raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset
I tried the common solution found online: deleting the entire dataset cache directory
/home/yibop/.cache/huggingface/datasets
but it still doesn't work. Any ideas on what could be causing this issue?
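Rather than deleting the entire cache, a narrower thing to try is removing only the download blob named in the traceback (the hash entry under `downloads/`, plus its sidecar files) and then re-fetching. The helper below is a hypothetical sketch assuming the standard cache layout; `purge_cached_download` and its arguments are my own names, not part of the `datasets` API:

```python
import os
import shutil

def purge_cached_download(cache_root: str, blob_hash: str) -> list[str]:
    """Remove only the cached download entries whose names start with
    blob_hash (the blob itself plus .json/.lock sidecars), leaving the
    rest of the cache intact. Returns the names that were removed."""
    downloads = os.path.join(cache_root, "downloads")
    removed = []
    if not os.path.isdir(downloads):
        return removed
    for name in os.listdir(downloads):
        if name.startswith(blob_hash):
            path = os.path.join(downloads, name)
            if os.path.isdir(path):
                shutil.rmtree(path)
            else:
                os.remove(path)
            removed.append(name)
    return removed
```

After purging, re-running `load_dataset` with `download_mode="force_redownload"` should make `datasets` fetch and prepare the archive again instead of reusing the broken blob.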
Hi,

Thank you for this great effort. When attempting to reproduce the results, I ran the following command:

I encountered the error shown above.