huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
https://huggingface.co/docs/datasets
Apache License 2.0

AttributeError: 'CommunityDatasetModuleFactoryWithoutScript' object has no attribute 'path' #3331

Closed: luozhouyang closed this issue 2 years ago

luozhouyang commented 2 years ago

Describe the bug

I manually added a new question answering dataset to huggingface datasets. Here is the link: luozhouyang/question-answering-datasets

But when I load the dataset, an error is raised:

AttributeError: 'CommunityDatasetModuleFactoryWithoutScript' object has no attribute 'path'

Steps to reproduce the bug

from datasets import load_dataset

dataset = load_dataset("luozhouyang/question-answering-datasets", data_files=["dureader_robust.train.json"])

Expected results

The dataset loads successfully without any error.

Actual results

Traceback (most recent call last):
  File "/mnt/home/zhouyang.lzy/github/naivenlp/naivenlp/tests/question_answering_tests/dataset_test.py", line 89, in test_load_dataset_with_hf
    data_files=["dureader_robust.train.json"],
  File "/mnt/home/zhouyang.lzy/.conda/envs/naivenlp/lib/python3.6/site-packages/datasets/load.py", line 1616, in load_dataset
    **config_kwargs,
  File "/mnt/home/zhouyang.lzy/.conda/envs/naivenlp/lib/python3.6/site-packages/datasets/load.py", line 1443, in load_dataset_builder
    path, revision=revision, download_config=download_config, download_mode=download_mode, data_files=data_files
  File "/mnt/home/zhouyang.lzy/.conda/envs/naivenlp/lib/python3.6/site-packages/datasets/load.py", line 1157, in dataset_module_factory
    raise e1 from None
  File "/mnt/home/zhouyang.lzy/.conda/envs/naivenlp/lib/python3.6/site-packages/datasets/load.py", line 1144, in dataset_module_factory
    download_mode=download_mode,
  File "/mnt/home/zhouyang.lzy/.conda/envs/naivenlp/lib/python3.6/site-packages/datasets/load.py", line 798, in get_module
    raise FileNotFoundError(f"No data files or dataset script found in {self.path}")
AttributeError: 'CommunityDatasetModuleFactoryWithoutScript' object has no attribute 'path'
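
The last two frames of the traceback suggest that the AttributeError is a secondary failure: line 798 tries to raise a FileNotFoundError, but building its message reads self.path, which does not exist on the object, so the AttributeError is raised instead and hides the intended "No data files or dataset script found" message. Below is a minimal sketch of that masking effect. The class FactoryWithoutPath and the helper error_type_raised are hypothetical stand-ins for illustration only, not the actual code in datasets/load.py.

```python
class FactoryWithoutPath:
    """Hypothetical stand-in for the factory; NOT the real datasets class."""

    def __init__(self, name):
        # Only `name` is stored; no `path` attribute is ever set.
        self.name = name

    def get_module(self):
        # The f-string is evaluated before the exception object is created.
        # Reading self.path fails first, so FileNotFoundError is never raised.
        raise FileNotFoundError(
            f"No data files or dataset script found in {self.path}"
        )


def error_type_raised(factory):
    """Return the name of the exception type that get_module() raises."""
    try:
        factory.get_module()
    except Exception as exc:
        return type(exc).__name__
    return None


# The intended FileNotFoundError is masked by the AttributeError.
print(error_type_raised(FactoryWithoutPath("luozhouyang/question-answering-datasets")))
```

If this reading is right, the underlying condition is that no data files or dataset script were found for the repository, and the AttributeError only obscures that message.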

Environment info

mariosasko commented 2 years ago

Hi,

the fix was merged and will be available in the next release of datasets. In the meantime, you can use it by installing datasets directly from master as follows:

pip install git+https://github.com/huggingface/datasets.git
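
After installing from master, it can help to confirm which build of datasets the interpreter actually picks up before retrying load_dataset. The following is a small sketch using the standard library; the helper name installed_version is an assumption made for this example, and importlib.metadata requires Python 3.8 or newer (the environment in the traceback above uses Python 3.6, where the importlib_metadata backport would be needed instead).

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version reported for the datasets package in this environment,
# or None if the package is not installed here.
print(installed_version("datasets"))
```

If the printed version is still the one that produced the error, the install from master most likely went into a different environment than the one running the test.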