huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
https://huggingface.co/docs/datasets
Apache License 2.0
19.28k stars, 2.7k forks

ModuleNotFoundError: No module named 'datasets.tasks' #7248

Open shoowadoo opened 4 weeks ago

shoowadoo commented 4 weeks ago

Describe the bug


ModuleNotFoundError                       Traceback (most recent call last)
in <cell line: 1>()
----> 1 dataset = load_dataset('knowledgator/events_classification_biotech')

11 frames
/usr/local/lib/python3.10/dist-packages/datasets/load.py in load_dataset(path, name, data_dir, data_files, split, cache_dir, features, download_config, download_mode, verification_mode, keep_in_memory, save_infos, revision, token, streaming, num_proc, storage_options, trust_remote_code, **config_kwargs)
   2130
   2131     # Create a dataset builder
-> 2132     builder_instance = load_dataset_builder(
   2133         path=path,
   2134         name=name,

/usr/local/lib/python3.10/dist-packages/datasets/load.py in load_dataset_builder(path, name, data_dir, data_files, cache_dir, features, download_config, download_mode, revision, token, storage_options, trust_remote_code, _require_default_config_name, **config_kwargs)
   1886         raise ValueError(error_msg)
   1887
-> 1888     builder_cls = get_dataset_builder_class(dataset_module, dataset_name=dataset_name)
   1889     # Instantiate the dataset builder
   1890     builder_instance: DatasetBuilder = builder_cls(

/usr/local/lib/python3.10/dist-packages/datasets/load.py in get_dataset_builder_class(dataset_module, dataset_name)
    246             dataset_module.importable_file_path
    247         ) if dataset_module.importable_file_path else nullcontext():
--> 248         builder_cls = import_main_class(dataset_module.module_path)
    249         if dataset_module.builder_configs_parameters.builder_configs:
    250             dataset_name = dataset_name or dataset_module.builder_kwargs.get("dataset_name")

/usr/local/lib/python3.10/dist-packages/datasets/load.py in import_main_class(module_path)
    167 def import_main_class(module_path) -> Optional[Type[DatasetBuilder]]:
    168     """Import a module at module_path and return its main class: a DatasetBuilder"""
--> 169     module = importlib.import_module(module_path)
    170     # Find the main class in our imported module
    171     module_main_cls = None

/usr/lib/python3.10/importlib/__init__.py in import_module(name, package)
    124                 break
    125             level += 1
--> 126     return _bootstrap._gcd_import(name[level:], package, level)
    127
    128

/usr/lib/python3.10/importlib/_bootstrap.py in _gcd_import(name, package, level)

/usr/lib/python3.10/importlib/_bootstrap.py in _find_and_load(name, import_)

/usr/lib/python3.10/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)

/usr/lib/python3.10/importlib/_bootstrap.py in _load_unlocked(spec)

/usr/lib/python3.10/importlib/_bootstrap_external.py in exec_module(self, module)

/usr/lib/python3.10/importlib/_bootstrap.py in _call_with_frames_removed(f, *args, **kwds)

~/.cache/huggingface/modules/datasets_modules/datasets/knowledgator--events_classification_biotech/9c8086d498c3104de3a3c5b6640837e18ccd829dcaca49f1cdffe3eb5c4a6361/events_classification_biotech.py in <module>
      1 import datasets
      2 from datasets import load_dataset
----> 3 from datasets.tasks import TextClassification
      4
      5 DESCRIPTION = """

ModuleNotFoundError: No module named 'datasets.tasks'



Steps to reproduce the bug

!pip install datasets

from datasets import load_dataset

dataset = load_dataset('knowledgator/events_classification_biotech')

Expected behavior

The dataset loads without raising ModuleNotFoundError.

Environment info

Google Colab

tibor-reiss commented 4 weeks ago

The datasets.tasks module was removed in datasets v3: #6999
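
For anyone who wants to confirm this locally, here is a minimal sketch that checks whether an installed package still ships a given submodule. The helper name has_submodule is hypothetical; it relies only on importlib.util.find_spec from the standard library and treats a missing parent package as "not available".

```python
import importlib.util


def has_submodule(package: str, submodule: str) -> bool:
    """Return True if `package.submodule` can be located, False otherwise.

    Hypothetical helper for illustration; not part of the datasets library.
    """
    try:
        # find_spec imports the parent package, then looks up the submodule.
        spec = importlib.util.find_spec(f"{package}.{submodule}")
    except ModuleNotFoundError:
        # The parent package itself is not installed.
        return False
    return spec is not None


if __name__ == "__main__":
    # Expected to be False on datasets v3 and later, per the comment above.
    print("datasets.tasks available:", has_submodule("datasets", "tasks"))
```

If this prints False, any dataset loading script that still does "from datasets.tasks import ..." will fail with the error shown in the report.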

I also don't see why TextClassification is imported, since it is not used anywhere else in the loading script. So the fix is simple: delete that import line.
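
One way to double-check that an import really is unused before deleting it is a small static scan. The following is a rough sketch under stated assumptions: the function name unused_imports and the sample script body are made up for illustration (only the three import lines come from the traceback), and the check is deliberately simple, so it ignores things like __all__ or names used only inside strings.

```python
import ast


def unused_imports(source: str) -> list[str]:
    """List imported names that are never referenced in `source`.

    Illustrative helper only; the source is parsed, never executed.
    """
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    # Every bare identifier that appears anywhere in the script.
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted(imported - used)


# Hypothetical stand-in for a loading script; only the imports mirror the report.
example_script = '''
import datasets
from datasets import load_dataset
from datasets.tasks import TextClassification

features = datasets.Features({"text": datasets.Value("string")})
raw = load_dataset("csv", data_files="train.csv")
'''

if __name__ == "__main__":
    print(unused_imports(example_script))  # ['TextClassification']
```

Because the script is only parsed with ast, this works even when datasets is not installed or the import would fail at runtime.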

lhoestq commented 4 weeks ago

I opened https://huggingface.co/datasets/knowledgator/events_classification_biotech/discussions/7 to remove the line. Hopefully the dataset owner will merge it soon.