huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
https://huggingface.co/docs/datasets
Apache License 2.0

Cannot import load_dataset on Colab #2695

Closed bayartsogt-ya closed 3 years ago

bayartsogt-ya commented 3 years ago

Describe the bug

Importing load_dataset from datasets fails with a ModuleNotFoundError for tqdm.contrib.concurrent.

Steps to reproduce the bug

Here is a Colab notebook to reproduce the error.

On colab:

!pip install datasets
from datasets import load_dataset

Expected results

Works without error

Actual results

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-2-8cc7de4c69eb> in <module>()
----> 1 from datasets import load_dataset, load_metric, Metric, MetricInfo, Features, Value
      2 from sklearn.metrics import mean_squared_error

/usr/local/lib/python3.7/dist-packages/datasets/__init__.py in <module>()
     31     )
     32 
---> 33 from .arrow_dataset import Dataset, concatenate_datasets
     34 from .arrow_reader import ArrowReader, ReadInstruction
     35 from .arrow_writer import ArrowWriter

/usr/local/lib/python3.7/dist-packages/datasets/arrow_dataset.py in <module>()
     40 from tqdm.auto import tqdm
     41 
---> 42 from datasets.tasks.text_classification import TextClassification
     43 
     44 from . import config, utils

/usr/local/lib/python3.7/dist-packages/datasets/tasks/__init__.py in <module>()
      1 from typing import Optional
      2 
----> 3 from ..utils.logging import get_logger
      4 from .automatic_speech_recognition import AutomaticSpeechRecognition
      5 from .base import TaskTemplate

/usr/local/lib/python3.7/dist-packages/datasets/utils/__init__.py in <module>()
     19 
     20 from . import logging
---> 21 from .download_manager import DownloadManager, GenerateMode
     22 from .file_utils import DownloadConfig, cached_path, hf_bucket_url, is_remote_url, temp_seed
     23 from .mock_download_manager import MockDownloadManager

/usr/local/lib/python3.7/dist-packages/datasets/utils/download_manager.py in <module>()
     24 
     25 from .. import config
---> 26 from .file_utils import (
     27     DownloadConfig,
     28     cached_path,

/usr/local/lib/python3.7/dist-packages/datasets/utils/file_utils.py in <module>()
     25 import posixpath
     26 import requests
---> 27 from tqdm.contrib.concurrent import thread_map
     28 
     29 from .. import __version__, config, utils

ModuleNotFoundError: No module named 'tqdm.contrib.concurrent'

Environment info

phosseini commented 3 years ago

I'm facing the same issue on Colab today too.

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-4-5833ac0f5437> in <module>()
      3 
      4 from ray import tune
----> 5 from datasets import DatasetDict, Dataset
      6 from datasets import load_dataset, load_metric
      7 from dataclasses import dataclass

7 frames
/usr/local/lib/python3.7/dist-packages/datasets/utils/file_utils.py in <module>()
     25 import posixpath
     26 import requests
---> 27 from tqdm.contrib.concurrent import thread_map
     28 
     29 from .. import __version__, config, utils

ModuleNotFoundError: No module named 'tqdm.contrib.concurrent'

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

bayartsogt-ya commented 3 years ago

@phosseini I think it is related to the 1.10.0 release published 3 hours ago. (cc: @lhoestq) For now I have downgraded to 1.9.0 and it is working fine.

phosseini commented 3 years ago

Same here, downgraded to 1.9.0 for now and works fine.
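
The downgrade described above can be sketched in Python for anyone scripting their notebook setup. This is a minimal sketch, not an official recipe: the helper names pip_pin_command and install_pinned are hypothetical, and the only facts taken from the thread are the package name and the 1.9.0 version. The block only prints the command; call install_pinned to actually run it.

```python
import subprocess
import sys


def pip_pin_command(package: str, version: str) -> list:
    """Build a `pip install package==version` command for the current interpreter.

    Hypothetical helper, not part of datasets or pip.
    """
    return [sys.executable, "-m", "pip", "install", f"{package}=={version}"]


def install_pinned(package: str, version: str) -> None:
    """Run the pinned install; raises CalledProcessError if pip fails."""
    subprocess.check_call(pip_pin_command(package, version))


# Version taken from the comments above. Nothing is installed here:
# the command is only printed so it can be inspected first.
print(" ".join(pip_pin_command("datasets", "1.9.0")))
```

In a Colab cell the equivalent one-liner is !pip install datasets==1.9.0.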

mariosasko commented 3 years ago

Hi,

updating tqdm to the newest version resolves the issue for me. You can do this as follows in Colab:

!pip install tqdm --upgrade

albertvillanova commented 3 years ago

Hi @bayartsogt-ya and @phosseini, thanks for reporting.

We are fixing this critical issue and making an urgent patch release of the datasets library today.

In the meantime, as pointed out by @mariosasko, you can circumvent this issue by updating the tqdm library:

!pip install -U tqdm
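
To confirm whether an environment is affected before or after upgrading, a small check like the one below may help. It is only a sketch: has_module is a hypothetical helper built on the standard importlib.util.find_spec, and the module name it probes is the one from the traceback in this issue.

```python
import importlib
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the module `name` can be located by the import system.

    Hypothetical helper; it does not execute the target module itself,
    although find_spec does import any parent packages.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example tqdm itself) is missing.
        return False


# Module name taken from the traceback in this issue.
target = "tqdm.contrib.concurrent"
if has_module(target):
    print(f"{target} is available; importing datasets should get past this error")
elif has_module("tqdm"):
    version = getattr(importlib.import_module("tqdm"), "__version__", "unknown")
    print(f"tqdm {version} is installed but lacks {target}; try upgrading tqdm")
else:
    print("tqdm is not installed")
```

If the second message is printed, the installed tqdm predates the module that datasets imports, which matches the failure reported here.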