fsspec / s3fs

S3 Filesystem
http://s3fs.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Errors when installing s3fs on Sagemaker Studio #868

Closed ruvilonix closed 7 months ago

ruvilonix commented 7 months ago

On a freshly created JupyterLab Space instance (image: SageMaker Distribution 1.6) on SageMaker Studio in AWS, I open the terminal and run pip install s3fs, which reports dependency conflicts during installation:

sagemaker-user@default:~$ pip install s3fs
Collecting s3fs
  Downloading s3fs-2024.3.1-py3-none-any.whl.metadata (1.6 kB)
Requirement already satisfied: aiobotocore<3.0.0,>=2.5.4 in /opt/conda/lib/python3.10/site-packages (from s3fs) (2.12.1)
Collecting fsspec==2024.3.1 (from s3fs)
  Downloading fsspec-2024.3.1-py3-none-any.whl.metadata (6.8 kB)
Requirement already satisfied: aiohttp!=4.0.0a0,!=4.0.0a1 in /opt/conda/lib/python3.10/site-packages (from s3fs) (3.9.3)
Requirement already satisfied: botocore<1.34.52,>=1.34.41 in /opt/conda/lib/python3.10/site-packages (from aiobotocore<3.0.0,>=2.5.4->s3fs) (1.34.51)
Requirement already satisfied: wrapt<2.0.0,>=1.10.10 in /opt/conda/lib/python3.10/site-packages (from aiobotocore<3.0.0,>=2.5.4->s3fs) (1.16.0)
Requirement already satisfied: aioitertools<1.0.0,>=0.5.1 in /opt/conda/lib/python3.10/site-packages (from aiobotocore<3.0.0,>=2.5.4->s3fs) (0.11.0)
Requirement already satisfied: aiosignal>=1.1.2 in /opt/conda/lib/python3.10/site-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->s3fs) (1.3.1)
Requirement already satisfied: attrs>=17.3.0 in /opt/conda/lib/python3.10/site-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->s3fs) (23.2.0)
Requirement already satisfied: frozenlist>=1.1.1 in /opt/conda/lib/python3.10/site-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->s3fs) (1.4.1)
Requirement already satisfied: multidict<7.0,>=4.5 in /opt/conda/lib/python3.10/site-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->s3fs) (6.0.5)
Requirement already satisfied: yarl<2.0,>=1.0 in /opt/conda/lib/python3.10/site-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->s3fs) (1.9.4)
Requirement already satisfied: async-timeout<5.0,>=4.0 in /opt/conda/lib/python3.10/site-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->s3fs) (4.0.3)
Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /opt/conda/lib/python3.10/site-packages (from botocore<1.34.52,>=1.34.41->aiobotocore<3.0.0,>=2.5.4->s3fs) (1.0.1)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /opt/conda/lib/python3.10/site-packages (from botocore<1.34.52,>=1.34.41->aiobotocore<3.0.0,>=2.5.4->s3fs) (2.9.0)
Requirement already satisfied: urllib3<2.1,>=1.25.4 in /opt/conda/lib/python3.10/site-packages (from botocore<1.34.52,>=1.34.41->aiobotocore<3.0.0,>=2.5.4->s3fs) (1.26.18)
Requirement already satisfied: idna>=2.0 in /opt/conda/lib/python3.10/site-packages (from yarl<2.0,>=1.0->aiohttp!=4.0.0a0,!=4.0.0a1->s3fs) (3.6)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.10/site-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.34.52,>=1.34.41->aiobotocore<3.0.0,>=2.5.4->s3fs) (1.16.0)
Downloading s3fs-2024.3.1-py3-none-any.whl (29 kB)
Downloading fsspec-2024.3.1-py3-none-any.whl (171 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 172.0/172.0 kB 18.4 MB/s eta 0:00:00
Installing collected packages: fsspec, s3fs
  Attempting uninstall: fsspec
    Found existing installation: fsspec 2023.6.0
    Uninstalling fsspec-2023.6.0:
      Successfully uninstalled fsspec-2023.6.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jupyter-ai 2.11.0 requires faiss-cpu, which is not installed.
datasets 2.18.0 requires fsspec[http]<=2024.2.0,>=2023.1.0, but you have fsspec 2024.3.1 which is incompatible.
jupyter-scheduler 2.5.1 requires fsspec==2023.6.0, but you have fsspec 2024.3.1 which is incompatible.
Successfully installed fsspec-2023.6.0 s3fs-2024.3.1

After installation, it is no longer possible to import transformers.Trainer:

TypeError                                 Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/transformers/utils/import_utils.py:1099, in _LazyModule._get_module(self, module_name)
   1098 try:
-> 1099     return importlib.import_module("." + module_name, self.__name__)
   1100 except Exception as e:

File /opt/conda/lib/python3.10/importlib/__init__.py:126, in import_module(name, package)
    125         level += 1
--> 126 return _bootstrap._gcd_import(name[level:], package, level)

File <frozen importlib._bootstrap>:1050, in _gcd_import(name, package, level)

File <frozen importlib._bootstrap>:1027, in _find_and_load(name, import_)

File <frozen importlib._bootstrap>:1006, in _find_and_load_unlocked(name, import_)

File <frozen importlib._bootstrap>:688, in _load_unlocked(spec)

File <frozen importlib._bootstrap_external>:883, in exec_module(self, module)

File <frozen importlib._bootstrap>:241, in _call_with_frames_removed(f, *args, **kwds)

File /opt/conda/lib/python3.10/site-packages/transformers/trainer.py:162
    161 if is_datasets_available():
--> 162     import datasets
    164 if is_torch_tpu_available(check_device=False):

File /opt/conda/lib/python3.10/site-packages/datasets/__init__.py:18
     16 __version__ = "2.18.0"
---> 18 from .arrow_dataset import Dataset
     19 from .arrow_reader import ReadInstruction

File /opt/conda/lib/python3.10/site-packages/datasets/arrow_dataset.py:66
     64 from tqdm.contrib.concurrent import thread_map
---> 66 from . import config
     67 from .arrow_reader import ArrowReader

File /opt/conda/lib/python3.10/site-packages/datasets/config.py:41
     40 DILL_VERSION = version.parse(importlib.metadata.version("dill"))
---> 41 FSSPEC_VERSION = version.parse(importlib.metadata.version("fsspec"))
     42 PANDAS_VERSION = version.parse(importlib.metadata.version("pandas"))

File /opt/conda/lib/python3.10/site-packages/packaging/version.py:54, in parse(version)
     46 """Parse the given version string.
     47 
     48 >>> parse('1.0.dev1')
   (...)
     52 :raises InvalidVersion: When the version string is not a valid version.
     53 """
---> 54 return Version(version)

File /opt/conda/lib/python3.10/site-packages/packaging/version.py:198, in Version.__init__(self, version)
    197 # Validate the version and parse it into pieces
--> 198 match = self._regex.search(version)
    199 if not match:

TypeError: expected string or bytes-like object

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
Cell In[1], line 1
----> 1 from transformers import Trainer

File <frozen importlib._bootstrap>:1075, in _handle_fromlist(module, fromlist, import_, recursive)

File /opt/conda/lib/python3.10/site-packages/transformers/utils/import_utils.py:1089, in _LazyModule.__getattr__(self, name)
   1087     value = self._get_module(name)
   1088 elif name in self._class_to_module.keys():
-> 1089     module = self._get_module(self._class_to_module[name])
   1090     value = getattr(module, name)
   1091 else:

File /opt/conda/lib/python3.10/site-packages/transformers/utils/import_utils.py:1101, in _LazyModule._get_module(self, module_name)
   1099     return importlib.import_module("." + module_name, self.__name__)
   1100 except Exception as e:
-> 1101     raise RuntimeError(
   1102         f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
   1103         f" traceback):\n{e}"
   1104     ) from e

RuntimeError: Failed to import transformers.trainer because of the following error (look up to see its traceback):
expected string or bytes-like object

martindurant commented 7 months ago

I don't know what transformers.Trainer is, but apparently it depends on a more specific version of s3fs or one of its dependencies. Unfortunately, pip does not guarantee a consistent environment after installation, and indeed it gives you a warning to this effect. You have these choices:

I will close this now, as I don't believe there's anything we can do about it.
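
For readers hitting the same conflict, the pip output above can be checked mechanically. The following is a minimal sketch, not part of s3fs or fsspec: it encodes the three fsspec constraints quoted in the log (s3fs 2024.3.1 pulls in fsspec==2024.3.1, datasets 2.18.0 wants >=2023.1.0,<=2024.2.0, jupyter-scheduler 2.5.1 pins ==2023.6.0) and shows that no single fsspec version satisfies all of them. The helper names (`as_tuple`, `unsatisfied`, `installed_version`) are hypothetical, and the tuple comparison assumes plain dotted numeric versions, which is a simplification of real version parsing. The `TypeError` in the traceback suggests the metadata lookup for fsspec returned a non-string value, so `installed_version` reports that case as `None`.

```python
from importlib import metadata
from typing import Callable, Dict, List, Optional, Tuple

VersionTuple = Tuple[int, ...]


def as_tuple(version: str) -> VersionTuple:
    """Turn a dotted numeric version such as '2024.3.1' into (2024, 3, 1).

    Assumption: no pre-release or local suffixes, which holds for the
    fsspec versions that appear in the pip log.
    """
    return tuple(int(part) for part in version.split("."))


# fsspec constraints copied from the pip output in this issue.
CONSTRAINTS: Dict[str, Callable[[VersionTuple], bool]] = {
    "s3fs 2024.3.1": lambda v: v == as_tuple("2024.3.1"),
    "datasets 2.18.0": lambda v: as_tuple("2023.1.0") <= v <= as_tuple("2024.2.0"),
    "jupyter-scheduler 2.5.1": lambda v: v == as_tuple("2023.6.0"),
}


def unsatisfied(fsspec_version: str) -> List[str]:
    """Return the packages whose fsspec constraint this version violates."""
    v = as_tuple(fsspec_version)
    return [name for name, accepts in CONSTRAINTS.items() if not accepts(v)]


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string, or None if it cannot be read."""
    try:
        found = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # Guard against metadata that exists but carries no usable version.
    return found if isinstance(found, str) else None


if __name__ == "__main__":
    for candidate in ("2023.6.0", "2024.2.0", "2024.3.1"):
        print(candidate, "violates:", unsatisfied(candidate) or "nothing")
    print("installed fsspec:", installed_version("fsspec"))
```

Every candidate violates at least one constraint, because the two exact pins (2024.3.1 and 2023.6.0) cannot both hold. That is consistent with the point above that this cannot be fixed inside s3fs; it would have to be resolved in the environment, for example by keeping s3fs in a separate environment from the packages that pin an older fsspec.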