conda-forge / filesystem-spec-feedstock

A conda-smithy repository for filesystem-spec.
BSD 3-Clause "New" or "Revised" License

Ancient s3fs package being provided by solver #79

Open maresb opened 1 year ago

maresb commented 1 year ago

Solution to issue cannot be found in the documentation.

Issue

I was having some really bizarre errors with fsspec + s3fs. After a very long investigation I tracked it down to the version (0.4.2) of s3fs which was being installed.

The reason for the ancient version of s3fs seems to be as follows:

For example,

micromamba create -n asdf -c conda-forge "boto3>1.26.76,<=1.26.132" s3fs

results in s3fs v0.4.2. (Note: 1.26.132 is simply the current version of boto3, so the upper bound is only included for reproducibility.)
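
A startup check can catch this kind of silent downgrade before it turns into confusing runtime errors. The following is a minimal sketch and not part of fsspec or s3fs: `MIN_S3FS`, `parse_version` and `check_s3fs` are hypothetical names, the cutoff simply mirrors the 0.4.2 version seen here, and the parser assumes purely numeric dotted versions such as 0.4.2 or 2023.5.0.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: anything at or below 0.4.2 counts as "ancient" for this check.
MIN_S3FS = (0, 4, 2)


def parse_version(text):
    """Turn '0.4.2' into (0, 4, 2); assumes numeric dotted components only."""
    return tuple(int(part) for part in text.split("."))


def check_s3fs(installed=None):
    """Return the s3fs version string, raising if it is missing or too old.

    If `installed` is None, the version is looked up from package metadata.
    """
    if installed is None:
        try:
            installed = version("s3fs")
        except PackageNotFoundError:
            raise RuntimeError("s3fs is not installed") from None
    if parse_version(installed) <= MIN_S3FS:
        raise RuntimeError(f"s3fs {installed} is too old for modern fsspec")
    return installed
```

Calling `check_s3fs()` with no argument at the top of a script would fail fast in the environment created above, instead of surfacing later as an unrelated-looking error.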

The reason I'm opening this issue in fsspec is that modern fsspec seems to play poorly with ancient s3fs. As a concrete example, consider

Dockerfile:

FROM mambaorg/micromamba

RUN micromamba install -y -n base -c conda-forge fsspec "boto3>1.26.76,<=1.26.132" s3fs 

COPY test.py /

CMD ["python", "/test.py"]

test.py:

import fsspec

with fsspec.open(f"simplecache::s3://some-writeable-bucket/hello.txt", mode="wb") as f:
    print(f.write(b"Hello, world!"))

in Bash:

docker build -t fsspec-debug .
docker run --rm -it -v "$HOME/.aws:/home/mambauser/.aws:ro" -e AWS_DEFAULT_PROFILE=some-profile fsspec-debug

Output:

13                                                                                   
Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/s3fs/core.py", line 394, in _lsdir
    for i in it:
  File "/opt/conda/lib/python3.11/site-packages/botocore/paginate.py", line 269, in __iter__
    response = self._make_request(current_kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/botocore/paginate.py", line 357, in _make_request
    return self._method(**current_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/botocore/client.py", line 530, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/botocore/client.py", line 960, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/test.py", line 3, in <module>
    with fsspec.open(f"simplecache::s3://some-writeable-bucket/hello.txt", mode="wb") as f:
  File "/opt/conda/lib/python3.11/site-packages/fsspec/core.py", line 121, in __exit__
    self.close()
  File "/opt/conda/lib/python3.11/site-packages/fsspec/core.py", line 141, in close
    f.close()
  File "/opt/conda/lib/python3.11/site-packages/fsspec/implementations/cached.py", line 816, in close
    self.commit()
  File "/opt/conda/lib/python3.11/site-packages/fsspec/implementations/cached.py", line 823, in commit
    self.fs.put(self.fn, self.path)
  File "/opt/conda/lib/python3.11/site-packages/fsspec/spec.py", line 958, in put
    lpaths = [p for p in lpaths if not (trailing_sep(p) or self.isdir(p))]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/fsspec/spec.py", line 958, in <listcomp>
    lpaths = [p for p in lpaths if not (trailing_sep(p) or self.isdir(p))]
                                                           ^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/s3fs/core.py", line 601, in isdir
    return bool(self._lsdir(path))
                ^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/s3fs/core.py", line 409, in _lsdir
    raise translate_boto_error(e)
PermissionError: Access Denied

Curiously, this seems to have surfaced only since the very recent https://github.com/fsspec/filesystem_spec/pull/1254, with the introduction of line 960 in spec.py, which calls self.isdir(lpaths[0]). It seems that self is the S3 filesystem while lpaths[0] is in the local tmp/ directory, so it's not surprising that S3 would deny access to a bucket named tmp. (I'm not sure why newer versions of s3fs don't lead to this error, and I don't have time to investigate further.)
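
To illustrate the suspicion above: S3-style paths are conventionally split into a bucket (the first component) and a key (the rest). The sketch below uses a hypothetical `split_bucket_key` helper, which is a simplification and not s3fs's actual implementation, together with a made-up temporary file name, to show how a local cache path would be read as a bucket named tmp.

```python
def split_bucket_key(path):
    """Split an S3-style path into (bucket, key).

    Simplified illustration: strips an optional "s3://" prefix and any
    leading slashes, then treats the first component as the bucket.
    """
    if path.startswith("s3://"):
        path = path[len("s3://"):]
    path = path.lstrip("/")
    bucket, _, key = path.partition("/")
    return bucket, key


# The intended remote target resolves to the expected bucket.
print(split_bucket_key("s3://some-writeable-bucket/hello.txt"))

# A local temporary path (hypothetical file name) is read as bucket "tmp".
print(split_bucket_key("/tmp/tmpabc123/cachefile"))
```

Under this reading, a directory check on the local path becomes a listing request against a bucket the user has no rights to, which would be consistent with the AccessDenied on ListObjectsV2 in the traceback.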

So finally, the reason I'm raising this issue here is that I suspect the proper solution would be to do a repodata patch and add a run_constrained of s3fs >0.4.2 to modern fsspec packages. This would nudge the solver toward an s3fs version built on aiobotocore and would probably lead to more sensible solutions.
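
As a rough sketch of what such a patch could look like: conda repodata records store run_constrained entries under the `constrains` key. The function below is illustrative only. `add_s3fs_constraint`, the `min_fsspec` cutoff for "modern" fsspec, and the sample record are all assumptions, not the actual conda-forge patch code, and the version comparison assumes numeric dotted versions.

```python
def add_s3fs_constraint(record, min_fsspec=(2023, 1, 0)):
    """Add 's3fs >0.4.2' to the constrains of a sufficiently new fsspec record.

    `record` is a dict shaped like a repodata.json package entry.
    `min_fsspec` is a hypothetical cutoff for what counts as "modern" fsspec.
    Returns True if the record was changed.
    """
    if record.get("name") != "fsspec":
        return False
    fsspec_version = tuple(int(p) for p in record["version"].split("."))
    if fsspec_version < min_fsspec:
        return False
    constrains = record.setdefault("constrains", [])
    if any(c.split()[0] == "s3fs" for c in constrains):
        return False  # leave any existing s3fs constraint untouched
    constrains.append("s3fs >0.4.2")
    return True


# Made-up sample record for demonstration.
sample = {"name": "fsspec", "version": "2023.5.0", "depends": ["python >=3.8"]}
add_s3fs_constraint(sample)
print(sample["constrains"])
```

Because a constraint only applies when the constrained package is actually installed, this would not force s3fs on anyone who installs fsspec alone.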

Installed packages

  _libgcc_mutex     0.1       conda_forge         conda-forge
  _openmp_mutex     4.5       2_gnu               conda-forge
  boto3             1.26.132  pyhd8ed1ab_0        conda-forge
  botocore          1.29.132  pyhd8ed1ab_0        conda-forge
  brotlipy          0.7.0     py311hd4cff14_1005  conda-forge
  bzip2             1.0.8     h7f98852_4          conda-forge
  ca-certificates   2023.5.7  hbcca054_0          conda-forge
  certifi           2023.5.7  pyhd8ed1ab_0        conda-forge
  cffi              1.15.1    py311h409f033_3     conda-forge
  cryptography      40.0.2    py311h9b4c7bb_0     conda-forge
  fsspec            2023.5.0  pyh1a96a4e_0        conda-forge
  idna              3.4       pyhd8ed1ab_0        conda-forge
  jmespath          1.0.1     pyhd8ed1ab_0        conda-forge
  ld_impl_linux-64  2.40      h41732ed_0          conda-forge
  libexpat          2.5.0     hcb278e6_1          conda-forge
  libffi            3.4.2     h7f98852_5          conda-forge
  libgcc-ng         12.2.0    h65d4601_19         conda-forge
  libgomp           12.2.0    h65d4601_19         conda-forge
  libnsl            2.0.0     h7f98852_0          conda-forge
  libsqlite         3.41.2    h2797004_1          conda-forge
  libuuid           2.38.1    h0b41bf4_0          conda-forge
  libzlib           1.2.13    h166bdaf_4          conda-forge
  ncurses           6.3       h27087fc_1          conda-forge
  openssl           3.1.0     hd590300_3          conda-forge
  pip               23.1.2    pyhd8ed1ab_0        conda-forge
  pycparser         2.21      pyhd8ed1ab_0        conda-forge
  pyopenssl         23.1.1    pyhd8ed1ab_0        conda-forge
  pysocks           1.7.1     pyha2e5f31_6        conda-forge
  python            3.11.3    h2755cc3_0_cpython  conda-forge
  python-dateutil   2.8.2     pyhd8ed1ab_0        conda-forge
  python_abi        3.11      3_cp311             conda-forge
  readline          8.2       h8228510_1          conda-forge
  s3fs              0.4.2     py_0                conda-forge
  s3transfer        0.6.1     pyhd8ed1ab_0        conda-forge
  setuptools        67.7.2    pyhd8ed1ab_0        conda-forge
  six               1.16.0    pyh6c4a22f_0        conda-forge
  tk                8.6.12    h27826a3_0          conda-forge
  tzdata            2023c     h71feb2d_0          conda-forge
  urllib3           1.26.15   pyhd8ed1ab_0        conda-forge
  wheel             0.40.0    pyhd8ed1ab_0        conda-forge
  xz                5.2.6     h166bdaf_0          conda-forge

### Environment info

```shell
environment : base (active)
           env location : /opt/conda
      user config files : /home/mambauser/.mambarc
 populated config files : 
       libmamba version : 1.4.2
     micromamba version : 1.4.2
           curl version : libcurl/7.88.1 OpenSSL/3.1.0 zlib/1.2.13 zstd/1.5.2 libssh2/1.10.0 nghttp2/1.52.0
     libarchive version : libarchive 3.6.2 zlib/1.2.13 bz2lib/1.0.8 libzstd/1.5.2
       virtual packages : __unix=0=0
                          __linux=5.15.0=0
                          __glibc=2.31=0
                          __archspec=1=x86_64
               channels : 
       base environment : /opt/conda
               platform : linux-64
```

djhoese commented 6 months ago

This looks like it is happening again. Here's a failing CI run for my package (satpy): https://github.com/pytroll/satpy/actions/runs/8338561408/job/22819069867?pr=2762

s3fs 0.6.0 is being installed, along with an old, incompatible botocore that vendors an old version of requests.

djhoese commented 6 months ago

And this happened in the past too: https://github.com/conda-forge/filesystem-spec-feedstock/pull/86#issuecomment-1841505464

martindurant commented 6 months ago

https://github.com/conda-forge/s3fs-feedstock/blob/main/recipe/meta.yaml#L22 says that we now have very loosely constrained aiobotocore requirements, and these days aiobotocore and botocore should be released together.

Is it possible this is only intermittent while the packages are being built by conda-forge?

Given that 0.6.0 is now almost 3 years old, I wonder if we should simply remove (yank) all old versions from pypi.

djhoese commented 6 months ago

> from pypi.

From conda-forge?

martindurant commented 6 months ago

No, I had meant pypi, because this has happened with pip too (sorry). conda ought to be much more thorough - I don't know how to get information out of the solver for what it's doing. If I try to install s3fs, I get the current version. Can you point to your environment file, please?

djhoese commented 6 months ago

https://github.com/pytroll/satpy/blob/main/continuous_integration/environment.yaml

martindurant commented 6 months ago

When I tried to build that environment just now, the solver wanted to pull

  - conda-forge/noarch::s3fs==2024.3.1=pyhd8ed1ab_0

as it should. Can you try again? Maybe it was indeed because conda-forge builds were in flight or their index hadn't updated.