pypa / pip

The Python package installer
https://pip.pypa.io/
MIT License

`kedro[test]==0.18.13` fails even though requiring its dependencies directly does not #12768

Open notatallshaw opened 2 weeks ago

notatallshaw commented 2 weeks ago

Description

On Linux with Python 3.9, installing kedro[test]==0.18.13 fails with ResolutionImpossible.

However, if you directly specify its requirements as extracted from the metadata (https://files.pythonhosted.org/packages/50/01/76d44fd50471cd1bd9899e161c07444026c954c01a9012b2c3a8f8a9e1c5/kedro-0.18.13-py3-none-any.whl.metadata), e.g.:

requirements.txt
```
anyconfig <0.14,>=0.10
attrs >=21.3
build
cachetools ~=5.3
click <9.0
cookiecutter <3.0,>=2.1.1
dynaconf <4.0,>=3.1.2
fsspec <2024.1,>=2021.4
gitpython ~=3.0
importlib-resources >=1.3
jmespath <2.0,>=0.9.5
more-itertools <11,>=9
omegaconf ~=2.3
parse ~=1.19.0
pip-tools <8,>=6.5
pluggy <1.3,>=1.0
PyYAML <7.0,>=4.2
rich <14.0,>=12.0
rope <2.0,>=0.21
setuptools >=65.5.1
toml ~=0.10
toposort ~=1.5
importlib-metadata <5.0,>=3.6 ; python_version < "3.8"
importlib-metadata >=3.6 ; python_version >= "3.8"
bandit <2.0,>=1.6.2
behave ==1.2.6
biopython ~=1.73
blacken-docs ==1.9.2
black ~=22.0
compress-pickle[lz4] ~=2.1.0
coverage[toml]
dask[complete] ~=2021.10
dill ~=0.3.1
filelock <4.0,>=3.4.0
geopandas <1.0,>=0.6.0
hdfs <3.0,>=2.5.8
holoviews >=1.13.0
import-linter[toml] ==1.8.0
isort ~=5.0
Jinja2 <3.1.0
joblib >=0.14
jupyterlab-server <2.16.0,>=2.11.1
jupyterlab <3.6.0,~=3.0
jupyter ~=1.0
lxml ~=4.6
memory-profiler <1.0,>=0.50.0
networkx ~=2.4
opencv-python ~=4.5.5.64
openpyxl <4.0,>=3.0.3
pandas ~=1.3
Pillow ~=9.0
plotly <6.0,>=4.8.0
pre-commit <3.0,>=2.9.2
pylint <3.0,>=2.17.0
pyproj ~=3.0
pytest-cov ~=3.0
pytest-mock <2.0,>=1.7.1
pytest-xdist[psutil] ~=2.2.1
pytest ~=7.2
redis ~=4.1
requests-mock ~=1.6
requests ~=2.20
s3fs <0.5,>=0.3.0
scikit-learn <2,>=1.0.2
scipy >=1.7.3
semver
SQLAlchemy ~=1.2
triad <1.0,>=0.6.7
trufflehog ~=2.1
xlsxwriter ~=1.0
tensorflow ~=2.0 ; (platform_system != "Darwin" or platform_machine != "arm64")
tables ~=3.6 ; (platform_system != "Windows")
tensorflow-macos ~=2.0 ; (platform_system == "Darwin" and platform_machine == "arm64")
tables ~=3.6.0 ; (platform_system == "Windows" and python_version < "3.8")
tables ~=3.8.0 ; (platform_system == "Windows" and python_version >= "3.8")
matplotlib <3.4,>=3.0.3 ; (python_version < "3.10")
moto ==1.3.7 ; (python_version < "3.10")
delta-spark ~=1.2.1 ; (python_version < "3.11")
pandas-gbq <0.18.0,>=0.12.0 ; (python_version < "3.11")
pyarrow >=1.0 ; (python_version < "3.11")
pyspark <3.4,>=2.2 ; (python_version < "3.11")
ipython <8.0,>=7.31.1 ; (python_version < "3.8")
adlfs <=2022.2,>=2021.7.1 ; (python_version == "3.7")
gcsfs <=2023.1,>=2021.4 ; (python_version == "3.7")
matplotlib <3.6,>=3.5 ; (python_version >= "3.10")
moto ==4.1.12 ; (python_version >= "3.10")
delta-spark >=1.2.1 ; (python_version >= "3.11")
pandas-gbq >=0.18.0 ; (python_version >= "3.11")
pyarrow >=7.0 ; (python_version >= "3.11")
pyspark >=3.4 ; (python_version >= "3.11")
adlfs ~=2023.1 ; (python_version >= "3.8")
gcsfs <2023.3,>=2023.1 ; (python_version >= "3.8")
ipython ~=8.10 ; (python_version >= "3.8")
```

Then it installs fine. uv and older versions of pip can also install kedro[test]==0.18.13.
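For anyone wanting to repeat the extraction step, a wheel's `.metadata` file uses email-style headers, so the standard library can read it. The following is a minimal sketch, not how the list above was actually produced: the function name `requirements_for_extra` and the sample metadata are hypothetical, and it assumes the `extra == "..."` clause comes last in each marker.

```python
import re
from email.parser import Parser

# Matches a trailing `extra == "name"` clause, including a leading "and".
# Assumption: the extra clause is the last part of the marker.
EXTRA_CLAUSE = re.compile(r"""(?:\s+and\s+)?extra\s*==\s*["']([^"']+)["']""")

# Hypothetical sample, NOT the real kedro metadata.
SAMPLE_METADATA = """\
Metadata-Version: 2.1
Name: demo
Version: 0.0.1
Requires-Dist: click <9.0
Requires-Dist: dask[complete] ~=2021.10 ; extra == "test"
Requires-Dist: adlfs ~=2023.1 ; (python_version >= "3.8") and extra == "test"
Requires-Dist: sphinx ; extra == "docs"
"""


def requirements_for_extra(metadata_text: str, extra: str) -> list[str]:
    """Return Requires-Dist entries that are unconditional or gated on `extra`."""
    headers = Parser().parsestr(metadata_text, headersonly=True)
    selected = []
    for entry in headers.get_all("Requires-Dist") or []:
        requirement, _, marker = entry.partition(";")
        requirement, marker = requirement.strip(), marker.strip()
        match = EXTRA_CLAUSE.search(marker)
        if match and match.group(1) != extra:
            continue  # belongs to a different extra
        if match:
            # Drop the extra clause, keep any remaining environment marker.
            marker = EXTRA_CLAUSE.sub("", marker).strip()
        selected.append(f"{requirement} ; {marker}" if marker else requirement)
    return selected


for line in requirements_for_extra(SAMPLE_METADATA, "test"):
    print(line)
```

Writing the printed lines to a requirements.txt gives a flat list like the one above, which can then be passed to pip with `-r`.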

Expected behavior

kedro[test]==0.18.13 installs fine

pip version

24.0

Python version

3.9

OS

Linux

How to Reproduce

  1. python -m pip install --ignore-installed --dry-run "kedro[test]==0.18.13"

Output

ERROR: Cannot install dask[complete]==2021.12.0 and kedro[test]==0.18.13 because these package versions have conflicting dependencies.

The conflict is caused by:
    kedro[test] 0.18.13 depends on dask~=2021.10; extra == "test"
    dask[complete] 2021.12.0 depends on dask 2021.12.0 (from https://files.pythonhosted.org/packages/15/6d/99c63be3ea8a4a651d845addeea1f1b3bb8e5c6730bc26cfb6176631adf7/dask-2021.12.0-py3-none-any.whl (from https://pypi.org/simple/dask/) (requires-python:>=3.7))

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip to attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
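
Note that the two requirements named in the conflict report are not actually contradictory: dask 2021.12.0 falls inside `~=2021.10`. A minimal sketch of the compatible-release rule, assuming plain dotted numeric versions only (no pre-releases, epochs, or local versions); the helper names are hypothetical and real tooling would normally rely on the `packaging` library instead:

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Split a plain dotted version such as '2021.12.0' into integers."""
    return tuple(int(part) for part in version.split("."))


def satisfies_compatible_release(candidate: str, spec: str) -> bool:
    """Check `candidate ~= spec`, i.e. >= spec and same prefix as spec minus its last segment."""
    base = parse_release(spec)
    if len(base) < 2:
        raise ValueError("~= needs at least two release segments")
    found = parse_release(candidate)
    # Pad with zeros so that 2021.10 and 2021.10.0 compare as equal.
    width = max(len(base), len(found))
    padded_base = base + (0,) * (width - len(base))
    padded_found = found + (0,) * (width - len(found))
    prefix = base[:-1]
    return padded_found >= padded_base and padded_found[: len(prefix)] == prefix


print(satisfies_compatible_release("2021.12.0", "2021.10"))  # True
print(satisfies_compatible_release("2022.1.0", "2021.10"))   # False
```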


notatallshaw commented 2 weeks ago

Probably related to https://github.com/pypa/pip/issues/12317, but other requirement examples are fixed by https://github.com/sarugaku/resolvelib/pull/152, so I wanted to track this one, and its peculiarities, separately.

notatallshaw commented 2 weeks ago

Here is a relatively minimal reproducer example that's easier to prove is an issue:

Create a pyproject.toml like so:

[build-system]
requires = ["setuptools"] 
build-backend = "setuptools.build_meta"

[project]
name = "reproducer"
authors = [
    {name = "reproducer"}
]
version = "0.0.1"
description = "reproducer"
dependencies = [
    "adlfs~=2023.1",
    "dask[complete]~=2021.10",
    "gcsfs>=2023.1, <2023.3",
]

Run python -m pip install --ignore-installed --dry-run . and get the error:

ERROR: Cannot install dask[complete]==2021.12.0 and reproducer==0.0.1 because these package versions have conflicting dependencies.

The conflict is caused by:
    reproducer 0.0.1 depends on dask~=2021.10
    dask[complete] 2021.12.0 depends on dask 2021.12.0 (from https://files.pythonhosted.org/packages/15/6d/99c63be3ea8a4a651d845addeea1f1b3bb8e5c6730bc26cfb6176631adf7/dask-2021.12.0-py3-none-any.whl (from https://pypi.org/simple/dask/) (requires-python:>=3.7))

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip to attempt to solve the dependency conflict

Now run python -m pip install --ignore-installed --dry-run "adlfs~=2023.1" "dask[complete]~=2021.10" "gcsfs>=2023.1, <2023.3" and it completes:

Would install Jinja2-3.1.4 MarkupSafe-2.1.5 PyJWT-2.8.0 PyYAML-6.0.1 adlfs-2023.8.0 aiohttp-3.9.5 aiosignal-1.3.1 async-timeout-4.0.3 attrs-23.2.0 azure-core-1.30.2 azure-datalake-store-0.0.53 azure-identity-1.16.1 azure-storage-blob-12.20.0 bokeh-3.4.1 cachetools-5.3.3 certifi-2024.6.2 cffi-1.16.0 charset-normalizer-3.3.2 click-8.1.7 cloudpickle-3.0.0 contourpy-1.2.1 cryptography-42.0.8 dask-2021.12.0 decorator-5.1.1 distributed-2021.12.0 frozenlist-1.4.1 fsspec-2023.1.0 gcsfs-2023.1.0 google-api-core-2.19.0 google-auth-2.30.0 google-auth-oauthlib-1.2.0 google-cloud-core-2.4.1 google-cloud-storage-2.17.0 google-crc32c-1.5.0 google-resumable-media-2.7.1 googleapis-common-protos-1.63.1 idna-3.7 isodate-0.6.1 locket-1.0.0 msal-1.28.1 msal-extensions-1.1.0 msgpack-1.0.8 multidict-6.0.5 numpy-1.26.4 oauthlib-3.2.2 packaging-24.1 pandas-2.2.2 partd-1.4.2 pillow-10.3.0 portalocker-2.8.2 proto-plus-1.23.0 protobuf-4.25.3 psutil-5.9.8 pyasn1-0.6.0 pyasn1_modules-0.4.0 pycparser-2.22 python-dateutil-2.9.0.post0 pytz-2024.1 requests-2.32.3 requests-oauthlib-2.0.0 rsa-4.9 setuptools-70.0.0 six-1.16.0 sortedcontainers-2.4.0 tblib-3.0.0 toolz-0.12.1 tornado-6.4.1 typing_extensions-4.12.2 tzdata-2024.1 urllib3-2.2.1 xyzservices-2024.6.0 yarl-1.9.4 zict-3.0.0
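
The two runs can be scripted side by side. This is a sketch under stated assumptions: the helper names (`write_reproducer`, `build_commands`, `run_comparison`) are hypothetical, the pip flags are the ones used in the commands above, and `run_comparison` needs network access and a pip version that shows the problem.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

PYPROJECT = """\
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "reproducer"
authors = [
    {name = "reproducer"}
]
version = "0.0.1"
description = "reproducer"
dependencies = [
    "adlfs~=2023.1",
    "dask[complete]~=2021.10",
    "gcsfs>=2023.1, <2023.3",
]
"""

DEPENDENCIES = ["adlfs~=2023.1", "dask[complete]~=2021.10", "gcsfs>=2023.1, <2023.3"]
PIP_DRY_RUN = [sys.executable, "-m", "pip", "install", "--ignore-installed", "--dry-run"]


def write_reproducer(directory: Path) -> Path:
    """Write the reproducer pyproject.toml into `directory`."""
    path = directory / "pyproject.toml"
    path.write_text(PYPROJECT, encoding="utf-8")
    return path


def build_commands(project_dir: Path) -> dict[str, list[str]]:
    """Return both invocations: via the project, and with requirements given directly."""
    return {
        "via project": PIP_DRY_RUN + [str(project_dir)],
        "direct": PIP_DRY_RUN + DEPENDENCIES,
    }


def run_comparison() -> dict[str, int]:
    """Run both dry runs and report pip's exit code for each."""
    with tempfile.TemporaryDirectory() as tmp:
        project_dir = Path(tmp)
        write_reproducer(project_dir)
        return {
            label: subprocess.run(command, capture_output=True, text=True).returncode
            for label, command in build_commands(project_dir).items()
        }
```

Calling `run_comparison()` on an affected setup should give a non-zero exit code for the project run and zero for the direct run, matching the outputs above.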

pradyunsg commented 5 days ago

@notatallshaw IIUC, this has been fixed in resolvelib and we need to pull in that fix here?

notatallshaw commented 5 days ago

No, this is an example of a requirement not fixed by the fallback implemented in resolvelib.

Either backjumping needs to be removed (or made optional) to fix this, or we go with https://github.com/sarugaku/resolvelib/issues/134#issuecomment-2180745024 and document that the provider needs to behave in a certain way, after which we will need to update the pip provider to comply.

I am working on a PR on the resolvelib side to document this behavior, and give an example with a test provider, and then we'll see where we go from there.
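
For readers unfamiliar with the terminology: without backjumping, a resolver falls back to plain chronological backtracking, undoing only the most recent pin and trying the next candidate. Below is a toy sketch of that baseline, assuming a made-up in-memory index with integer versions. It is not resolvelib's or pip's implementation, and it ignores extras, markers, and the provider interface.

```python
from typing import Dict, List, Optional

# Toy index (hypothetical data): package -> version -> {dependency: allowed versions}
Index = Dict[str, Dict[int, Dict[str, List[int]]]]


def resolve(
    index: Index,
    wanted: Dict[str, List[int]],
    pinned: Optional[Dict[str, int]] = None,
) -> Optional[Dict[str, int]]:
    """Exhaustive chronological backtracking; returns a pin set or None."""
    pinned = pinned or {}
    for name, allowed in wanted.items():
        if name in pinned:
            if pinned[name] not in allowed:
                return None  # a later requirement excluded an earlier pin
            continue
        # Try candidates newest first; on failure fall through to the next one.
        for version in sorted(index[name], reverse=True):
            if version not in allowed:
                continue
            narrowed = dict(wanted)
            for dep, dep_allowed in index[name][version].items():
                current = narrowed.get(dep, dep_allowed)
                narrowed[dep] = [v for v in current if v in dep_allowed]
            if all(narrowed.values()):  # no requirement became unsatisfiable
                result = resolve(index, narrowed, {**pinned, name: version})
                if result is not None:
                    return result
        return None  # every candidate for `name` failed: backtrack one level
    return pinned


index: Index = {
    "a": {2: {"c": [1]}, 1: {"c": [2]}},
    "b": {1: {"c": [2]}},
    "c": {1: {}, 2: {}},
}
print(resolve(index, {"a": [1, 2], "b": [1]}))  # {'a': 1, 'b': 1, 'c': 2}
```

Here pinning a==2 conflicts with b over c, so the search steps back one level and succeeds with a==1. Because it explores every combination, this approach cannot miss an existing solution, at the cost of potentially much longer resolution times than backjumping.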