python-poetry / poetry

Python packaging and dependency management made easy
https://python-poetry.org
MIT License

Multiple constraint dependency resolution is unexpected #9687

Closed eickr closed 2 months ago

eickr commented 2 months ago

Description

When using the multiple constraints syntax to tell poetry to handle a dependency differently for different versions of python, poetry short-circuits unexpectedly if the entries' version constraints are identical: it does not resolve to the latest viable version for the current environment.

Given the example pyproject.toml file provided, I would expect poetry to install numpy v1.24.4 in a python v3.8 environment and to install numpy v1.26.4 in a python v3.11 environment.

Instead, poetry will select numpy v1.24.4 for both python v3.8 and v3.11 environments. If the venv is clean then it will install numpy v1.24.4 for both environments.
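Interestingly, if numpy v1.26.4 is already installed in a python v3.11 environment and a lock file containing both v1.24.4 and v1.26.4 already exists, then poetry does not downgrade, despite selecting numpy v1.24.4. Without a lock file it does downgrade (see the end of Example 1 in the runtime logs).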

One might argue that the numpy constraint for python v3.8 should be capped below v1.25.0 to make it different, since that's the version that dropped python v3.8 support. However, that knowledge is not always available. Pretend that numpy v1.24 wasn't even available when this example project was last touched. If the dependency resolution worked as expected, then this example project wouldn't need to change the constraint and rebuild.

My best guess about what is happening here is that poetry might be merging the two multiple constraints together into a single numpy = { version = "^1.23.4", python = ">=3.8,<3.12" } constraint. That behavior would make sense... if it resulted in poetry installing numpy v1.26.4 in a python v3.11 environment instead of installing numpy v1.24.4.
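
To make the guess above concrete, here is a toy model of that kind of merging. It is not Poetry's code: `merge_entries` and the half-open ranges over 3.x minor versions are names and simplifications assumed purely for illustration.

```python
from collections import defaultdict

# Each entry is (version_constraint, (py_min_minor, py_max_minor_exclusive)).
# Python ranges are half-open over 3.x minors, e.g. (8, 9) means ">=3.8,<3.9".
Entry = tuple[str, tuple[int, int]]


def merge_entries(entries: list[Entry]) -> list[Entry]:
    """Union the Python ranges of entries that share the same version string."""
    by_version: dict[str, list[tuple[int, int]]] = defaultdict(list)
    for version, py_range in entries:
        by_version[version].append(py_range)

    merged: list[Entry] = []
    for version, ranges in by_version.items():
        ranges.sort()
        current_lo, current_hi = ranges[0]
        for lo, hi in ranges[1:]:
            if lo <= current_hi:  # adjacent or overlapping: extend the range
                current_hi = max(current_hi, hi)
            else:  # gap between ranges: keep them separate
                merged.append((version, (current_lo, current_hi)))
                current_lo, current_hi = lo, hi
        merged.append((version, (current_lo, current_hi)))
    return merged


# Identical version strings collapse into one entry for >=3.8,<3.12.
same = merge_entries([("^1.23.4", (8, 9)), ("^1.23.4", (9, 12))])
# Different version strings are left as two entries.
different = merge_entries([("^1.23.4", (8, 9)), ("^1.23.5", (9, 12))])
print(same)
print(different)
```

In this model the first call yields a single entry and the second keeps both, which matches the "Merging requirements for numpy" and "Different requirements found for numpy" lines in the two runtime logs.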

Workarounds

If you make the dependency's version constraint differ slightly between the entries, then poetry is able to resolve the dependency to the latest viable version that you would expect.

For example, if you replaced the numpy constraint in the example pyproject.toml with the following then it will resolve as expected.

numpy = [
    { version = "^1.23.4", python = ">=3.8,<3.9" },
    { version = "^1.23.5", python = ">=3.9,<3.12" },
]

or

numpy = [
    { version = "^1.23.5", python = ">=3.8,<3.9" },
    { version = "^1.24.0", python = ">=3.9,<3.12" },
]

Either of those will result in poetry installing numpy v1.26.4 in a python v3.11 environment and installing numpy v1.24.4 in a python v3.8 environment.

Poetry Installation Method

pip

Operating System

Ubuntu 22.04

Poetry Version

Poetry (version 1.8.3)

Poetry Configuration

cache-dir = "/home/REDACTED/.cache/pypoetry"
experimental.system-git-client = false
installer.max-workers = null
installer.modern-installation = true
installer.no-binary = null
installer.parallel = true
keyring.enabled = true
solver.lazy-wheel = true
virtualenvs.create = true
virtualenvs.in-project = null
virtualenvs.options.always-copy = false
virtualenvs.options.no-pip = false
virtualenvs.options.no-setuptools = false
virtualenvs.options.system-site-packages = false
virtualenvs.path = "{cache-dir}/virtualenvs"  # /home/REDACTED/.cache/pypoetry/virtualenvs
virtualenvs.prefer-active-python = false
virtualenvs.prompt = "{project_name}-py{python_version}"
warnings.export = true

Python Sysconfig

No response

Example pyproject.toml

[tool.poetry]
name = "tmptest"
version = "0.1.0"
description = ""
authors = ["REDACTED"]

[tool.poetry.dependencies]
python = ">=3.8,<3.12"
numpy = [
    { version = "^1.23.4", python = ">=3.8,<3.9" },
    { version = "^1.23.4", python = ">=3.9,<3.12" }
]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

Poetry Runtime Logs

######## EXAMPLE 1 - Unexpected

REDACTED@REDACTED:~/REDACTED/tmpTest$ poetry3.11 -vvv update
Loading configuration file /home/REDACTED/.config/pypoetry/config.toml
Using virtualenv: /home/REDACTED/.cache/pypoetry/virtualenvs/tmptest-jkdiss-p-py3.11
Updating dependencies
Resolving dependencies...
   1: fact: tmptest is 0.1.0
   1: derived: tmptest
   0: Duplicate dependencies for numpy
   0: Merging requirements for numpy
   1: fact: tmptest depends on numpy (>=1.23.4,<2.0.0)
   1: selecting tmptest (0.1.0)
   1: derived: numpy (>=1.23.4,<2.0.0)
........
Creating new session for pypi.org
Source (PyPI): 15 packages found for numpy >=1.23.4,<2.0.0
   1: fact: numpy (1.26.4) requires Python >=3.9
   1: derived: not numpy (==1.26.4)
   1: fact: numpy (1.26.3) requires Python >=3.9
   1: derived: not numpy (==1.26.3)
   1: fact: numpy (1.26.2) requires Python >=3.9
   1: derived: not numpy (==1.26.2)
   1: fact: numpy (1.26.1) requires Python <3.13,>=3.9
   1: derived: not numpy (==1.26.1)
   1: fact: numpy (1.26.0) requires Python <3.13,>=3.9
   1: derived: not numpy (==1.26.0)
   1: fact: numpy (1.25.2) requires Python >=3.9
   1: derived: not numpy (==1.25.2)
   1: fact: numpy (1.25.1) requires Python >=3.9
   1: derived: not numpy (==1.25.1)
   1: fact: numpy (1.25.0) requires Python >=3.9
   1: derived: not numpy (==1.25.0)
   1: selecting numpy (1.24.4)
   1: Version solving took 0.596 seconds.
   1: Tried 1 solutions.

Finding the necessary packages for the current system
Source (PyPI): 2 packages found for numpy >=1.23.4,<2.0.0

# If venv is fresh then it installs 1.24.4
Package operations: 1 install, 0 updates, 0 removals

  - Installing numpy (1.24.4): Pending...

# If numpy 1.26.4 is already installed and there is no lock file then it downgrades
Package operations: 0 installs, 1 update, 0 removals

  - Downgrading numpy (1.26.4 -> 1.24.4)

# But if numpy 1.26.4 is already installed and a lock existed containing 1.24.4 and 1.26.4 then it does nothing
No dependencies to install or update

######## EXAMPLE 2 - Workaround

REDACTED@REDACTED:~/REDACTED/tmpTest$ poetry3.11 -vvv update
Loading configuration file /home/REDACTED/.config/pypoetry/config.toml
Using virtualenv: /home/REDACTED/.cache/pypoetry/virtualenvs/tmptest-jkdiss-p-py3.11
Updating dependencies
Resolving dependencies...
   1: fact: tmptest is 0.1.0
   1: derived: tmptest
   0: Duplicate dependencies for numpy
   0: Different requirements found for numpy (>=1.23.4,<2.0.0) with markers python_version >= "3.8" and python_version < "3.9" and numpy (>=1.23.5,<2.0.0) with markers python_version >= "3.9" and python_version < "3.12".
   1: Version solving took 0.003 seconds.
   1: Tried 1 solutions.
   0: Retrying dependency resolution with the following overrides ({Package('tmptest', '0.1.0'): {'numpy': <Dependency numpy (>=1.23.4,<2.0.0)>}}).
   1: fact: tmptest is 0.1.0
   1: derived: tmptest
   1: fact: tmptest depends on numpy (>=1.23.4,<2.0.0)
   1: selecting tmptest (0.1.0)
   1: derived: numpy (>=1.23.4,<2.0.0)
........
Creating new session for pypi.org
Source (PyPI): 15 packages found for numpy >=1.23.4,<2.0.0
   1: fact: numpy (1.26.4) requires Python >=3.9
   1: derived: not numpy (==1.26.4)
   1: fact: numpy (1.26.3) requires Python >=3.9
   1: derived: not numpy (==1.26.3)
   1: fact: numpy (1.26.2) requires Python >=3.9
   1: derived: not numpy (==1.26.2)
   1: fact: numpy (1.26.1) requires Python <3.13,>=3.9
   1: derived: not numpy (==1.26.1)
   1: fact: numpy (1.26.0) requires Python <3.13,>=3.9
   1: derived: not numpy (==1.26.0)
   1: fact: numpy (1.25.2) requires Python >=3.9
   1: derived: not numpy (==1.25.2)
   1: fact: numpy (1.25.1) requires Python >=3.9
   1: derived: not numpy (==1.25.1)
   1: fact: numpy (1.25.0) requires Python >=3.9
   1: derived: not numpy (==1.25.0)
   1: selecting numpy (1.24.4)
   1: Version solving took 0.420 seconds.
   1: Tried 1 solutions.
   0: Retrying dependency resolution with the following overrides ({Package('tmptest', '0.1.0'): {'numpy': <Dependency numpy (>=1.23.5,<2.0.0)>}}).
   1: fact: tmptest is 0.1.0
   1: derived: tmptest
   1: fact: tmptest depends on numpy (>=1.23.5,<2.0.0)
   1: selecting tmptest (0.1.0)
   1: derived: numpy (>=1.23.5,<2.0.0)
Source (PyPI): 14 packages found for numpy >=1.23.5,<2.0.0
   1: selecting numpy (1.26.4)
   1: Version solving took 0.002 seconds.
   1: Tried 1 solutions.
   0: Complete version solving took 0.426 seconds with 2 overrides
   0: Resolved with overrides: ({Package('tmptest', '0.1.0'): {'numpy': <Dependency numpy (>=1.23.4,<2.0.0)>}}), ({Package('tmptest', '0.1.0'): {'numpy': <Dependency numpy (>=1.23.5,<2.0.0)>}})

Finding the necessary packages for the current system
Source (PyPI): 2 packages found for numpy >=1.23.5,<2.0.0

# If numpy 1.24.4 is already installed then it updates it
Package operations: 0 installs, 1 update, 0 removals

  - Updating numpy (1.24.4 -> 1.26.4): Pending...

# But if numpy 1.26.4 is already installed then it does nothing
No dependencies to install or update

eickr commented 2 months ago

Might be related to #5858 and/or #5506. This seemed distinct enough to warrant a separate issue (hopefully with a more trivial fix).

dimbleby commented 2 months ago

poetry is just finding the simplest solution that it can to the problem that you set it.

if you want it to install a later version of numpy, then make that a requirement

eickr commented 2 months ago

if you want it to install a later version of numpy, then make that a requirement

How? By increasing the lower bound on the second constraint? That would prevent anything using this project as a dependency from being able to use the previously valid lower versions of numpy.
By putting a tighter upper bound on the first constraint? You don't always know what that upper bound will be. If you guessed too tightly then you've artificially prevented your project from being used as a dependency with those valid later versions in the future, potentially forcing you to come back and rebuild and redistribute.

Neither of those options is desirable. If I could simply use numpy = "^1.23.2" and it would select v1.26.4 in a python v3.11 environment, and select v1.24.4 in a python v3.8 environment, then that would be ideal. But it selects v1.24.4 for both. So I turned to the multiple constraint syntax to make poetry handle each constraint separately. Poetry recognizes them as distinctly different constraints, but against expectations it resolves them no differently than the single constraint syntax.

poetry is just finding the simplest solution that it can to the problem that you set it.

Is it, though? By running poetry update in a python v3.11 environment, the user is effectively telling poetry to solve numpy = "^1.23.4" for python v3.11. After deriving >=1.23.4, <2.0.0 it queries pypi for numpy and finds version v1.26.4 which works for the current environment. Instead of stopping there, it then continues to query pypi, stepping through 8 more lesser versions before deciding that v1.24.4 is the one it wants to select and use.
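
For reference, the walk visible in the Example 1 log can be modelled roughly as below. This is a sketch of what the log suggests (each candidate is checked against the project's whole python range, not the running interpreter), not Poetry's actual solver. `pick_for_project_range` is a hypothetical name, the `<3.13` upper bounds are ignored, and the >=3.8 requirement for 1.24.4 is an assumption since the log does not print it.

```python
# (numpy version, minimum Python 3.x minor it requires), newest first.
# The >=3.9 entries are copied from the log; 3.8 for 1.24.4 is an assumption.
CANDIDATES = [
    ("1.26.4", 9), ("1.26.3", 9), ("1.26.2", 9), ("1.26.1", 9), ("1.26.0", 9),
    ("1.25.2", 9), ("1.25.1", 9), ("1.25.0", 9),
    ("1.24.4", 8),
]


def pick_for_project_range(project_min_minor: int) -> tuple[str, list[str]]:
    """Return the first candidate usable on the lowest supported Python, plus the rejects."""
    rejected: list[str] = []
    for version, min_minor in CANDIDATES:
        if min_minor <= project_min_minor:
            return version, rejected
        rejected.append(version)  # mirrors "derived: not numpy (==...)" in the log
    raise LookupError("no candidate covers the whole Python range")


# The project declares python = ">=3.8,<3.12", so the lowest minor is 8.
selected, rejected = pick_for_project_range(project_min_minor=8)
print(selected, len(rejected))
```

Under these assumptions the model rejects the eight 1.25.x and 1.26.x releases and settles on 1.24.4, regardless of which interpreter runs the command.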

Wouldn't the simplest solution be the following?

  1. Find the different requirements for numpy.
  2. Compare the current environment against the requirements and keep only the applicable ones.
  3. Derive the numpy requirement, which would be numpy (>=1.23.4,<2.0.0) with markers python_version >= "3.9" and python_version < "3.12", as that is the only applicable requirement.
  4. Query pypi for numpy (>=1.23.4,<2.0.0), which returns 15 packages according to the logs above.
  5. The first hit is v1.26.4, which requires Python >=3.9. That meets the full requirements, so it is selected.
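
The five steps above can be sketched as follows. This is only an illustration of the proposal, not anything Poetry implements: `resolve_for_environment` is a hypothetical helper, the release list is abbreviated to three versions from the logs, and the >=3.8 requirement for 1.24.4 is assumed.

```python
# Requirements as (caret lower bound, half-open range of Python 3.x minors).
REQUIREMENTS = [
    ((1, 23, 4), (8, 9)),   # ^1.23.4 for python >=3.8,<3.9
    ((1, 23, 4), (9, 12)),  # ^1.23.4 for python >=3.9,<3.12
]

# (version, minimum Python minor), newest first; abbreviated and partly assumed.
RELEASES = [((1, 26, 4), 9), ((1, 25, 0), 9), ((1, 24, 4), 8)]


def resolve_for_environment(py_minor: int) -> tuple[int, int, int]:
    # Step 2: keep only the requirements whose marker matches this interpreter.
    applicable = [lower for lower, (lo, hi) in REQUIREMENTS if lo <= py_minor < hi]
    if not applicable:
        raise LookupError("no requirement applies to this Python version")
    # Step 3: derive the bounds; a caret on 1.x.y excludes the next major version.
    lower = max(applicable)
    upper = (lower[0] + 1, 0, 0)
    # Steps 4-5: walk the releases newest first and take the first one that fits.
    for version, min_minor in RELEASES:
        if lower <= version < upper and min_minor <= py_minor:
            return version
    raise LookupError("no release satisfies the requirement")


print(resolve_for_environment(11))
print(resolve_for_environment(8))
```

The trade-off is that the result is only valid for the interpreter it was computed for, so a lock file produced this way would not describe the other supported python versions.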

I won't pretend to understand poetry's internals, so maybe I'm way off base and/or there's a specific reason that type of resolution is avoided.

dimbleby commented 2 months ago

If you are satisfied that numpy 1.24.4 is a valid solution for all python versions then you have no reason to complain about poetry choosing it.

For packages that you are intending to distribute, I mostly think of the lock file as a proof of the existence of a solution: it is not the solution that any consumer will necessarily get. Indeed you are under no obligation to install from the lock file yourself: pip install, if you like the answer better.

eickr commented 2 months ago

If you are satisfied that numpy 1.24.4 is a valid solution for all python versions then you have no reason to complain about poetry choosing it.

I would have no complaints if my pyproject.toml file merely contained python = ">=3.8,<3.12", numpy = "^1.23.2". I would have made a feature/change request if the multiple constraint dependency syntax documented that it was intended for entries with identical version values to have their other conditions merged together.

Perhaps I am not adequately conveying why the behavior described in this issue is a problem. Is there any use case where the current behavior is desired?

For packages that you are intending to distribute, I mostly think of the lock file as a proof of the existence of a solution: it is not the solution that any consumer will necessarily get. Indeed you are under no obligation to install from the lock file yourself: pip install, if you like the answer better.

I initially intended to reply that the lock file is merely an artifact affected by this issue, not really related to the core issue here as they aren't required to distribute applications. However, further testing has revealed another unexpected difference in behavior depending on the lock file. See the three comments at the end of example 1 in the poetry runtime logs section.

Regardless, I can only see the advice to use pip as an agreement that there's a problem with poetry. There are already feature requests regarding the ability to specify an arbitrary (valid) version. If the root cause of this issue is the same thing making that feature request difficult to implement, then this can eventually be wrapped up into that.
I've made this issue as it is because this behavior is inconsistent or unintuitive when the current documentation makes it seem like my expectations are supported. "Poetry simply resolves all dependencies listed in your pyproject.toml file and downloads the latest version of their files." Multiple constraints dependencies: "The constraints must have different requirements (like python) otherwise it will cause an error when resolving dependencies."

In the meantime, I've found a workaround that I can consider as acceptable enough:

python = ">=3.8,<3.12"
numpy = [
    { version = "^1.23.2 || >99999", python = ">=3.8,<3.9" },
    { version = "^1.23.2", python = ">=3.9,<3.12" },
]
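
A rough sketch of why this workaround presumably works: the two constraint strings differ, so they are not treated as the same requirement, yet they allow the same real releases because no numpy version is greater than 99999. The matcher below is a deliberately minimal assumption of mine (`allows` handles only `^X.Y.Z` with a non-zero major and `>N`, joined by `||`); it is not Poetry's constraint parser.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn "1.23.2" into (1, 23, 2) so versions compare as tuples."""
    return tuple(int(part) for part in version.split("."))


def allows(constraint: str, version: str) -> bool:
    """True if `version` satisfies any `||`-separated clause of `constraint`."""
    candidate = parse(version)
    for clause in (c.strip() for c in constraint.split("||")):
        if clause.startswith("^"):
            lower = parse(clause[1:])
            # Caret with a non-zero major: allow up to, but excluding, the next major.
            upper = (lower[0] + 1,)
            if lower <= candidate < upper:
                return True
        elif clause.startswith(">"):
            if candidate > parse(clause[1:]):
                return True
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return False


plain = "^1.23.2"
padded = "^1.23.2 || >99999"
releases = ["1.23.1", "1.23.2", "1.24.4", "1.26.4", "2.0.0"]

strings_differ = plain != padded
same_matches = [allows(plain, r) for r in releases] == [allows(padded, r) for r in releases]
print(strings_differ, same_matches)
```

Both checks hold for this sample, which is the point of the trick: the entries look different to the tool while meaning the same thing for every version that exists today.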

dimbleby commented 2 months ago

The docs that you quote do not support your expectation.

It is clearly not the case that poetry will always lock everything to its latest version. (If only it could always be so easy!) In all sorts of circumstances: if you don't like the valid solution that poetry finds to your constraints - then you need to change your constraints.

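You are saying both that

  • earlier versions of numpy are absolutely fine with your project, you don't want to raise the lower bound
  • earlier versions of numpy are not fine with your project, you insist that you want poetry to use a later version

I do not think it is likely that there will be a change to the solver here.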

eickr commented 2 months ago

The docs that you quote do not support your expectation.

It is clearly not the case that poetry will always lock everything to its latest version. (If only it could always be so easy!) In all sorts of circumstances: if you don't like the valid solution that poetry finds to your constraints - then you need to change your constraints.

Would you agree that poetry update will always lock the latest version(s) of the "valid solution(s) that poetry finds to your constraints"?
Would you agree that a "multiple constraint dependency" array containing multiple entries with the same "version" value is valid and supported, at least as long as each of those entries has a unique combination of markers?

You are saying both that

  • earlier versions of numpy are absolutely fine with your project, you don't want to raise the lower bound
  • earlier versions of numpy are not fine with your project, you insist that you want poetry to use a later version

I do not think it is likely that there will be a change to the solver here.

I am not saying that second bullet. The earlier versions down to the minimum bound are fine with the project. Numpy v1.24.4 for python v3.11 is a valid version given the constraints.

The problem, as I see it, is the inconsistent solver behavior seemingly due to the (undocumented) merging of conditional environment markers during solving.

The docs say poetry supports PEP 508 markers. PEP 508 says "A marker expression evaluates to either True or False. When it evaluates to False, the dependency specification should be ignored."
The expression python_version >= "3.8" and python_version < "3.12" is not equivalent to either python_version >= "3.8" and python_version < "3.9" or python_version >= "3.9" and python_version < "3.12".
Given that, and the rest of the available documentation, I can't see why anyone would expect the results of the fourth bullet to match the results of the first bullet instead of the results of the second bullet.
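
A small check of the non-equivalence claimed above, reducing each marker to a half-open range over Python 3.x minor versions. `in_range` and `evaluate_markers` are hypothetical helpers written for this comment, not part of any packaging library.

```python
def in_range(py_minor: int, lo: int, hi: int) -> bool:
    """Evaluate `python_version >= "3.<lo>" and python_version < "3.<hi>"`."""
    return lo <= py_minor < hi


def evaluate_markers(py_minor: int) -> dict[str, bool]:
    """Evaluate the two original markers and the merged one for Python 3.<py_minor>."""
    return {
        "first": in_range(py_minor, 8, 9),    # >=3.8,<3.9
        "second": in_range(py_minor, 9, 12),  # >=3.9,<3.12
        "merged": in_range(py_minor, 8, 12),  # >=3.8,<3.12
    }


for py_minor in (8, 11):
    print(f"3.{py_minor}", evaluate_markers(py_minor))
```

On python 3.8 only the first and merged markers are true, and on python 3.11 only the second and merged markers are true. The merged marker matches the union of the two entries, not either one of them.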

However, PEP 508 also says "It draws a border at the edge of describing a single dependency - the different sorts of dependencies and when they should be installed is a higher level problem.". So, technically, I don't think poetry is violating it by merging multiple specifications into a new non-equivalent specification.
If the current behavior is the intended behavior, then I merely think that the documentation should be improved to make that behavior obvious.
Any arguments for changing the behavior can be made separately after the intended behavior is confirmed, from someone who has far more free time than us.

dimbleby commented 2 months ago

You are exploring non-essential behaviours of the poetry resolver and trying to rely on them.

If poetry happened to do what you want already, and in that world someone showed up complaining that it failed to combine two obviously-combinable requirements - they would have just as much of a point as you do. Either behaviour seems plausible, the docs make no commitment either way, I see no good reason to change in either direction.

fwiw both uv and pdm take the same approach as poetry here, the result of

dependencies = [
    "numpy>=1.23.2,<2.0.0 ; python_version >= '3.8' and python_version < '3.9'",
    "numpy>=1.23.2,<2.0.0 ; python_version >= '3.9' and python_version < '3.12'",
]

for both uv lock and pdm lock is that the lockfile contains the single numpy version 1.24.4.

eickr commented 2 months ago

I agree that, given the reason the solver is doing what it's doing, there is probably no good argument to say one way would be definitively better than the other. Well, unless you wanted to use poetry to manage dependencies for automatic testing without needing multiple pyproject.toml files, but that's well outside the scope of this issue.

In hindsight, I just wish a user didn't have to delve into the source code to find out that poetry will intentionally try to merge the conditionals of all entries for a dependency that have the same version constraint key. Or rather, to find out that the behavior I was seeking is not currently possible intentionally.
After all, what sane person would try to use the multiple constraint syntax like I was if they wanted the behavior that python = ">=3.8,<3.12" and numpy = "^1.23.2" already gave them?

If you think that there's not enough reason to update the documentation to clarify the current behavior and capabilities, then you can close this issue. Maybe this issue itself will be enough for anyone scratching their head to find the clarification they need without needing to understand the source code.

radoering commented 2 months ago

I agree with dimbleby. It is an implementation detail, and by cleverly changing the constraints you can influence Poetry according to your wishes. However, since it is an implementation detail, whether that works is pure luck and can change from one version to another.

A sustainable solution might be a resolution strategy that does one resolution per Python minor version. The downside is that resolution will be much slower with such a strategy. If we ever implement alternative resolution strategies that might be a thing.
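
A minimal sketch of what "one resolution per Python minor version" could look like, assuming a trivial stand-in for the resolver. `latest_for` and `resolve_per_minor` are hypothetical names, the release data is abbreviated from the logs in this issue, and the >=3.8 requirement for 1.24.4 is assumed. No such strategy exists in Poetry as of this thread.

```python
# (numpy version, minimum Python 3.x minor), newest first; abbreviated and partly assumed.
RELEASES = [("1.26.4", 9), ("1.25.0", 9), ("1.24.4", 8)]


def latest_for(py_minor: int) -> str:
    """Stand-in for a full resolution pinned to a single Python minor version."""
    for version, min_minor in RELEASES:
        if min_minor <= py_minor:
            return version
    raise LookupError(f"no release supports Python 3.{py_minor}")


def resolve_per_minor(lo: int, hi: int) -> list[tuple[str, int, int]]:
    """Resolve once per minor in [lo, hi) and group consecutive equal results."""
    groups: list[tuple[str, int, int]] = []
    for py_minor in range(lo, hi):
        version = latest_for(py_minor)
        if groups and groups[-1][0] == version:
            # Same answer as the previous minor: extend that group's upper bound.
            prev_version, start, _ = groups[-1]
            groups[-1] = (prev_version, start, py_minor + 1)
        else:
            groups.append((version, py_minor, py_minor + 1))
    return groups


# python = ">=3.8,<3.12" means minors 8, 9, 10 and 11.
for version, start, end in resolve_per_minor(8, 12):
    print(f"numpy {version} for python >=3.{start},<3.{end}")
```

The loop runs the resolver once per supported minor version (four times here), which illustrates the slowdown mentioned above once the stand-in is replaced by a real dependency resolution.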
