astral-sh / uv

An extremely fast Python package installer and resolver, written in Rust.
https://astral.sh/
Apache License 2.0

Markers of splits in universal bluejay resolution are wrong #4687

Open konstin opened 6 days ago

konstin commented 6 days ago

When resolving transformers, the marker expressions on the forks are nonsensical:

DEBUG Splitting resolution on opencv-python over numpy
DEBUG Pre-fork split universal took 0.017s
DEBUG Solving split python_version >= '3.12' and platform_machine == 'aarch64' and platform_system == 'Darwin' and platform_system == 'Linux'
DEBUG Split python_version >= '3.12' and platform_machine == 'aarch64' and platform_system == 'Darwin' and platform_system == 'Linux' took 1.075s
DEBUG Solving split python_version == '3.9' and platform_machine == 'arm64' and platform_system == 'Darwin'
DEBUG Split python_version == '3.9' and platform_machine == 'arm64' and platform_system == 'Darwin' took 0.270s

`platform_system == 'Darwin' and platform_system == 'Linux'` is ∅, i.e. that split can never be reached, and the union of the splits is not universal.
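As a sanity check (using the packaging library rather than uv's own marker implementation), the first split's marker from the log above never evaluates to true, since platform_system can only ever be a single value:

# Not uv's code: brute-force evaluation of the logged split marker.
from packaging.markers import Marker

split = Marker(
    "python_version >= '3.12' and platform_machine == 'aarch64' "
    "and platform_system == 'Darwin' and platform_system == 'Linux'"
)
env = {"python_version": "3.12", "platform_machine": "aarch64"}
for system in ("Darwin", "Linux", "Windows"):
    # evaluate() overlays the given keys on the current environment's defaults.
    print(system, split.evaluate({**env, "platform_system": system}))
# Prints False for every value of platform_system.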

I've tried to minimize it a bit, but the problem did not occur when using `uv pip install --universal` or when using direct dependencies with `tool.uv.sources`, i.e. I think it needs some previous (dummy?) splits, and hence the half-done MRE:

Use the pyproject.toml files below for the two dummy projects and edit the overrides to point to them: opencv-python triggers the splits, while foo is our reporter.

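# opencv-python/pyproject.toml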
[project]
name = "opencv-python"
version = "4.10.0.84"
description = "Default template for PDM package"
authors = [
    {name = "konstin", email = "konstin@mailbox.org"},
]
dependencies = [
    "cmake>=3.1",
    'numpy >=1.13.3 ; python_version < "3.7"',
    'numpy >=1.21.0 ; python_version <= "3.9" and platform_system == "Darwin" and platform_machine == "arm64"',
    'numpy >=1.21.2 ; python_version >= "3.10"',
    'numpy >=1.21.4 ; python_version >= "3.10" and platform_system == "Darwin"',
    'numpy >=1.23.5 ; python_version >= "3.11"',
    'numpy >=1.26.0 ; python_version >= "3.12"',
    'numpy >=1.19.3 ; python_version >= "3.6" and platform_system == "Linux" and platform_machine == "aarch64"',
    'numpy >=1.17.0 ; python_version >= "3.7"',
    'numpy >=1.17.3 ; python_version >= "3.8"',
    'numpy >=1.19.3 ; python_version >= "3.9"',
    "foo"
]
requires-python = ">=3.9"
license = {text = "MIT"}
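
# foo/pyproject.toml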
[project]
name = "foo"
version = "0.1.0"
description = "Default template for PDM package"
authors = [
    {name = "konstin", email = "konstin@mailbox.org"},
]
dependencies = [
    "anyio==4.3.0; platform_machine == 'x86_64'",
    "maturin>=1,<2"
]
requires-python = ">=3.9"
license = {text = "MIT"}

Run this on a 64-bit x86 machine, or edit the marker to something that matches your machine but is excluded by the lock markers. Run `uv lock -v`. In the lockfile we find:

[[distribution]]
name = "foo"
version = "0.1.0"
source = { directory = "/home/konsti/projects/transformers/foo" }
dependencies = [
    { name = "maturin" },
]

and anyio 4.4.0, missing our constraint! It looks like we're only solving for a subset of markers in this case.
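For completeness, the dropped requirement's marker does match the machine the lock was produced on; checked with the packaging library (not uv) on an x86_64 box:

from packaging.markers import Marker

# platform_machine defaults to platform.machine(), so on a 64-bit x86 machine
# this prints True and the anyio==4.3.0 pin should have been part of the solve.
print(Marker("platform_machine == 'x86_64'").evaluate())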

The marker on opencv-python -> numpy also looks wrong:

[[distribution]]
name = "opencv-python"
version = "4.10.0.84"
source = { directory = "/home/konsti/projects/transformers/opencv-python" }
dependencies = [
    { name = "cmake" },
    { name = "foo" },
    { name = "numpy", marker = "python_version >= '3.7' or (python_version <= '3.9' and platform_machine == 'arm64' and platform_system == 'Darwin') or (python_version >= '3.6' and platform_machine == 'aarch64' and platform_system == 'Linux')" },
]
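It's not obvious to me which part is off; one observation (checked with the packaging library, and possibly not what konstin means) is that with requires-python = ">=3.9" the first disjunct is always satisfied, so the combined marker is effectively always true and none of the per-version numpy bounds survive in it:

from packaging.markers import Marker

marker = Marker(
    "python_version >= '3.7' "
    "or (python_version <= '3.9' and platform_machine == 'arm64' and platform_system == 'Darwin') "
    "or (python_version >= '3.6' and platform_machine == 'aarch64' and platform_system == 'Linux')"
)
# Every environment allowed by requires-python = ">=3.9" satisfies the marker.
for version in ("3.9", "3.10", "3.12"):
    for machine, system in (("x86_64", "Linux"), ("arm64", "Darwin"), ("AMD64", "Windows")):
        assert marker.evaluate({
            "python_version": version,
            "platform_machine": machine,
            "platform_system": system,
        })
print("marker holds for every sampled environment with Python >= 3.9")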
charliermarsh commented 6 days ago

Can you break this down a bit? It seems like you’re reporting multiple issues here - or am I misreading?

konstin commented 6 days ago

No, I think this is one issue: we accumulate the wrong markers and then everything downstream is wrong.

charliermarsh commented 6 days ago

Do you want to look into it?

charliermarsh commented 5 days ago

My guess is... maybe we aren't checking if the fork is disjoint with the parent fork? Like, after we fork, we do:

forked_state.markers.and(fork.markers);
forked_state.markers = normalize(forked_state.markers)
    .unwrap_or(MarkerTree::And(Vec::new()));

But that expression could be ∅.
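One way the logged expression could arise (my guess at concrete values, not taken from the code): a parent fork already scoped to python_version >= '3.12' and platform_machine == 'aarch64' and platform_system == 'Linux' gets ANDed with a new fork on platform_system == 'Darwin', and nothing notices that the two are disjoint. A rough stand-in for such a disjointness check, brute-forced with the packaging library rather than reasoning over uv's marker algebra:

from itertools import product
from packaging.markers import Marker

def possibly_satisfiable(marker: Marker) -> bool:
    # True if at least one sampled environment satisfies the marker.
    versions = ("3.9", "3.10", "3.11", "3.12", "3.13")
    machines = ("x86_64", "aarch64", "arm64", "AMD64")
    systems = ("Linux", "Darwin", "Windows")
    return any(
        marker.evaluate({"python_version": v, "platform_machine": m, "platform_system": s})
        for v, m, s in product(versions, machines, systems)
    )

parent = "python_version >= '3.12' and platform_machine == 'aarch64' and platform_system == 'Linux'"
fork = "platform_system == 'Darwin'"
print(possibly_satisfiable(Marker(f"({parent}) and ({fork})")))  # False: skip this fork instead of solving it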

charliermarsh commented 5 days ago

Although in theory we filter out deps with `possible_to_satisfy_markers`...