astral-sh / uv

An extremely fast Python package installer and resolver, written in Rust.
https://astral.sh/
Apache License 2.0

Support for platform independent lockfile and SHA-pinning #2679

Open DetachHead opened 3 months ago

DetachHead commented 3 months ago

re-raising https://github.com/astral-sh/rye/issues/881

Hi there, thank you for the great effort put into this amazing project.

I'm asking whether you plan in the future (maybe when uv support is no longer experimental) to overcome the current limitations in order to be able to:

* Generate a platform-independent lockfile

* Pin the dependency version by SHA. You can already do that with pip-tools (option `--generate-hashes`), but apparently this is not available inside the rye CLI

* Check for outdated dependencies (like you can do with pip: `pip list --outdated`)

KotlinIsland commented 3 months ago

Poetry and PDM support this feature. I think it is essential for managing any kind of project that will be developed or run on different platforms.

charliermarsh commented 3 months ago

Yes, we absolutely plan to support this.

considerate commented 2 months ago

@charliermarsh Are you planning on creating a new format or do you plan to export to poetry.lock or pdm.lock?

I would personally be very grateful if you would target poetry.lock since this enables interoperability with nix using poetry2nix.

inoa-jboliveira commented 2 months ago

Right now I have to generate the dependencies for each environment (Windows, Linux, Mac) on different machines when running uv pip compile. It is time-consuming and makes it very difficult for developers to change any dependency.

Instead of a platform-independent lock file, I believe it could be simpler to just let uv accept the platform parameters and build the file for the specified platform.

Something like

uv pip compile --target windows-x86_64
uv pip compile --target linux-i686
uv pip compile --target mac-aarch64

considerate commented 2 months ago

I think the suggestion by @inoa-jboliveira would be a great intermediate step toward cross-platform support for lock files. I would suggest using the first two parts of the target triplet (i.e. cpu-vendor rather than the vendor-cpu order above).

That would yield:

uv pip compile --target x86_64-windows
uv pip compile --target i686-pc-linux
uv pip compile --target aarch64-darwin

charliermarsh commented 2 months ago

Yeah we've considered / are considering such designs but we're unlikely to implement anything like that until we've made a holistic decision on how we want to handle cross-platform locking.

charliermarsh commented 2 months ago

For what it's worth, there's extensive discussion around similar ideas here: https://discuss.python.org/t/lock-files-again-but-this-time-w-sdists/46593/1.

charliermarsh commented 2 months ago

(I believe we could support a --target thing pretty easily. It turns out that a target triple isn't nearly enough to fully specify a "Python platform" given the markers that Python exposes in dependency resolution, but we would probably pick some "prototypical" set of environment markers based on the triple. The resolution wouldn't be guaranteed to work on all x86_64-linux machines, since e.g. the glibc version could differ, but in reality people are already relying on those kinds of resolutions being portable given that they're locking on one machine and using the files on others.)
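To make the "prototypical set of environment markers" idea concrete, here is a rough sketch. It is only an illustration and not how uv implements anything: the target names, the `PROTOTYPICAL_ENVS` table, and the `clause_holds` helper are all hypothetical, and only simple `==` / `!=` clauses are handled. The marker values are the ones CPython typically reports on those platforms.

```python
# Hypothetical table: one "prototypical" marker environment per target name.
# Real machines can differ (e.g. in glibc version), so a resolution made
# against these values is not guaranteed to be portable.
PROTOTYPICAL_ENVS: dict[str, dict[str, str]] = {
    "x86_64-linux": {
        "sys_platform": "linux",
        "platform_system": "Linux",
        "platform_machine": "x86_64",
        "os_name": "posix",
    },
    "aarch64-darwin": {
        "sys_platform": "darwin",
        "platform_system": "Darwin",
        "platform_machine": "arm64",
        "os_name": "posix",
    },
    "x86_64-windows": {
        "sys_platform": "win32",
        "platform_system": "Windows",
        "platform_machine": "AMD64",
        "os_name": "nt",
    },
}


def clause_holds(target: str, key: str, op: str, value: str) -> bool:
    """Evaluate one simple marker clause (e.g. sys_platform == "win32") for a target."""
    actual = PROTOTYPICAL_ENVS[target][key]
    if op == "==":
        return actual == value
    if op == "!=":
        return actual != value
    raise ValueError(f"Unsupported operator {op!r}")


if __name__ == "__main__":
    # Show which targets would pull in a dependency gated on sys_platform == "win32".
    for target in PROTOTYPICAL_ENVS:
        print(target, clause_holds(target, "sys_platform", "==", "win32"))
```

A real resolver would need the full marker grammar (version comparisons, `and` / `or`, `in`), which this sketch leaves out on purpose.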

inoa-jboliveira commented 2 months ago

Not only should it be much simpler to support a target environment (since you kinda do that right now by selecting the current environment), but it also does not stop you from supporting multiple targets later on, so files can be portable across platforms.

In terms of adoption of uv, I believe it is better to provide the functionality and later improve it than to be a blocker for many people. I'm pushing hard to replace all my tooling stack with uv on a large project.

jbcpollak commented 2 months ago

A --target option would be a good interim solution for our use case.

charliermarsh commented 2 months ago

Yeah I may hack on it when I have some time.

paco-sevilla commented 2 months ago

Hi all! FWIW, I wanted to mention that at the company where I work, we use pip-compile-cross-platform to get a single requirements.txt file that works on all platforms. It uses poetry in the background, so it's extremely slow...

It would be awesome if uv could generate an output like that (instead of a requirements.txt targeting a single platform).

charliermarsh commented 2 months ago

I've prototyped this here: https://github.com/astral-sh/uv/pull/3111

charliermarsh commented 2 months ago

The next version will support --python-platform on uv pip compile: https://github.com/astral-sh/uv/pull/3111.

charliermarsh commented 2 months ago

I'm not going to close this though since the broader task here remains.

inoa-jboliveira commented 2 months ago

@charliermarsh Thank you, looks very nice! Is it going to be released soon?

charliermarsh commented 2 months ago

It will probably go out today.

hauntsaninja commented 2 months ago

https://github.com/astral-sh/uv/pull/3111 is awesome and makes getting a "good enough" cross-platform lock file trivial.

Dinky little example script:

import argparse
import os
import subprocess
import sys
import tempfile
from pathlib import Path

from packaging.markers import Marker
from packaging.requirements import Requirement

def parse_requirements_txt(req_file: str) -> list[str]:
    def strip_comments(s: str) -> str:
        try:
            return s[: s.index("#")].strip()
        except ValueError:
            return s.strip()

    entries = []
    with open(req_file) as f:
        for line in f:
            entry = strip_comments(line)
            if entry:
                entries.append(entry)
    return entries

def marker_to_uv(key: str, value: str) -> list[str]:
    if key == "sys_platform":
        if value == "linux":
            uv_platform = "linux"
        elif value == "darwin":
            uv_platform = "macos"
        elif value == "win32":
            uv_platform = "windows"
        else:
            raise ValueError(f"Unknown sys_platform {value}")
        return ["--python-platform", uv_platform]
    if key == "python_version":
        return ["--python-version", value]
    raise ValueError(f"Cannot convert marker {key} to uv input")

Environment = dict[str, str]

def env_to_marker(environment: Environment) -> Marker:
    return Marker(" and ".join(f"({k} == {repr(v)})" for k, v in environment.items()))

def cross_environment_lock(src_file: Path, environment_matrix: list[Environment]) -> str:
    # Base command; per-environment flags are appended inside the loop below.
    cmd = ["uv", "pip", "compile", "--python", sys.executable, "--no-header", "--no-annotate"]
    with tempfile.TemporaryDirectory() as tmpdir:
        # Maps each pinned requirement to the environments whose resolution included it.
        joined: dict[Requirement, list[Environment]] = {}
        for i, environment in enumerate(environment_matrix):
            out_file = os.path.join(tmpdir, src_file.stem + "." + str(i))
            env_cmd = cmd + [str(src_file), "--output-file", out_file]
            for key, value in environment.items():
                env_cmd.extend(marker_to_uv(key, value))
            subprocess.check_call(env_cmd, stdout=subprocess.DEVNULL)

            for r in parse_requirements_txt(out_file):
                joined.setdefault(Requirement(r), []).append(environment)

        common = [r for r, envs in joined.items() if len(envs) == len(environment_matrix)]
        cross_environment = [
            (r, envs) for r, envs in joined.items() if len(envs) != len(environment_matrix)
        ]
        cross_environment.sort(key=lambda r: r[0].name)

        output = [str(r) for r in common]
        for req, environments in cross_environment:
            req = Requirement(str(req))  # make a copy
            joint_marker = Marker(" or ".join(f"({env_to_marker(env)})" for env in environments))
            if req.marker is None:
                req.marker = joint_marker
            else:
                # Note that uv currently doesn't preserve markers, so this branch is unlikely
                # https://github.com/astral-sh/uv/issues/1429
                req.marker._markers = [req.marker._markers, "and", joint_marker._markers]
            output.append(str(req))

        return "\n".join(output)

def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("src_file")
    parser.add_argument("--output-file")
    args = parser.parse_args()

    output = cross_environment_lock(
        Path(args.src_file),
        [
            {"sys_platform": "linux"},
            {"sys_platform": "darwin", "python_version": "3.10"},
        ],
    )
    print(output)
    if args.output_file:
        with open(args.output_file, "w") as f:
            f.write(output)

if __name__ == "__main__":
    main()

charliermarsh commented 2 months ago

Oh wow, that's cool. So you do multiple resolutions, then merge the requirements.txt and gate the entries from each resolution with a marker for that platform?
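Stripped of the uv and `packaging` calls, the merge step described here can be sketched as below. This is a simplified illustration, not the script above: `merge_resolutions` is a hypothetical helper that takes already-resolved pins per marker environment as plain strings, and the pins in the example are made up.

```python
def merge_resolutions(resolutions: dict[str, list[str]]) -> list[str]:
    """Merge per-environment pins into one requirements list.

    `resolutions` maps a marker expression (one per environment) to the pins
    resolved for that environment. Pins present in every environment are
    emitted as-is; the rest are gated on the markers of the environments
    that produced them.
    """
    seen: dict[str, list[str]] = {}
    for marker, pins in resolutions.items():
        for pin in pins:
            seen.setdefault(pin, []).append(marker)

    merged = []
    for pin in sorted(seen):
        markers = seen[pin]
        if len(markers) == len(resolutions):
            merged.append(pin)  # common to all environments, no marker needed
        else:
            gate = " or ".join(f"({m})" for m in markers)
            merged.append(f"{pin} ; {gate}")
    return merged


if __name__ == "__main__":
    # Illustrative pins only; versions are not taken from a real resolution.
    example = {
        'sys_platform == "linux"': ["click==8.1.7"],
        'sys_platform == "win32"': ["click==8.1.7", "colorama==0.4.6"],
    }
    print("\n".join(merge_resolutions(example)))
```

Note that this treats two different versions of the same package as two separate gated entries, which is the behaviour you want when platforms legitimately resolve to different versions.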

hauntsaninja commented 2 months ago

Yup, uv's fast enough that resolving multiple times is totally fine.

Unsolicited thoughts on what I want from a lock file:

  1. Restrict to a finite set of files
     1a. The finite set of files should be reasonably minimal
     1b. Finite here is defined as fixed / close-ended (should refer to the same set of files tomorrow)
  2. Installing in the same marker environment always gets me the same result (modulo insane sdists)
  3. Should be guaranteed to work in marker environments I explicitly ask for
  4. Has a reasonable chance of generalising to other unknown marker environments
     4a. I get a good error if I install into some other marker environment where the resolution didn't generalise and is wrong

With the above:

The best way I see to get 4/4a is something more like: record the input requirements, use the lock file as a source of constraints, and at install time re-resolve the input requirements using the lock file constraints. uv exposes almost all the building blocks you'd need. I think the only things missing are a few features in constraint files (e.g. interactions between constraints and hashes, expressing an "or" constraint; lmk if details are useful).
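A minimal sketch of the "re-resolve with the lock file as constraints" step, assuming the input requirements live in `requirements.in` and the lock output in `requirements.lock.txt` (both file names are hypothetical). It only builds the command, using uv's `-r` and `-c` options, and does not run it. Whether this yields the good error wanted in 4a is not something the thread confirms.

```python
from pathlib import Path


def reresolve_command(inputs: Path, lock: Path) -> list[str]:
    """Build (but do not run) a uv command that installs the input
    requirements while treating the lock file's pins as constraints."""
    return ["uv", "pip", "install", "-r", str(inputs), "-c", str(lock)]


if __name__ == "__main__":
    cmd = reresolve_command(Path("requirements.in"), Path("requirements.lock.txt"))
    # Pass `cmd` to subprocess.check_call(...) to actually run the install.
    print(" ".join(cmd))
```

Because constraints only restrict versions and never add packages, a dependency needed solely on an unlocked platform would still be resolved, just without a pin.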

considerate commented 2 months ago

@hauntsaninja In addition to the above list of nice properties to have in a lock file I'd like to add:

  1. The lock file should store enough information so that no access to any PyPI index needs to be performed to determine where to find the package and which file corresponds to what hash.

    • One way of supporting this would be to add URL annotations to the lock file, as described in #3034.

    • This is important to guarantee reproducible builds. Without this information, a build now relies on the state of the remote PyPI server at time of building to figure out which hash the final package should resolve to.
  2. The lock file should (optionally) be stored in a format that doesn't require specialized parsing to figure out the structure of the lock file

    • This would imply defining a format using a commonly available serialization format such as JSON or TOML
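As a rough illustration of both points, a lock entry could carry the file URL and its hash in a plain JSON document, so that verification needs no index lookup. The field names (`packages`, `name`, `version`, `url`, `sha256`), the helper functions, and the example URL are all hypothetical and do not reflect any format uv, Poetry, or PDM actually uses.

```python
import hashlib
import json


def make_entry(name: str, version: str, url: str, payload: bytes) -> dict[str, str]:
    """Record where a file lives and the SHA-256 of its contents."""
    return {
        "name": name,
        "version": version,
        "url": url,
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


def verify(entry: dict[str, str], payload: bytes) -> bool:
    """Check downloaded bytes against the locked hash, without consulting an index."""
    return hashlib.sha256(payload).hexdigest() == entry["sha256"]


if __name__ == "__main__":
    payload = b"pretend wheel contents"  # stand-in for a downloaded file
    entry = make_entry(
        "example-pkg",
        "1.0.0",
        "https://files.example.invalid/example_pkg-1.0.0-py3-none-any.whl",
        payload,
    )
    lock_text = json.dumps({"packages": [entry]}, indent=2)
    print(lock_text)

    # Any JSON parser can read the lock back; no specialised parsing is needed.
    loaded = json.loads(lock_text)["packages"][0]
    print(verify(loaded, payload))
```

The same structure maps directly onto TOML if that format is preferred.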

brettcannon commented 1 week ago

> Unsolicited thoughts on what I want from a lock file:
>
>   1. Restrict to a finite set of files
>      1a. The finite set of files should be reasonably minimal
>      1b. Finite here is defined as fixed / close-ended (should refer to the same set of files tomorrow)
>   2. Installing in the same marker environment always gets me the same result (modulo insane sdists)
>   3. Should be guaranteed to work in marker environments I explicitly ask for
>   4. Has a reasonable chance of generalising to other unknown marker environments
>      4a. I get a good error if I install into some other marker environment where the resolution didn't generalise and is wrong

That should all be supported in the PEP I have coming (just got back from parental leave, but the draft PEP is written and just waiting on me to write a PoC which requires adapting some pre-existing code; the draft was also sent to @charliermarsh on discuss.python.org privately in March). I will be posting to https://discuss.python.org/c/packaging/14 once the PEP is ready.

zanieb commented 1 week ago

For those following, https://github.com/astral-sh/uv/issues/3347 is tracking the lock file work in progress.