pypa / hatch

Modern, extensible Python project management
https://hatch.pypa.io/latest/
MIT License

[Feature Request] Dependency-specific index urls #1560

Open johnpyp opened 4 weeks ago

johnpyp commented 4 weeks ago

Some packages, like PyTorch, recommend installing through custom index URLs, e.g. from that page:

# To Install:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0

Though we could use URL prioritization (with uv to make it consistent) for this, it would be better in this case to support dependency-scoped index URLs. That avoids leaking the extra index URL check to every other dependency, which unexpectedly widens the supply chain scope of all dependencies.

Skypekey commented 3 weeks ago

Did you mean this? https://hatch.pypa.io/dev/config/dependency/#direct-references I think it could be useful here.

johnpyp commented 3 weeks ago

I don't think so, as that specifies an exact artifact to fetch, rather than the registry from which to resolve the given dependency.

Skypekey commented 3 weeks ago

Oh, I understand now. You need to specify a PyPI-style source for certain packages. Forgive my misunderstanding.

polarathene commented 2 weeks ago

Just sharing my findings here if helpful.


PDM

PDM has a kinda nice way to approach this (falters if you want a project to support multiple PyTorch sources though):

[project]
name = "example"

dependencies = [
    "torch", # Implicitly resolves to `2.3.1+cu121` via configured PyTorch source below
    "torchvision",
    "torchaudio",
]
requires-python = ">=3.10"

[tool.pdm.resolution]
respect-source-order = true

[tool.pdm]
distribution = false

[[tool.pdm.source]]
name = "pytorch"
url = "https://download.pytorch.org/whl/cu121"
include_packages = ["torch", "torchvision", "torchaudio", "nvidia-*"]

The nvidia-* at the end there is to ensure that the torch deps resolve to the implicit nvidia-* packages from the torch index. Otherwise they'd come from PyPI, where presently some of those were resolving to CUDA 12.5 instead of the compatible CUDA 12.1 that these packages are intended to use.

That may be relevant context for you to keep in mind with your request to scope deps, as you may otherwise encounter that same caveat.
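
To make the nvidia-* caveat concrete, here is a minimal Python sketch of how a glob list like include_packages could decide whether a package is scoped to the PyTorch source. It assumes shell-style glob semantics and is not PDM's actual implementation; PYTORCH_SOURCE and is_scoped_to are hypothetical names used only for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory copy of the [[tool.pdm.source]] table above.
# This is NOT PDM's real data model, just an illustration.
PYTORCH_SOURCE = {
    "name": "pytorch",
    "url": "https://download.pytorch.org/whl/cu121",
    "include_packages": ["torch", "torchvision", "torchaudio", "nvidia-*"],
}


def is_scoped_to(source: dict, package: str) -> bool:
    """Return True if any include_packages glob matches the package name.

    Assumption: patterns are shell-style globs compared case-sensitively.
    """
    return any(fnmatchcase(package, pattern) for pattern in source["include_packages"])


if __name__ == "__main__":
    for name in ["torch", "nvidia-cudnn-cu12", "numpy"]:
        print(name, is_scoped_to(PYTORCH_SOURCE, name))
```

Without the nvidia-* pattern, a name such as nvidia-cudnn-cu12 would not match any entry, which is the situation where those transitive deps fall back to the default index.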


They also have optional dependency groups:

dependencies = [
    "torchvision",
    "torchaudio",
]

[project.optional-dependencies]
torch_cpu = ["torch==2.3.1+cpu"]
torch_cuda = ["torch==2.3.1+cu121"]

[[tool.pdm.source]]
name = "pytorch-cuda-12.1"
url = "https://download.pytorch.org/whl/cu121"
include_packages = ["torch_cuda", "torchvision", "torchaudio", "nvidia-*"]

[[tool.pdm.source]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
include_packages = ["torch_cpu", "torchvision", "torchaudio"]

You'd then select the optional dep as a group with a command like pdm install --group torch_cpu. However, this won't work as expected, because the CUDA source overlaps on the shared torchvision and torchaudio packages. You'd need to migrate those packages from dependencies to the optional-dependencies table, with each group providing the explicit local version identifier, which enforces an exact version pin like with torch (you cannot use >=).

If you don't specify the torchvision + torchaudio packages in each source's include_packages, then they'd resolve from the fallback default index (PyPI).

Each would need to maintain separate lock files with PDM too. You could work around the pyproject.toml issues mentioned by using separate pyproject.toml files; for PDM at least, there doesn't seem to be interest in improving the flexibility here. There is a third-party plugin that provides an alternative way to configure torch deps (it generates separate lock files).

The include_packages setting will bias the package to that source AFAIK, but other packages will still attempt to resolve through all indexes, including these (unless explicitly excluded). With respect-source-order = true, priority seems to follow the order in which the sources are declared, with PyPI as the default unless you have include_packages declared.
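
As a rough model of the behaviour described above (my reading of it, not PDM's actual resolver), the lookup order could be sketched like this in Python. candidate_sources and SOURCES are hypothetical names, exclude_packages is assumed to use the same glob style as include_packages, and the explicit "pypi" entry stands in for the default index.

```python
from fnmatch import fnmatchcase


def candidate_sources(package: str, sources: list) -> list:
    """Return the names of sources to consult for `package`, in declaration order.

    Assumed rules (a simplification, not PDM's code):
    - if any source lists the package in include_packages, only those are used;
    - otherwise every source is consulted unless it excludes the package.
    """
    def matches(patterns):
        return any(fnmatchcase(package, p) for p in patterns)

    pinned = [s["name"] for s in sources if matches(s.get("include_packages", []))]
    if pinned:
        return pinned
    return [s["name"] for s in sources if not matches(s.get("exclude_packages", []))]


# Hypothetical source list; "pypi" models the default index.
SOURCES = [
    {"name": "pypi", "url": "https://pypi.org/simple"},
    {
        "name": "pytorch",
        "url": "https://download.pytorch.org/whl/cu121",
        "include_packages": ["torch", "torchvision", "torchaudio", "nvidia-*"],
    },
]

if __name__ == "__main__":
    print(candidate_sources("torch", SOURCES))
    print(candidate_sources("requests", SOURCES))
```

Under these assumptions an unrelated package like requests still has the PyTorch index among its candidates, which is the leak the original request wants to avoid.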


With Rye / Hatch

These are my observations so far at least 😅