tdejager commented 6 months ago

Problem description

The current implementation uses the conda environment as the base environment for building Python source distributions. It provides the Python interpreter to the uv solve that pixi triggers when solving pypi-dependencies. Even when an environment does not need to be installed, its conda prefix still has to be installed if the environment contains pypi-dependencies, because we do not know beforehand whether any source distributions need to be built.

This results in multiple downsides:

Slow multi environment solve step. https://github.com/prefix-dev/pixi/issues/1046
Unable to solve when the current platform doesn't support the full conda environment. https://github.com/prefix-dev/pixi/issues/1130
It can blow up resource usage on your system, because all environments are installed and not just the one you are actually using. LINK ISSUE
We cannot support a --no-install flag or a lock command (one that does not install) when an environment contains pypi-dependencies. https://github.com/prefix-dev/pixi/issues/1131

Proposal

Give the user the ability to create a custom build environment for the pypi-dependencies that contains only a minimal set of requirements installable on all systems, for example only python, skipping all other dependencies. This also allows for more options, like installing specific dependencies for building a package, and in conjunction with the addition of https://github.com/prefix-dev/pixi/issues/1124 it can avoid having to use pypi-dependencies for building a source dist altogether.

The following project can only be solved on a linux-64 machine:

[project]
name = "project"
platforms = ["linux-64", "osx-arm64"]
channels = ["conda-forge"]

[dependencies]
python = "3.12"

[feature.cuda]
platforms = ["linux-64"]
system-requirements = { cuda = "12.2" }

[feature.cuda.dependencies]
cuda = "*"

[pypi-dependencies]
local = {path = ".", editable = true}

[environments]
cuda = ["cuda"]

Specification in manifest

Option 1: Add build environment to environment

[feature.build.dependencies]
python = "3.10"
hatchling = "*"

[environments]
# define a specific build environment
python-build-env = { features =  ["build"], no-default-feature = true }
# Use said build environment in other environments
cuda = {features = ["cuda"], pypi-build-environment = "python-build-env"}

Pros:

Reuse a lot of the existing logic for environments.
Very explicit.

Cons:

You probably want no-default-feature = true.
You need to specify a build environment per Python interpreter.
If you forget to add the field explicitly to an environment, you automatically get the problems back.
Blows up the lock file.

Option 2: A pypi-build-environment table

[pypi-build-environment]
# Define which dependencies you want to reuse from the feature itself
reuse = ["python", "sdl2"]
# Define the dependencies you need to add in the build environment
dependencies = {hatch = "*"}

[dependencies]
sdl2 = "*"

[feature.py310]
dependencies = { python = "3.10"}

[environments]
cuda = {features = ["cuda", "py310"]}
py310 = ["py310"]

Pros:

Possibly only define it once in the default feature
The ability to specialize it in other features
Once you define the [pypi-build-environment] it is automatically inherited in all other environments.

Cons:

Adding another complex field to the manifest
Needs extra logic to override dependencies in features.
Blows up the lock file.

Option 3: Use host-dependencies

[host-dependencies]
python = {version = "3.10"}
sdl2 = {version = "*"}
pytest = "*"

[feature.foo.host-dependencies]
sdl2 = {version = "2.1"}

This would traverse all environments and figure out the unique and re-usable build environments per environment.
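
A minimal sketch of that traversal, assuming a simplified in-memory manifest. The names `HOST_DEPS`, `ENVIRONMENTS`, `merged_host_deps` and `unique_build_envs` are hypothetical, and "later features override earlier ones" is a simplifying assumption rather than how pixi actually combines specs:

```python
from collections import defaultdict

# Hypothetical, simplified manifest data: host-dependencies per feature,
# and the features that make up each environment.
HOST_DEPS = {
    "default": {"python": "3.10", "sdl2": "*", "pytest": "*"},
    "foo": {"sdl2": "2.1"},
}
ENVIRONMENTS = {
    "default": ["default"],
    "foo": ["default", "foo"],
    "docs": ["default"],
}


def merged_host_deps(features):
    """Merge host-dependencies of the given features.

    Assumption: a later feature simply overrides an earlier one.
    """
    merged = {}
    for feature in features:
        merged.update(HOST_DEPS.get(feature, {}))
    return merged


def unique_build_envs(environments):
    """Group environments that end up with the same host-dependency set."""
    groups = defaultdict(list)
    for name, features in environments.items():
        key = frozenset(merged_host_deps(features).items())
        groups[key].append(name)
    return dict(groups)


build_envs = unique_build_envs(ENVIRONMENTS)
for deps, envs in build_envs.items():
    print(sorted(envs), "share", dict(sorted(deps)))
```

In this toy data, `default` and `docs` would share one build environment, while `foo` needs its own because it pins `sdl2`.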

Pros:

Quite simple specification

Cons:

How to handle system-requirements: sometimes you might want them in the build env, but in the case of cuda you might not.
It might be difficult to find out what is included in the environment by looking at the manifest.
This might be weird if host-dependencies gets a second meaning for a possible pixi build, although you will probably want those dependencies in that case.

How is this going to be backwards compatible?

The current behavior doesn't change: a pypi build environment will be an opt-in feature, since the existing approach does work for simple use-cases.

Alternative solutions to the described problems

There are some alternatives:

- Configure features to be binary-only. This would allow only wheel files, so no Python interpreter is needed for installation. There are challenges:
  1. When a feature has an editable you need a Python interpreter, and this would require installing the prefix once again.
  2. It might be pretty slow when installing a large conda prefix that would not be needed for the build.
  3. You run into the limitation of not having sdists in the real world pretty quickly; binaries are already preferred by uv, so they are eagerly selected.
- Do a double solve: first binary-only and, in case of failure, install the prefix and re-solve with sdist support.
  - Challenges 1 and 3 from above still hold.
  - You might get different behavior when running the solve from one day to the next.
- Allow granular locking per environment.

We do think that wheel-only is a good idea nonetheless that we would want to implement.
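
To illustrate the double-solve alternative, here is a rough sketch. `ResolutionError`, `fake_solve` and `SDIST_ONLY` are hypothetical stand-ins, not uv or pixi APIs; the point is only the control flow of "binary-only first, fall back on failure":

```python
class ResolutionError(Exception):
    """Raised by the hypothetical solver when no solution exists."""


# Hypothetical: packages that only publish source distributions.
SDIST_ONLY = {"legacy-pkg"}


def fake_solve(requirements, binary_only):
    """Stand-in solver: fails in binary-only mode if a package has no wheel."""
    if binary_only and any(name in SDIST_ONLY for name in requirements):
        raise ResolutionError("no wheel available")
    return sorted(requirements)


def double_solve(solve, requirements):
    """Try a wheel-only solve first; fall back to a solve that may build sdists.

    In the fallback case the conda prefix would have to be installed first.
    """
    try:
        return solve(requirements, binary_only=True), "binary-only"
    except ResolutionError:
        return solve(requirements, binary_only=False), "with-sdists"


print(double_solve(fake_solve, ["numpy", "pytest"]))
print(double_solve(fake_solve, ["numpy", "legacy-pkg"]))
```

The second call shows the downside mentioned above: which path is taken depends on what wheels exist at solve time, so the behavior can change from one day to the next.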

pavelzw commented 6 months ago

I find Option 1 a bit verbose if you need to add it to every environment. If we find a good way to incorporate the host dependencies with pixi build, I would be in favor of that. My use cases (use conda packages for everything, only use uv for doing the editable install of .) could look as follows:

Building a library:

[host-dependencies] # or [pypi-build-environment]
python = "*"
hatchling = "*"

[pypi-dependencies]
polarify = {path = ".", editable = true, ignore-dependencies = true, build-isolation = false}

[environments]
default = ["test"]
pl014 = ["pl014", "py39", "test"]
pl015 = ["pl015", "py39", "test"]
# ...

Here, ignore-dependencies means the Python interpreter does not need to be installed at solve time (the dependencies of . are not added to the lockfile anyway). For building the wheel, we use a separate host environment which contains only python and hatchling. hatchling is also no longer contained in the default, pl014, ... environments (different from, but imo better than, how it works now).

Building an application:

[host-dependencies] # or [pypi-build-environment]
python = "*"
hatchling = "*"

[feature.dev.pypi-dependencies] # or [pypi-dependencies]
polarify = {path = ".", editable = true, ignore-dependencies = true, build-isolation = false}

[feature.prod.pypi-dependencies]
polarify = {path = ".", editable = false, ignore-dependencies = true, build-isolation = false}

[environments]
default = { features = ["dev"], solve-group = "prod" }
prod = { features = ["prod"], solve-group = "prod" }

Here, hatchling isn't in the prod and default environments either, and the wheel is built inside the dedicated host environment.

ruben-arts commented 2 weeks ago

Design proposal PyPI build environment

The goal for the users is to make the solves faster and possible on more systems. The old projects should stay working like they used to.

We'll do this by allowing a user to define a pypi build environment.

The manifest

`pixi.toml`

# Add a table to specify the build dependencies used in the pypi solve.
[pypi-build-dependencies]
python = "*"

# You can define it per feature
[feature.cpp.pypi-build-dependencies]
compilers = "*"

# You can define it per target
[target.linux.pypi-build-dependencies]
python = {build_number = "1"}

[feature.py39.dependencies]
python = "3.9"
boltons = "*"

[pypi-dependencies]
pytest = "*"

[environments]
default = ["default"]
cpp = ["default", "cpp"]
py39 = {no-default-feature = true, features = ["py39"]}

`pyproject.toml`

[project]
name = "project"
requires-python = ">=3.10"
dependencies = ["pytest", "numpy"]

[tool.pixi.project]
channels = ["conda-forge"]
platforms = ["linux-64"]

[tool.pixi.dependencies]
pytorch = "*"

[tool.pixi.pypi-build-dependencies]
# Automatically inherit the python version from the default env, which is the minimal version across all platforms solved for.
# This could break if different python versions are used on different platforms; we could work around this by installing a build-env per platform.
python = "*"
# Specifically install this version of compilers.
compilers = "1.2.3"
# Inherit from default env if version matches matchspec, otherwise install using matchspec.
cmake = ">=3.23"
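
The inheritance rule in the comments above could look roughly like the following sketch. `matches` is a deliberately tiny stand-in for a conda matchspec (it only understands `*`, `>=X` and exact `X`), and the locked versions in `default_env` are made up for illustration:

```python
def parse_version(text):
    """Turn '3.23.1' into (3, 23, 1); numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def matches(spec, version):
    """Tiny stand-in for a conda matchspec: supports '*', '>=X' and exact 'X'."""
    if spec == "*":
        return True
    if spec.startswith(">="):
        return parse_version(version) >= parse_version(spec[2:])
    return parse_version(version) == parse_version(spec)


def resolve_build_spec(name, spec, default_env):
    """Inherit the default env's version when it fits the spec, else solve the spec."""
    locked = default_env.get(name)
    if locked is not None and matches(spec, locked):
        return ("inherit", locked)
    return ("solve", spec)


# Hypothetical versions locked in the default conda environment.
default_env = {"python": "3.12.1", "cmake": "3.28.0", "compilers": "1.7.0"}
build_deps = {"python": "*", "compilers": "1.2.3", "cmake": ">=3.23"}

plan = {
    name: resolve_build_spec(name, spec, default_env)
    for name, spec in build_deps.items()
}
print(plan)
```

With this toy data, `python` and `cmake` are inherited from the default env, while `compilers` is solved separately because the locked version does not match `1.2.3`.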

Full Steps

The code logic will consist of the following steps:

Solve all conda envs.
Per solve-group/environment:
- If pypi-build-dependencies is defined, try to use that.
  - If not, use the old logic.
- Find the versions we need to inherit for the pypi-build-dependencies.
- Install the pypi-build-dependencies environment (if possible, just try to solve) on the current machine (no lock).
  - If it already exists and satisfies the dependencies, reuse it.
  - If it cannot be installed, still error for now.
- Solve pypi-dependencies using that environment.
  - Inject the Python version metadata of the solved conda environment.
  - Inject the interpreter of the pypi-build-dependencies environment.
- Push them into the lock.
- Install using the current behavior.
  - Don't use the pypi-build-dependencies env after locking.
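
The per-environment decision flow in these steps can be sketched as follows. This is not pixi's implementation: `plan_pypi_solve`, `BuildEnvError` and the action strings are hypothetical, and "satisfies the dependencies" is reduced to a plain equality check for brevity:

```python
class BuildEnvError(Exception):
    """Raised when the build environment cannot be installed on this machine."""


def plan_pypi_solve(env_name, manifest, installed_build_envs, installable=True):
    """Return the ordered actions for one environment (or solve-group)."""
    build_deps = manifest.get("pypi-build-dependencies")
    if not build_deps:
        # No build environment defined: keep the old logic.
        return ["install full conda prefix", "solve pypi-dependencies", "write lock"]

    actions = []
    # Assumption: an existing build env "satisfies" the deps if they are equal.
    if installed_build_envs.get(env_name) == build_deps:
        actions.append("reuse existing build environment")
    elif installable:
        actions.append("install build environment (not locked)")
    else:
        raise BuildEnvError(f"cannot install build environment for {env_name!r}")

    actions += [
        "solve pypi-dependencies with build-env interpreter and conda Python metadata",
        "write lock",
    ]
    return actions


manifest = {"pypi-build-dependencies": {"python": "*"}}
first = plan_pypi_solve("default", manifest, installed_build_envs={})
second = plan_pypi_solve(
    "default", manifest, installed_build_envs={"default": {"python": "*"}}
)
legacy = plan_pypi_solve("default", {}, installed_build_envs={})
print(first, second, legacy, sep="\n")
```

Note that installing the locked environment afterwards is not part of this plan, which mirrors the last step: the build environment is only used up to and including locking.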

Edge-case

If you can't install the pypi-build-dependencies on the current platform, this is something we can't solve yet. We're thinking about allowing users to specify all required metadata, or allowing for an "unsolved" environment in your lockfile.

tdejager commented 2 weeks ago

An addition to the edge case: we took that idea from uv: https://docs.astral.sh/uv/concepts/resolution/#dependency-metadata. But because of conda we would need it less.

prefix-dev / pixi

Build Environment for PyPI Source Dependencies #1340
