prefix-dev / pixi

Package management made easy
https://pixi.sh
BSD 3-Clause "New" or "Revised" License

Facilitate docker 'layering' #1652

Open olivier-lacroix opened 1 month ago

olivier-lacroix commented 1 month ago

Problem description

To speed up Docker image builds, it is fairly typical to separate "things that change often" from "things that don't".

For dependency management, this usually means installing dependencies in a Docker layer first (as they do not change often), then installing the local package(s) (which change at nearly every build, so a lot more often).

A very simple Python Dockerfile may look like:

FROM python:3.12.2-slim

WORKDIR /app

COPY requirements.txt .

RUN pip install -r requirements.txt

COPY sample.py .

where requirements are installed before copying local code, to maximise cache hits.
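To make the caching argument concrete, here is a toy model of layer caching in Python. It is only a sketch under simplifying assumptions, not Docker's actual implementation: each step's cache key is assumed to depend on the parent key, the instruction text, and the content of any files the step copies. The names `layer_keys` and `rebuilt_steps`, the `CODE_FIRST` ordering, and the file contents are all hypothetical and only serve the illustration.

```python
import hashlib


def layer_keys(steps, files):
    """Return one cache key per step.

    steps: list of (instruction, copied_file_names) tuples.
    files: mapping of file name -> file content.
    Each key mixes in the parent key, so a change invalidates every later step.
    """
    keys = []
    parent = ""
    for instruction, copied in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        for name in copied:
            h.update(files[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(steps, old_files, new_files):
    """List the instructions whose cache key differs between two builds."""
    old = layer_keys(steps, old_files)
    new = layer_keys(steps, new_files)
    return [instr for (instr, _), a, b in zip(steps, old, new) if a != b]


# Ordering from the Dockerfile above: dependencies first, code last.
DEPS_FIRST = [
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY sample.py .", ["sample.py"]),
]

# Hypothetical alternative: everything copied before installing.
CODE_FIRST = [
    ("COPY requirements.txt sample.py ./", ["requirements.txt", "sample.py"]),
    ("RUN pip install -r requirements.txt", []),
]

# Illustrative file contents; only sample.py changes between the two builds.
before = {"requirements.txt": "requests\n", "sample.py": "print('v1')\n"}
after = {**before, "sample.py": "print('v2')\n"}

print(rebuilt_steps(DEPS_FIRST, before, after))
print(rebuilt_steps(CODE_FIRST, before, after))
```

In this model, editing only `sample.py` re-runs just the final `COPY` step in the first ordering, whereas the second ordering also re-runs the `pip install` step.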

It is not currently clear how something similar could be achieved with pixi. A couple of options come to mind:

  1. Do not include local code in the environment, and use this environment only to install dependencies. In that case, local code would need to be installed by something other than pixi, which does not feel ideal.
  2. Have two environments, one 'dependencies-only' and one 'dependencies + local code': install the first, then copy the code, then install the second. This may be better, but feels like a bit of a hack, since we have two environments when we actually need only one, and pixi does extra work to solve the additional environment.

Another alternative (and better?) option could be extra options for pixi install. For instance --no-path-dependencies and --only-path-dependencies to respectively exclude path dependencies from solve & install, or deal only with them (ideally without doing any solve work, bypassing as many steps as possible to speed things up).

The Dockerfile may then look like

FROM python:3.12.2-slim

WORKDIR /app

COPY pixi.lock pixi.toml ./

RUN pixi install --locked --no-path-dependencies

COPY sample-package . 

RUN pixi install --locked --only-path-dependencies

tdejager commented 1 month ago

Thanks for the issue, this seems like something we should facilitate. @pavelzw, could you chime in here with your experience?

pavelzw commented 1 month ago

Hmm, I get the advantage of this, although I'm a bit worried that this makes the CLI more complicated and cluttered. I can't think of any better solution than Olivier's, so I guess this could make sense in pixi...

Side note: I've not really used path dependencies because of https://github.com/prefix-dev/pixi/issues/1046 and https://github.com/prefix-dev/pixi/issues/1340#issuecomment-2107093071


With the postinstall method that I have been using so far, we can simply sidestep this problem:

FROM ghcr.io/prefix-dev/pixi

WORKDIR /app
COPY pixi.lock pixi.toml ./
RUN pixi install --locked
COPY pyproject.toml my_app ./
RUN pixi run postinstall

But since postinstall could be considered a hack as well, this might not be what we want.

olivier-lacroix commented 1 month ago

Thanks @pavelzw for your thoughts on this!

As additional background, looking at other tools:

conda and pip options are a bit different from what is proposed here, which can be explained by the fact that they do not use lockfiles. Poetry's --no-directory option, however, feels very similar to what is proposed here.

ruben-arts commented 1 month ago

With pixi build in the back of my mind, I think it would make sense to have some of these options available, both for the PyPI workflow described in this issue and beyond it.

I believe pixi install could be more full-featured.

I don't see why --no-deps would make sense, but please enlighten me.

olivier-lacroix commented 3 weeks ago

@ruben-arts I agree that --no-deps is less useful. It tends to be used to bypass costly checks, and/or to defer the installation of dependencies to a later step/stage.

olivier-lacroix commented 3 weeks ago

I am not sure using the same command, pixi install, for both environment-level operations (as it is today) and package-level operations is the best option; I feel it may be confusing.

Maybe a separate command for package-level operations would be better? E.g. pixi force or something like that? After all, these operations are likely outside of the "happy path".