Closed bethac07 closed 1 month ago
(I strongly suspect there is a better thing to do, but I don't understand pixi well enough to know what it is)
I'm not sure I understand the issue. The shell-hook is set as the entrypoint, so you wouldn't need to manually prepend anything when using the run command.
For example:
❯ cat Dockerfile
FROM ghcr.io/prefix-dev/pixi:0.31.0-jammy AS build
WORKDIR /src
RUN pixi init
RUN pixi add scikit-image
RUN pixi shell-hook > /shell-hook.sh
RUN echo 'exec "$@"' >> /shell-hook.sh
FROM ubuntu:jammy AS production
COPY --from=build /src/ /src/
COPY --from=build /src/.pixi/envs/default/ /src/.pixi/envs/default/
COPY --from=build /shell-hook.sh /shell-hook.sh
WORKDIR /src
ENTRYPOINT ["/bin/bash", "/shell-hook.sh"]
❯ docker run --rm -it test:0.0.1 env
HOSTNAME=7bb2cb519e8c
PIXI_PROMPT=(src)
PWD=/src
CONDA_PREFIX=/src/.pixi/envs/default
PIXI_PROJECT_MANIFEST=/src/pixi.toml
PIXI_PROJECT_NAME=src
HOME=/root
PIXI_ENVIRONMENT_NAME=default
PIXI_IN_SHELL=1
PIXI_EXE=/usr/local/bin/pixi
TERM=xterm
SHLVL=0
PIXI_PROJECT_VERSION=0.1.0
PIXI_PROJECT_ROOT=/src
CONDA_DEFAULT_ENV=src
PIXI_ENVIRONMENT_PLATFORMS=linux-64
PATH=/src/.pixi/envs/default/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
❯ docker run --rm -it test:0.0.1 which python
/src/.pixi/envs/default/bin/python
❯ docker run --rm -it test:0.0.1 python -m site
sys.path = [
'/src',
'/src/.pixi/envs/default/lib/python312.zip',
'/src/.pixi/envs/default/lib/python3.12',
'/src/.pixi/envs/default/lib/python3.12/lib-dynload',
'/src/.pixi/envs/default/lib/python3.12/site-packages',
]
USER_BASE: '/root/.local' (doesn't exist)
USER_SITE: '/root/.local/lib/python3.12/site-packages' (doesn't exist)
ENABLE_USER_SITE: True
❯ docker run --rm -it test:0.0.1 python -c "import skimage; print(skimage.__version__)"
0.24.0
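For anyone following along, the entrypoint pattern above boils down to a wrapper script: export the environment, then `exec "$@"` so the wrapper is replaced by whatever command was passed to `docker run`. Here is a minimal sketch that runs outside Docker. The temp directory, the `/hypothetical/env` path, and the `DEMO_ENV_PREFIX` variable are made-up stand-ins for what `pixi shell-hook` really emits.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir="$(mktemp -d)"
hook="$workdir/shell-hook.sh"

# Stand-in for the output of `pixi shell-hook` (the real one exports many more variables).
cat > "$hook" <<'EOF'
export PATH="/hypothetical/env/bin:$PATH"
export DEMO_ENV_PREFIX="/hypothetical/env"
EOF
# Same trick as in the Dockerfile: hand control to the command given as arguments.
echo 'exec "$@"' >> "$hook"

# Rough equivalent of ENTRYPOINT ["/bin/bash", "/shell-hook.sh"]
# combined with `docker run IMAGE printenv DEMO_ENV_PREFIX`.
entrypoint_output="$(/bin/bash "$hook" printenv DEMO_ENV_PREFIX)"
echo "inside wrapper: $entrypoint_output"

# The calling shell itself is untouched: the exports only live in the wrapped process.
outer_value="${DEMO_ENV_PREFIX:-unset}"
echo "outside wrapper: $outer_value"

rm -rf "$workdir"
```

The second `echo` shows the limitation: anything that does not go through the wrapper never sees the exported variables.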
Sorry, was making notes to myself and didn't end up getting back to them yesterday; was just coming here to update.
The issue isn't regular container execution. The issue is that the entrypoint isn't run when you FROM the container, so when you want to, say, use a second container to FROM the first container and then pip install gradio, hypothetically, pip isn't found.
I spent some time dorking around with this this morning, and I suspect the solution falls in the realm of adding something like SHELL ["/bin/bash", "/shell-hook.sh"] to the tool container, but I can't seem to find the right combo of shell flags to make things work quite right. I mostly get exec: not found issues when trying to wrap the tool container in a simple Dockerfile like the one below.
FROM instanseg_pixi:latest
RUN echo hello
RUN pip install plotly
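A possible explanation for the exec: not found errors: for a shell-form RUN, Docker appends the whole command string as one extra argument after the SHELL array. With SHELL ["/bin/bash", "/shell-hook.sh"], the hook's `exec "$@"` is then asked to run a program literally named "echo hello". The sketch below imitates that outside Docker. `run_like_docker`, the stub hook, and `DEMO_ENV_ACTIVE` are hypothetical, and the "working" variant has not been verified in a real build.

```shell
#!/usr/bin/env bash
set -uo pipefail

workdir="$(mktemp -d)"
hook="$workdir/shell-hook.sh"
# Stub hook: one export plus the same `exec "$@"` tail used in the thread.
printf '%s\n' 'export DEMO_ENV_ACTIVE=1' 'exec "$@"' > "$hook"

# Hypothetical helper: mimics a shell-form RUN, i.e. the SHELL array
# followed by the entire command string as ONE argument.
run_like_docker() {
  local cmd="$1"; shift
  "$@" "$cmd"
}

# SHELL ["/bin/bash", "/shell-hook.sh"]  +  RUN echo hello
# exec receives the single word "echo hello" and cannot find such a program.
broken_output="$(run_like_docker 'echo hello' /bin/bash "$hook" 2>&1)"
broken_status=$?
echo "broken: status=$broken_status output=$broken_output"

# SHELL ["/bin/bash", "/shell-hook.sh", "/bin/bash", "-c"]  +  RUN echo hello; ...
# The hook now execs `/bin/bash -c "<command>"` with the exports already in place.
working_output="$(run_like_docker 'echo hello; echo "active=$DEMO_ENV_ACTIVE"' \
  /bin/bash "$hook" /bin/bash -c)"
working_status=$?
echo "working: status=$working_status"
echo "$working_output"

rm -rf "$workdir"
```

If this holds in a real build, the missing piece is a trailing "/bin/bash", "-c" in the SHELL array, so that the appended string is interpreted by a shell instead of being treated as a program name.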
It's not a great idea to pip install into pixi environments, so pip is removed in pixi. The approach to take would be to install the pixi binary and use it:
FROM instanseg_pixi:latest AS build
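RUN apt-get update && apt-get install -y curl
RUN curl -fsSL https://pixi.sh/install.sh | bash
# sourcing .bashrc in a RUN step does not persist to later steps, so put pixi on PATH instead
ENV PATH="/root/.pixi/bin:$PATH"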
RUN pixi add plotly # or pixi add --pypi plotly if you specifically want pypi packages instead of conda-forge
# regenerate the shell-hook.sh as before if doing a multistage build
It's not a great idea to pip install into pixi environments, so pip is removed in pixi
Nope, this works -
docker run --rm -it instanseg_pixi:latest pip install plotly
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
Collecting plotly
Downloading plotly-5.24.1-py3-none-any.whl.metadata (7.3 kB)
Collecting tenacity>=6.2.0 (from plotly)
Downloading tenacity-9.0.0-py3-none-any.whl.metadata (1.2 kB)
Requirement already satisfied: packaging in ./.pixi/envs/default/lib/python3.9/site-packages (from plotly) (24.1)
Downloading plotly-5.24.1-py3-none-any.whl (19.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.1/19.1 MB 19.1 MB/s eta 0:00:00
Downloading tenacity-9.0.0-py3-none-any.whl (28 kB)
Installing collected packages: tenacity, plotly
Successfully installed plotly-5.24.1 tenacity-9.0.0
That's a side issue, though, because it's not that I want to put more stuff into the tool container as such. It's that the second stage of bilayers, the interface creation, needs access to a working copy of Python to install its dependencies (gradio, jupyter, etc.) during its build. We'd previously said we wanted that Python to be provided by the tool container, and I think in general we want it provided by the tool's Python environment specifically, if it's a Python tool (otherwise tools like Instanseg that must be run with python file.py --cli-flag-1 etc. won't work). So we need the pixi environment active when the container is FROMed, which again I suspect we can do with SHELL; I just haven't gotten the magic set of flags right yet. Make more sense?
Nope, this works - docker run --rm -it instanseg_pixi:latest pip install plotly
Hmm. That's very weird and not reproducible in the general case:
❯ cat Dockerfile
FROM ghcr.io/prefix-dev/pixi:0.31.0-jammy AS build
WORKDIR /src
RUN pixi init
RUN pixi add scikit-image
RUN pixi shell-hook > /shell-hook.sh
RUN echo 'exec "$@"' >> /shell-hook.sh
FROM ubuntu:jammy AS production
COPY --from=build /src/ /src/
COPY --from=build /src/.pixi/envs/default/ /src/.pixi/envs/default/
COPY --from=build /shell-hook.sh /shell-hook.sh
WORKDIR /src
ENTRYPOINT ["/bin/bash", "/shell-hook.sh"]
❯ docker build -t test:0.0.1 --file Dockerfile .
❯ cat Dockerfile.extend
FROM test:0.0.1
RUN apt-get update && apt-get -y install curl
RUN curl -fsSL https://pixi.sh/install.sh | bash
SHELL ["/bin/bash", "-l", "-c"]
ENV PATH="/root/.pixi/bin:$PATH"
RUN pixi add plotly
RUN pixi shell-hook > /shell-hook.sh
RUN echo 'exec "$@"' >> /shell-hook.sh
ENTRYPOINT ["/bin/bash", "/shell-hook.sh"]
❯ docker build -t test-ext:0.0.1 --file Dockerfile.extend .
❯ docker run --rm -it test-ext:0.0.1 pip --version
/shell-hook.sh: line 14: exec: pip: not found # as it normally should be
needs access to a working copy of python to install the dependencies
That should not be a problem:
❯ docker run --rm -it test-ext:0.0.1 which python
/src/.pixi/envs/default/bin/python
So we need the pixi environment active when the container is FROMed
Again, see above: I FROM it in the second Dockerfile and am able to install deps into the environment.
Yes, I see it's possible, by adding pixi back in, but I don't think that's what we want to do in interface creation, do we? In a perfect world, I don't think we want to have to track whether our tool containers need Gradio-pixi vs Gradio-conda vs Gradio-pip; I think that adds a bunch of complexity. It might be unavoidable but I'd like to figure out a way to avoid it if we can.
I don't think any tinkering with the SHELL instruction is going to do the trick. I have banged my head against that several times and it never does what I want. For instance, you might think this would do the trick:
SHELL ["/bin/bash", "-c", "source /shell-hook.sh &&"]
But it does not, and neither do similar things. I would absolutely love to be shown a working way to do that, but I don't know of a way to source a script that sets env variables before each RUN command without prepending it to the RUN commands.
That's all incidental though. If we're not using pixi, then there's not much useful that shell-hook provides:
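On why the `source /shell-hook.sh &&` attempt fails: with `bash -c STRING ARG`, the extra argument is not appended to STRING; it becomes `$0`. The RUN command therefore never joins the dangling `&&`. The sketch below shows this outside Docker, plus one possible workaround in which the string evaluates `$0` itself. `DEMO_HOOK` and `DEMO_ENV_ACTIVE` are hypothetical names, and the workaround has not been tried in a real build.

```shell
#!/usr/bin/env bash
set -uo pipefail

workdir="$(mktemp -d)"
export DEMO_HOOK="$workdir/shell-hook.sh"
# Stub hook with the same `exec "$@"` tail as the generated script in the thread.
printf '%s\n' 'export DEMO_ENV_ACTIVE=1' 'exec "$@"' > "$DEMO_HOOK"

# The appended RUN string lands in $0 instead of being concatenated to the -c string.
zeroth="$(/bin/bash -c 'printf "%s" "$0"' 'pip install plotly')"
echo "\$0 is: $zeroth"

# So SHELL ["/bin/bash", "-c", "source /shell-hook.sh &&"] leaves a dangling `&&`.
dangling_output="$(/bin/bash -c 'source "$DEMO_HOOK" &&' 'echo hello' 2>&1)"
dangling_status=$?
echo "dangling && -> status=$dangling_status"

# Possible workaround: let the -c string run $0 after sourcing the hook.
# When sourced this way there are no positional arguments, so the hook's
# trailing `exec "$@"` has nothing to exec and is effectively a no-op.
eval_output="$(/bin/bash -c 'source "$DEMO_HOOK" && eval "$0"' \
  'echo "active=$DEMO_ENV_ACTIVE"')"
echo "$eval_output"

rm -rf "$workdir"
```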
export PATH="/src/.pixi/envs/default/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
export CONDA_PREFIX="/src/.pixi/envs/default"
export PIXI_PROJECT_NAME="src"
export PIXI_PROJECT_VERSION="0.1.0"
export PIXI_EXE="/usr/local/bin/pixi"
export PIXI_IN_SHELL="1"
export PIXI_PROJECT_MANIFEST="/src/pixi.toml"
export PIXI_PROJECT_ROOT="/src"
export CONDA_DEFAULT_ENV="src"
export PIXI_ENVIRONMENT_NAME="default"
export PIXI_ENVIRONMENT_PLATFORMS="linux-64"
export PIXI_PROMPT="(src) "
exec "$@"
The only thing that is agnostic of pixi the tool is the modification to PATH, and maybe CONDA_PREFIX if we were to use conda/mamba in the image:
export PATH="/src/.pixi/envs/default/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
export CONDA_PREFIX="/src/.pixi/envs/default"
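To make the PATH point concrete, here is a small sketch of what prepending the environment's bin directory does for command lookup. It uses a throwaway directory and a stub executable named `demo-tool`; both are hypothetical and stand in for /src/.pixi/envs/default/bin and pip.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway stand-in for /src/.pixi/envs/default/bin
env_root="$(mktemp -d)"
env_bin="$env_root/envs/default/bin"
mkdir -p "$env_bin"

# Stub executable playing the role of pip inside the environment.
printf '%s\n' '#!/bin/sh' 'echo "demo-tool from env"' > "$env_bin/demo-tool"
chmod +x "$env_bin/demo-tool"

# Before activation the tool is not resolvable.
before="$(command -v demo-tool || echo 'not found')"
echo "before: $before"

# Shell equivalent of: ENV PATH="<env bin>:$PATH"
PATH="$env_bin:$PATH"

after="$(command -v demo-tool)"
tool_output="$(demo-tool)"
echo "after: $after"
echo "$tool_output"
```

Nothing else from the hook is needed for the lookup to succeed, which is why a single ENV line in a child Dockerfile can be enough when pixi itself is not going to be called.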
So this approach works:
❯ cat Dockerfile
FROM ghcr.io/prefix-dev/pixi:0.31.0-jammy AS build
WORKDIR /src
RUN pixi init
RUN pixi add scikit-image
# any other installations
RUN pixi add pip
RUN pixi shell-hook > /shell-hook.sh
RUN echo 'exec "$@"' >> /shell-hook.sh
FROM ubuntu:jammy AS production
COPY --from=build /src/ /src/
COPY --from=build /src/.pixi/envs/default/ /src/.pixi/envs/default/
COPY --from=build /shell-hook.sh /shell-hook.sh
WORKDIR /src
ENTRYPOINT ["/bin/bash", "/shell-hook.sh"]
❯ cat Dockerfile.extend
FROM test:0.0.2
ENV PATH="/src/.pixi/envs/default/bin:$PATH"
RUN pip install plotly
I had pixi install pip in the base image, which is a bit icky, but meh.
It fixes the prepending issue, but of course, setting ENV to prepend that specific path is still conditional on whether we're extending a pixi image. Actually, it's specific to the base image's workdir as well, because of the /src/.
But that raises the question: how would we avoid the issue in tools that use conda/mamba or virtualenv, where we have to somehow activate the environment beforehand? If we're going to be mostly wrapping other pre-built dockers, then I'm not sure we can avoid some amount of conditioning on pip vs conda/mamba vs (perhaps more and more) pixi.
Allegedly, automatic prepending is possible with conda. I'm trying it now; this environment takes an hour to resolve in non-mamba conda (blech), so TBD. But one thing that definitely DOES work is using conda env update in the pytorch container's base conda environment - see below.
(We should probably pair on this at this point rather than going back and forth in comments! Lmk if today, tomorrow, etc. is good. We should indeed try to come to some best practices for the specific and also the general case. If we can figure out a generic strategy that works for activating environments (more likely, strategies: one for venv, one for conda, one for pixi), hopefully we can create lightweight guidance along the lines of 'if someone wants to use tool X, which comes in a container that requires activation, we have documentation on lightweight Dockerfiles they can use to turn that into a tool container that doesn't need manual activation, which is then the thing that goes into the bilayers CI/CD'.)
(base) bcimini@wm4f8-761 instanseg_inference % docker run --rm -it instanseg_conda:latest pip install plotly
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
Collecting plotly
Downloading plotly-5.24.1-py3-none-any.whl.metadata (7.3 kB)
Collecting tenacity>=6.2.0 (from plotly)
Downloading tenacity-9.0.0-py3-none-any.whl.metadata (1.2 kB)
Requirement already satisfied: packaging in /opt/conda/lib/python3.10/site-packages (from plotly) (23.1)
Downloading plotly-5.24.1-py3-none-any.whl (19.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.1/19.1 MB 21.7 MB/s eta 0:00:00
Downloading tenacity-9.0.0-py3-none-any.whl (28 kB)
Installing collected packages: tenacity, plotly
Successfully installed plotly-5.24.1 tenacity-9.0.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
(base) bcimini@wm4f8-761 instanseg_inference % cat Dockerfile_from
FROM instanseg_conda:latest
#SHELL ["/bin/bash", "/shell-hook.sh"]
RUN echo hello
RUN pip install plotly
(base) bcimini@wm4f8-761 instanseg_inference % docker run --rm -it test_from:latest
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
root@6625919db332:/instanseg# which python
/opt/conda/bin/python
root@6625919db332:/instanseg# pip freeze | grep Ins
-e git+https://github.com/instanseg/instanseg.git@b9c9d6b6860f548a4b6bb9151270e390d9bfe52e#egg=InstanSeg
root@6625919db332:/instanseg# pip freeze | grep plot
matplotlib-inline @ file:///opt/conda/conda-bld/matplotlib-inline_1662014470464/work
plotly==5.24.1
Hmm. That's very weird
BTW this answers where pip came from.
Allegedly, an automatic prepending seems to be possible with conda
The same thing is doable with pixi, but only if pixi is actually available.
For the rest, we'll have to sync up. I might be wrong, but from skimming through the Dockerfile, I think the reason it's working that way is that the instanseg environment is created but never actually used. Instead, everything is installed into base, and hence no activation is needed.
In discussion, we figured out ways to solve things by setting both SHELL and ENTRYPOINT, especially because SHELL propagates when that image is FROMed. Hooray!
Result of discussion: SHELL is indeed the answer (on top of keeping the pixi binary around), as it propagates across docker builds.
❯ cat Dockerfile
FROM ghcr.io/prefix-dev/pixi:0.31.0-jammy
WORKDIR /src
RUN pixi init
RUN pixi add scikit-image
# any other installations
RUN pixi add pip
SHELL ["pixi", "run", "/bin/bash", "-c"]
ENTRYPOINT ["pixi", "run"]
❯ cat Dockerfile.extend
FROM test-ext:0.0.1
WORKDIR /src
RUN pip install plotly
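As a starting point for the per-manager strategies floated earlier in the thread, here is a rough sketch of a helper that prints the Dockerfile lines for each case. Only the pixi lines come from the result above. The conda lines assume `conda run -n NAME` (with `--no-capture-output` for the entrypoint), and the venv lines assume that putting the venv's bin directory first on PATH is sufficient. `activation_lines` and the example environment names are hypothetical, and none of this has been tested against real images.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print Dockerfile lines that keep an environment active
# for later RUN steps (SHELL) and for `docker run` (ENTRYPOINT or PATH).
activation_lines() {
  local manager="$1" env_ref="${2:-}"
  case "$manager" in
    pixi)
      # From the thread's result; requires the pixi binary to stay in the image.
      printf '%s\n' \
        'SHELL ["pixi", "run", "/bin/bash", "-c"]' \
        'ENTRYPOINT ["pixi", "run"]'
      ;;
    conda)
      # Assumption: env_ref is the conda environment name.
      printf 'SHELL ["conda", "run", "-n", "%s", "/bin/bash", "-c"]\n' "$env_ref"
      printf 'ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "%s"]\n' "$env_ref"
      ;;
    venv)
      # Assumption: env_ref is the venv directory; ENV persists into child images.
      printf 'ENV VIRTUAL_ENV="%s"\n' "$env_ref"
      printf 'ENV PATH="%s/bin:$PATH"\n' "$env_ref"
      ;;
    *)
      echo "unknown environment manager: $manager" >&2
      return 1
      ;;
  esac
}

pixi_lines="$(activation_lines pixi)"
conda_lines="$(activation_lines conda instanseg)"
venv_lines="$(activation_lines venv /opt/venv)"

printf '%s\n\n%s\n\n%s\n' "$pixi_lines" "$conda_lines" "$venv_lines"
```

The output could be appended to a thin wrapper Dockerfile that FROMs the tool's published image, giving a tool container that needs no manual activation downstream.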
What type of feature are you requesting?
Something else
Feature Details
Everything you want to run {EDITED TO ADD: in a second Dockerfile that you want to FROM the first Dockerfile} will need to be prepended with the shell-hook, which is a bummer :(
Additional context
No response