triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Build error when building new image on top of the `nvcr.io/nvidia/tritonserver:24.04-py3-sdk` container image from NGC #7243

Open jackylu0124 opened 5 months ago

jackylu0124 commented 5 months ago

**Description**
I am trying to build my own Triton client container image based on the `nvcr.io/nvidia/tritonserver:24.04-py3-sdk` container image, in which I install additional packages with pip. The issue arises when I try to install Flask, at which point I get the following error logs:

    2.793 Installing collected packages: MarkupSafe, itsdangerous, click, blinker, Werkzeug, Jinja2, Flask
    2.879   Attempting uninstall: blinker
    2.880     Found existing installation: blinker 1.4
    2.881 ERROR: Cannot uninstall 'blinker'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

Upon further debugging, I found that the `blinker` package is required by Flask, and that there is also a `python3-blinker` package managed by apt-get, which I believe is what "a distutils installed project" in the build error log refers to. In other words, I think the error is caused by the conflicting presence of `blinker` in both places, managed by different package managers.
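
The diagnosis above can be checked from inside the container with a short Python sketch. It uses the standard-library `importlib.metadata` module to report which installer recorded the distribution and whether a list of installed files is available. The helper name `describe_install` and its `search_path` parameter are made up for illustration, and a missing file list is only a rough proxy for the condition pip reports, not pip's exact check.

```python
from importlib import metadata


def describe_install(name, search_path=None):
    """Summarize how a distribution was installed (hypothetical helper).

    search_path is an optional list of directories to inspect instead of
    sys.path; it exists mainly so the helper can be tried on a test directory.
    """
    kwargs = {"name": name}
    if search_path is not None:
        kwargs["path"] = list(search_path)

    # Take the first match, i.e. the copy that appears earliest on the path.
    dist = next(iter(metadata.distributions(**kwargs)), None)
    if dist is None:
        return {"installed": False, "installer": None, "has_file_list": False}

    # INSTALLER is written by installers such as pip; legacy installs lack it.
    installer = (dist.read_text("INSTALLER") or "").strip() or None

    # dist.files is None when no file listing was recorded, in which case an
    # uninstaller cannot know which files belong to the project.
    return {
        "installed": True,
        "installer": installer,
        "has_file_list": dist.files is not None,
    }


if __name__ == "__main__":
    print(describe_install("blinker"))
```

Running this in the base image and again after applying a workaround gives a quick before-and-after comparison of the `blinker` installation that Python actually sees.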

I also searched online, and people mostly suggest one of the following two workarounds, or a combination of the two:

1. Add the `RUN apt-get remove -y python3-blinker` step before installing the `requirements.txt` file with pip (see the updated Dockerfile below).

       # syntax=docker/dockerfile:1

       FROM nvcr.io/nvidia/tritonserver:24.04-py3-sdk AS base

       RUN apt-get update -y

       RUN apt-get remove -y python3-blinker

       COPY requirements.txt requirements.txt
       RUN pip3 install -r requirements.txt

       FROM base AS primary

       # Build command: docker build --rm --target=primary -t my_triton_client .

2. Add the `RUN pip3 install --ignore-installed --no-cache-dir blinker` step before installing the `requirements.txt` file with pip (see the updated Dockerfile below).

       # syntax=docker/dockerfile:1

       FROM nvcr.io/nvidia/tritonserver:24.04-py3-sdk AS base

       RUN apt-get update -y

       RUN pip3 install --ignore-installed --no-cache-dir blinker

       COPY requirements.txt requirements.txt
       RUN pip3 install -r requirements.txt

       FROM base AS primary

       # Build command: docker build --rm --target=primary -t my_triton_client .


I think either one alone should work as a workaround for the time being, but I don't know which is better, less "hacky", and has fewer side effects. On one hand, [https://stackoverflow.com/questions/53807511/pip-cannot-uninstall-package-it-is-a-distutils-installed-project](https://stackoverflow.com/questions/53807511/pip-cannot-uninstall-package-it-is-a-distutils-installed-project) discusses the potential issues of using the `--ignore-installed` flag in the `RUN pip3 install --ignore-installed --no-cache-dir blinker` command, since we are essentially overwriting the original installation. On the other hand, `python3-blinker` is a system package ([https://github.com/googlecolab/colabtools/issues/3976](https://github.com/googlecolab/colabtools/issues/3976)), and I don't know whether removing it with the `RUN apt-get remove -y python3-blinker` command would introduce any unexpected side effects.
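
One way to observe the side effect of the `--ignore-installed` route is to list every copy of a distribution that Python can find. The sketch below assumes that pip and apt install into different directories on `sys.path`, so the old copy may remain next to the new one after the install. The helper name `find_copies` and its `search_path` parameter are hypothetical.

```python
from importlib import metadata


def find_copies(name, search_path=None):
    """List (version, directory) for each copy of a distribution found.

    Results follow the order of the search path, so the first entry is the
    copy that an import would normally pick up. Hypothetical helper; the
    optional search_path defaults to sys.path.
    """
    kwargs = {"name": name}
    if search_path is not None:
        kwargs["path"] = list(search_path)

    copies = []
    for dist in metadata.distributions(**kwargs):
        # locate_file("") resolves to the directory holding the metadata.
        copies.append((dist.version, str(dist.locate_file(""))))
    return copies


if __name__ == "__main__":
    for version, location in find_copies("blinker"):
        print(version, location)
```

If this prints more than one line after the `--ignore-installed` step, the apt-managed copy is still present and is only shadowed by the pip-installed one. After the `apt-get remove` route, at most one copy would be expected.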

**Therefore, in addition to reporting this issue, I would like to ask for your suggestions on what I should do in order to resolve the build issue. Thank you very much for your time and help in advance!**

Reference links to similar issues I found online:
1. [https://stackoverflow.com/questions/53807511/pip-cannot-uninstall-package-it-is-a-distutils-installed-project](https://stackoverflow.com/questions/53807511/pip-cannot-uninstall-package-it-is-a-distutils-installed-project)
2. [https://stackoverflow.com/questions/49932759/pip-10-and-apt-how-to-avoid-cannot-uninstall-x-errors-for-distutils-packages](https://stackoverflow.com/questions/49932759/pip-10-and-apt-how-to-avoid-cannot-uninstall-x-errors-for-distutils-packages)
3. [https://stackoverflow.com/questions/50421287/pip-cannot-uninstall-ipython-it-is-a-distutils-installed-project-and-thus-w](https://stackoverflow.com/questions/50421287/pip-cannot-uninstall-ipython-it-is-a-distutils-installed-project-and-thus-w)
4. [https://github.com/embedchain/embedchain/issues/506](https://github.com/embedchain/embedchain/issues/506)
5. [https://github.com/googlecolab/colabtools/issues/3976](https://github.com/googlecolab/colabtools/issues/3976)
6. [https://github.com/rsmusllp/king-phisher/issues/300](https://github.com/rsmusllp/king-phisher/issues/300)
7. [https://forums.developer.nvidia.com/t/error-when-doing-docker-build-for-jetson-inference/288085](https://forums.developer.nvidia.com/t/error-when-doing-docker-build-for-jetson-inference/288085)

**Triton Information**

> What version of Triton are you using?

I am using Triton version 2.45.0 and its corresponding container image `nvcr.io/nvidia/tritonserver:24.04-py3-sdk` on NGC.

> Are you using the Triton container or did you build it yourself?

I am using the `nvcr.io/nvidia/tritonserver:24.04-py3-sdk` container image from NGC.

**To Reproduce**
I have included both the `Dockerfile` and also the `requirements.txt` in the attached zipped file. I am also pasting the content of both files below for convenience. To reproduce the build error, I think you just have to run the `docker build --rm --target=primary -t my_triton_client .` command in the directory containing the `Dockerfile` and then you will see the build error logs shown above.

`Dockerfile`:

    # syntax=docker/dockerfile:1

    FROM nvcr.io/nvidia/tritonserver:24.04-py3-sdk AS base

    RUN apt-get update -y

    COPY requirements.txt requirements.txt
    RUN pip3 install -r requirements.txt

    FROM base AS primary

    # Build command: docker build --rm --target=primary -t my_triton_client .


`requirements.txt`:

Flask==3.0.3



**Expected behavior**
The build error should not happen when I install additional packages (e.g. `Flask`) on top of the `nvcr.io/nvidia/tritonserver:24.04-py3-sdk` container image. I tried installing `Flask` in the same workflow on top of the `nvcr.io/nvidia/tritonserver:23.03-py3-sdk` container image, and there was no build error during the image building process.

mc-nv commented 5 months ago

I believe the issue is pretty straightforward. If a customer faces an issue with installed products, they can modify the image to suit their own needs, but no support for customization will be provided.

I haven't found any issue with the following instructions within nvcr.io/nvidia/tritonserver:24.04-py3-sdk container:

    apt remove -y python3-blinker python-blinker-doc
    pip3 install Flask

jackylu0124 commented 5 months ago

Hi @mc-nv, thank you very much for your reply and insights!

  1. Could you elaborate on what you mean by the "instructions within nvcr.io/nvidia/tritonserver:24.04-py3-sdk container", and where can I find those instructions?
  2. Also, could you please explain why we need to remove `python-blinker-doc` in addition to `python3-blinker`?
  3. Is using `apt remove -y python3-blinker python-blinker-doc` preferable to doing `pip3 install --ignore-installed --no-cache-dir blinker`?

Thanks a lot for your time and help again!

jackylu0124 commented 5 months ago

Hi @mc-nv, a quick follow-up on this issue: I ran `apt list --installed` and also `dpkg -l` in the `nvcr.io/nvidia/tritonserver:24.04-py3-sdk` container, but I can only find the `python3-blinker` package and cannot find the `python-blinker-doc` package.

  1. Could you please explain why we need to remove `python-blinker-doc` in addition to `python3-blinker`?
  2. Could you elaborate on what you mean by the "instructions within nvcr.io/nvidia/tritonserver:24.04-py3-sdk container", and where can I find those instructions? I couldn't find the instructions inside the container; could you please let me know where they are located?
  3. Is using `apt remove -y python3-blinker python-blinker-doc` preferable to doing `pip3 install --ignore-installed --no-cache-dir blinker`?
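
The package check described in this comment can also be scripted. The sketch below assumes a Debian or Ubuntu based image where `dpkg-query` is available. It asks for each package name and status separated by a tab, then keeps the fully installed packages whose names contain a given substring. The function names are hypothetical.

```python
import shutil
import subprocess


def parse_dpkg_query(output):
    """Parse 'name<TAB>status' lines into a {name: status} dict."""
    packages = {}
    for line in output.splitlines():
        name, sep, status = line.partition("\t")
        if sep and name:
            packages[name.strip()] = status.strip()
    return packages


def installed_matching(output, needle):
    """Return sorted names of fully installed packages containing needle."""
    return sorted(
        name
        for name, status in parse_dpkg_query(output).items()
        if needle in name and status == "install ok installed"
    )


def query_dpkg():
    """Run dpkg-query and return its raw output (Debian/Ubuntu only)."""
    result = subprocess.run(
        ["dpkg-query", "-W", "--showformat=${Package}\t${Status}\n"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Skip quietly on systems without dpkg-query.
    if shutil.which("dpkg-query"):
        print(installed_matching(query_dpkg(), "blinker"))
```

A package that is absent from the result is not installed as far as dpkg is concerned.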

Thanks a lot for your time and help again!

jackylu0124 commented 4 months ago

Hi @mc-nv, a quick follow-up to this issue. Thank you!

mc-nv commented 4 months ago

Sorry, I have nothing to add on this issue.

The given error indicates that you can't update packages installed with apt using pip. That's why you have to remove it first.

jackylu0124 commented 4 months ago

Hi @mc-nv, thank you for your reply!

  1. I understand that we have to remove the package because of the error, but my question is why we have to remove `python-blinker-doc` in addition to `python3-blinker`, instead of just removing `python3-blinker`. Could you also explain where the `python-blinker-doc` package is in the container? I cannot find it in the output of `apt list --installed` or `dpkg -l`.

  2. Could you elaborate on what you mean by the "instructions within nvcr.io/nvidia/tritonserver:24.04-py3-sdk container", and where can I find those instructions? I couldn't find the instructions inside the container; could you please let me know where they are located?

  3. Is using `apt remove -y python3-blinker python-blinker-doc` preferable to doing `pip3 install --ignore-installed --no-cache-dir blinker`?

Thank you for your time and help again!

jackylu0124 commented 4 months ago

A quick follow-up on the questions above, thanks!

jackylu0124 commented 3 months ago

A quick follow-up on the questions above, thanks!