bentoml / BentoML

The easiest way to serve AI apps and models - Build reliable Inference APIs, LLM apps, Multi-model chains, RAG service, and much more!
https://bentoml.com
Apache License 2.0

bug: Containerize fails when specifying cuda_version #2784

Closed akuma12 closed 2 years ago

akuma12 commented 2 years ago

Describe the bug

Using bentoml 1.0.0, when I try to specify cuda_version in my bentofile, I get the following output:

ubuntu@ip-10-41-138-217:~/bento_nsfw$ bentoml containerize nsfw-classifier:rvh6tmajhos7sg4c
Building docker image for Bento(tag="nsfw-classifier:rvh6tmajhos7sg4c")...
[+] Building 0.8s (7/7) FINISHED
 => [internal] load build definition from Dockerfile                                                                                         0.0s
 => => transferring dockerfile: 32B                                                                                                          0.0s
 => [internal] load .dockerignore                                                                                                            0.0s
 => => transferring context: 2B                                                                                                              0.0s
 => resolve image config for docker.io/docker/dockerfile:1.4-labs                                                                            0.6s
 => CACHED docker-image://docker.io/docker/dockerfile:1.4-labs@sha256:b50ad4af81d1c76ab7c0e1ffc216909e7adc23e99910243e1c88331c2a8ef52d       0.0s
 => [internal] load build definition from Dockerfile                                                                                         0.0s
 => [internal] load .dockerignore                                                                                                            0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.6.2-cudnn8-runtime-ubuntu20.04                                                     0.0s
error: failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to solve with frontend gateway.v0: rpc error: code = Unknown desc = invalid mount target "/"
Failed building docker image: Command '['docker', 'buildx', 'build', '--progress', 'auto', '--tag', 'nsfw-classifier:rvh6tmajhos7sg4c', '--file', 'env/docker/Dockerfile', '--load', '.']' returned non-zero exit status 1.

Here's my current bentofile.yaml:

service: "service:svc"
labels:
    owner: jim
    stage: dev
include:
  - service.py
  - "clip_autokeras_binary_nsfw/**/*"
  - ViT-L-14.pt
python:
    packages:
    - tensorflow
    - keras
    - autokeras
    - pillow
    - git+https://github.com/openai/CLIP.git
    extra_index_url:
    - "https://download.pytorch.org/whl/cu113"
docker:
    dockerfile_template: "./Dockerfile.template"
    cuda_version: "11.6.2"

If I remove cuda_version from the docker section, it containerizes fine (but obviously doesn't have GPU support). This occurs on my Mac as well as on an Ubuntu server. I searched the open and closed issues and couldn't find anything similar to this.

Thanks for any help you can provide!

To reproduce

No response

Expected behavior

No response

Environment

bentoml: 1.0.0
python: 3.8.10
platform: Linux-5.15.0-1015-aws-x86_64-with-glibc2.29
uid:gid: 1000:1000

pip_packages
```
absl-py==1.2.0
aiofiles==0.8.0
aiohttp==3.8.1
aiosignal==1.2.0
anyio==3.6.1
appdirs==1.4.4
asgiref==3.5.2
asttokens==2.0.5
astunparse==1.6.3
async-timeout==4.0.2
attrs==21.4.0
autokeras==1.0.19
Automat==0.8.0
backcall==0.2.0
bentoml==1.0.0
blinker==1.4
build==0.8.0
cachetools==5.2.0
cattrs==22.1.0
certifi==2019.11.28
chardet==3.0.4
charset-normalizer==2.1.0
circus==0.17.1
Click==7.0
clip @ git+https://github.com/openai/CLIP.git@4d120f3ec35b30bd0f992f5d8af2d793aad98d2a
cloud-init==22.2
cloudpickle==2.1.0
colorama==0.4.3
command-not-found==0.3
commonmark==0.9.1
configobj==5.0.6
constantly==15.1.0
contextlib2==21.6.0
cryptography==2.8
dbus-python==1.2.16
decorator==5.1.1
deepmerge==1.0.1
Deprecated==1.2.13
distro==1.4.0
distro-info===0.23ubuntu1
ec2-hibinit-agent==1.0.0
entrypoints==0.3
exceptiongroup==1.0.0rc8
executing==0.8.3
filelock==3.7.1
flatbuffers==1.12
frozenlist==1.3.0
fs==2.4.16
ftfy==6.1.1
gast==0.4.0
google-auth==2.9.1
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
grpcio==1.47.0
h11==0.13.0
h5py==3.7.0
hibagent==1.0.1
httplib2==0.14.0
hyperlink==19.0.0
idna==2.8
importlib-metadata==4.12.0
incremental==16.10.1
ipython==8.4.0
jedi==0.18.1
Jinja2==3.1.2
jsonpatch==1.22
jsonpointer==2.0
jsonschema==3.2.0
keras==2.9.0
Keras-Preprocessing==1.1.2
keras-tuner==1.1.3
keyring==18.0.1
kt-legacy==1.0.4
language-selector==0.1
launchpadlib==1.10.13
lazr.restfulclient==0.14.2
lazr.uri==1.0.3
libclang==14.0.1
Markdown==3.4.1
MarkupSafe==2.1.1
matplotlib-inline==0.1.3
more-itertools==4.2.0
multidict==6.0.2
netifaces==0.10.4
numpy==1.23.1
oauthlib==3.1.0
opentelemetry-api==1.9.0
opentelemetry-instrumentation==0.28b0
opentelemetry-instrumentation-aiohttp-client==0.28b0
opentelemetry-instrumentation-asgi==0.28b0
opentelemetry-sdk==1.9.0
opentelemetry-semantic-conventions==0.28b0
opentelemetry-util-http==0.28b0
opt-einsum==3.3.0
packaging==21.3
pandas==1.4.3
parso==0.8.3
pathspec==0.9.0
pep517==0.12.0
pexpect==4.6.0
pickleshare==0.7.5
Pillow==9.2.0
pip-tools==6.8.0
prometheus-client==0.13.1
prompt-toolkit==3.0.30
protobuf==3.19.4
psutil==5.9.1
pure-eval==0.2.2
pyasn1==0.4.2
pyasn1-modules==0.2.1
Pygments==2.12.0
PyGObject==3.36.0
PyHamcrest==1.9.0
PyJWT==1.7.1
pymacaroons==0.13.0
PyNaCl==1.3.0
pynvml==11.4.1
pyOpenSSL==19.0.0
pyparsing==3.0.9
pyrsistent==0.15.5
pyserial==3.4
python-apt==2.0.0+ubuntu0.20.4.7
python-dateutil==2.8.2
python-debian===0.1.36ubuntu1
python-dotenv==0.20.0
python-json-logger==2.0.4
python-multipart==0.0.5
pytz==2022.1
PyYAML==5.3.1
pyzmq==23.2.0
regex==2022.7.9
requests==2.22.0
requests-oauthlib==1.3.1
requests-unixsocket==0.2.0
rich==12.5.1
rsa==4.9
schema==0.7.5
SecretStorage==2.3.1
service-identity==18.1.0
simple-di==0.1.5
simplejson==3.16.0
six==1.14.0
sniffio==1.2.0
sos==4.3
ssh-import-id==5.10
stack-data==0.3.0
starlette==0.20.4
systemd-python==234
tensorboard==2.9.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.9.1
tensorflow-estimator==2.9.0
tensorflow-io-gcs-filesystem==0.26.0
termcolor==1.1.0
tomli==2.0.1
torch==1.12.0
torchvision==0.13.0
tornado==6.2
tqdm==4.64.0
traitlets==5.3.0
Twisted==18.9.0
typing_extensions==4.3.0
ubuntu-advantage-tools==27.9
ufw==0.36
unattended-upgrades==0.1
urllib3==1.25.8
uvicorn==0.18.2
wadllib==1.3.3
wcwidth==0.2.5
Werkzeug==2.1.2
wrapt==1.14.1
yarl==1.7.2
zipp==1.0.0
zope.interface==4.7.1
```

aarnphm commented 2 years ago

Can you show us the dockerfile template you are currently using?

parano commented 2 years ago

@akuma12 in addition to the dockerfile template file, could you share the final generated Dockerfile?

You can find it via cat $(bentoml get nsfw-classifier:rvh6tmajhos7sg4c -o path)/env/docker/Dockerfile
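
If it is easier to script the lookup, the same command can be wrapped in Python. This is only a sketch: it assumes the `bentoml` CLI is on `PATH`, and `generated_dockerfile_path` is a hypothetical helper name, not part of BentoML. The `run` parameter exists only so the command runner can be swapped out.

```python
import subprocess
from pathlib import Path


def generated_dockerfile_path(bento_tag: str, run=subprocess.run) -> Path:
    """Return the location of the generated Dockerfile for a built Bento.

    Hypothetical helper: it shells out to `bentoml get <tag> -o path`
    (the command quoted above) and appends env/docker/Dockerfile.
    """
    result = run(
        ["bentoml", "get", bento_tag, "-o", "path"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI call fails
    )
    bento_dir = Path(result.stdout.strip())
    return bento_dir / "env" / "docker" / "Dockerfile"


if __name__ == "__main__":
    # Tag taken from the report above; replace it with your own Bento tag.
    path = generated_dockerfile_path("nsfw-classifier:rvh6tmajhos7sg4c")
    print(path.read_text())
```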

akuma12 commented 2 years ago

For sure, thanks very much. For full disclosure, I tried it without a template and got the same result. I need git in order to install a pip package that's only available on GitHub. Also, just to be sure, I manually pulled the nvidia/cuda image, and that pulled fine.

Here's the Dockerfile.template:

{% extends bento_base_template %}
{% block SETUP_BENTO_BASE_IMAGE %}
{{ super() }}
RUN apt update && apt install -y git
{% endblock %}
{% block SETUP_BENTO_COMPONENTS %}
{{ super() }}
{% endblock %}

Here's the Dockerfile contents:

# syntax = docker/dockerfile:1.4-labs
#
# ===========================================
#
# THIS IS A GENERATED DOCKERFILE. DO NOT EDIT
#
# ===========================================

# Block SETUP_BENTO_BASE_IMAGE
FROM nvidia/cuda:11.6.2-cudnn8-runtime-ubuntu20.04

ENV LANG=C.UTF-8

ENV LC_ALL=C.UTF-8

ENV PYTHONIOENCODING=UTF-8

ENV PYTHONUNBUFFERED=1

USER root

ENV DEBIAN_FRONTEND=noninteractive
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/lib/apt \
    --mount=type=cache,target=/var/cache/apt \
    apt-get update -y \
    && apt-get install -q -y --no-install-recommends --allow-remove-essential \
        ca-certificates gnupg2 bash build-essential

RUN --mount=type=cache,target= \
    --mount=type=cache,target= bash <<EOF
set -euxo pipefail

apt-get update -y
apt-get install -y --no-install-recommends --allow-remove-essential \
        software-properties-common curl

# add deadsnakes ppa to install python
add-apt-repository ppa:deadsnakes/ppa
apt-get update -y

apt-get install -y --no-install-recommends --allow-remove-essential \
python3.8 \
        python3.8-dev \
        python3.8-distutils
EOF

RUN ln -sf /usr/bin/python3.8 /usr/bin/python3 && \
    ln -sf /usr/bin/pip3.8 /usr/bin/pip3

RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
    python3 get-pip.py && \
    rm -rf get-pip.py

RUN apt update && apt install -y git

# Block SETUP_BENTO_USER
ARG BENTO_USER=bentoml
ARG BENTO_USER_UID=1034
ARG BENTO_USER_GID=1034
RUN groupadd -g $BENTO_USER_GID -o $BENTO_USER && useradd -m -u $BENTO_USER_UID -g $BENTO_USER_GID -o -r $BENTO_USER

SHELL [ "/bin/bash", "-eo", "pipefail", "-c" ]

ARG BENTO_PATH=/home/bentoml/bento
ENV BENTO_PATH=$BENTO_PATH
ENV BENTOML_HOME=/home/bentoml/

RUN mkdir $BENTO_PATH && chown bentoml:bentoml $BENTO_PATH -R
WORKDIR $BENTO_PATH

# init related components
COPY --chown=bentoml:bentoml . ./

# Block SETUP_BENTO_COMPONENTS

# Running install.sh to install python packages
RUN --mount=type=cache,target=/root/.cache/pip bash <<EOF
set -euxo pipefail

if [ -f /home/bentoml/bento/env/python/install.sh ]; then
  echo "install.sh to install python packages..."
  chmod +x /home/bentoml/bento/env/python/install.sh
  /home/bentoml/bento/env/python/install.sh
fi
EOF

# Running user setup scripts
RUN  bash <<EOF
set -euxo pipefail

if [ -f /home/bentoml/bento/env/docker/setup_script ]; then
  echo "user setup scripts..."
  chmod +x /home/bentoml/bento/env/docker/setup_script
  /home/bentoml/bento/env/docker/setup_script
fi
EOF

# Block SETUP_BENTO_ENTRYPOINT
RUN rm -rf /var/lib/{apt,cache,log}
# Default port for BentoServer
EXPOSE 3000

RUN chmod +x /home/bentoml/bento/env/docker/entrypoint.sh

USER bentoml

ENTRYPOINT [ "/home/bentoml/bento/env/docker/entrypoint.sh" ]

CMD [ "bentoml", "serve", "/home/bentoml/bento", "--production" ]

aarnphm commented 2 years ago

To install additional system packages, you can list them under system_packages:

docker:
  system_packages:
    - git

Regarding the generated Dockerfile, I will create a PR to fix this.
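
For anyone hitting the same error before the fix lands: the generated Dockerfile above contains two `--mount=type=cache,target=` options with no path after `target=`, which looks consistent with the `invalid mount target "/"` message. That reading is an inference from the output in this thread, not a confirmed diagnosis. A minimal sketch for spotting such lines follows; `find_empty_mount_targets` is a hypothetical helper, and it only checks the `target` key.

```python
import re
from typing import List

# A BuildKit mount option looks like `--mount=key=value,key=value`;
# the option string runs until the next whitespace character.
MOUNT_RE = re.compile(r"--mount=(\S+)")


def find_empty_mount_targets(dockerfile_text: str) -> List[int]:
    """Return 1-based line numbers that have a --mount with an empty target."""
    bad_lines = []
    for lineno, line in enumerate(dockerfile_text.splitlines(), start=1):
        for options in MOUNT_RE.findall(line):
            # Split "type=cache,target=" into {"type": "cache", "target": ""}.
            fields = dict(f.partition("=")[::2] for f in options.split(","))
            if fields.get("target") == "":
                bad_lines.append(lineno)
                break  # report each line at most once
    return bad_lines


if __name__ == "__main__":
    # Path as used by `bentoml containerize` relative to the Bento directory.
    with open("env/docker/Dockerfile") as fh:
        for lineno in find_empty_mount_targets(fh.read()):
            print(f"line {lineno}: --mount has an empty target")
```

Run from inside the Bento directory, this prints the line numbers to inspect; it does not change the file.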

aarnphm commented 2 years ago

The issue will be tracked there and fixed when we release 1.0.1. Thanks for spotting this 😄