The resulting Docker image is about 10 GB, which is too large for some container registries.
Is there a way to reduce the image size to less than 5 GB?
The image you provide on the Docker registry is much smaller.
REPOSITORY   TAG      IMAGE ID       CREATED         SIZE
vllmtest5    latest   d86273b9420d   2 minutes ago   7.9GB
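To see which build step accounts for most of that size, the per-layer sizes can be listed with `docker history`. This is a minimal sketch, not part of the original post: `layer_sizes` is a hypothetical helper name, and the image reference passed to it is assumed to be the `vllmtest5:latest` image listed above.

```shell
# Hypothetical helper (not from the original post): print each layer's size
# next to the Dockerfile instruction that created it.
layer_sizes() {
    # $1: image reference, for example vllmtest5:latest
    docker history --no-trunc --format '{{.Size}}: {{.CreatedBy}}' "$1"
}
```

Calling `layer_sizes vllmtest5:latest` on a machine where the image exists prints one line per layer, which shows whether the base image or the `pip install` step dominates.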
How you are installing vllm
Dockerfile
FROM nvidia/cuda:11.8.0-base-ubuntu22.04 AS vllm-base
ARG VLLM_VERSION=0.4.2
ARG VLLM_PYTHON_VERSION=310
WORKDIR /vllm-workspace
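# The default RUN shell (/bin/sh) does not expand {apt,dpkg,...} braces,
# so remove the apt package lists by explicit path instead.
RUN apt-get update -y \
&& apt-get install -y python3-pip \
&& apt-get clean && apt-get autoremove --yes \
&& rm -rf /tmp/* /var/lib/apt/lists/*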
# Workaround for https://github.com/openai/triton/issues/2507 and
# https://github.com/pytorch/pytorch/issues/107960 -- hopefully
# this won't be needed for future versions of this docker image
# or future versions of triton.
RUN ldconfig /usr/local/cuda-11.8/compat/
# install vllm wheel first, so that torch etc will be installed
RUN python3 -m pip install --upgrade pip
RUN pip install -vvv https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu118-cp${VLLM_PYTHON_VERSION}-cp${VLLM_PYTHON_VERSION}-manylinux1_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118 \
&& rm -Rf /root/.cache/pip \
&& python3 -m pip cache purge \
&& rm -rf /tmp/*
#################### OPENAI API SERVER ####################
ENTRYPOINT ["python3", "-m", "vllm.entrypoints.openai.api_server"]
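For reference, the wheel URL in the `pip install` line can be expanded by hand to check which file is downloaded. A small sketch, assuming the default `ARG` values from the Dockerfile above; `WHEEL_URL` is a variable name introduced here only for illustration.

```shell
# Same defaults as the ARG lines in the Dockerfile.
VLLM_VERSION=0.4.2
VLLM_PYTHON_VERSION=310

# WHEEL_URL is a name introduced here for illustration only.
WHEEL_URL="https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu118-cp${VLLM_PYTHON_VERSION}-cp${VLLM_PYTHON_VERSION}-manylinux1_x86_64.whl"

# Print the fully expanded URL that pip would be asked to install.
echo "$WHEEL_URL"
```

With these defaults the script prints the CUDA 11.8 wheel for CPython 3.10, matching the `python3` that Ubuntu 22.04 installs.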
Your current environment
Hello, the Python wheel is installed according to your documentation: https://docs.vllm.ai/en/latest/getting_started/installation.html#install-with-pip