aws / deep-learning-containers

AWS Deep Learning Containers are pre-built Docker images that make it easier to run popular deep learning frameworks and tools on AWS.
https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/what-is-dlc.html

[feature] Python 3.12 support for inf1 image with torch 2.x #4345

Open rantoniuk opened 1 day ago

rantoniuk commented 1 day ago

Do you plan to add Python 3.12 support to the list of available images? I am specifically interested in this inference image for the inf1 instance type:

public.ecr.aws/neuron/pytorch-inference-neuron:1.13.1-neuron-py310-sdk2.20.0-ubuntu20.04
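For readers unfamiliar with the tag, the fields that matter for this request are the framework version (1.13.1), the Python version (py310), and the Neuron SDK version (sdk2.20.0). Below is a minimal sketch that splits such a tag into its parts. The field order is inferred only from the single tag quoted above, and `ImageTag` and `parse_image_tag` are hypothetical names, not part of any AWS tooling.

```python
import re
from typing import NamedTuple


class ImageTag(NamedTuple):
    framework_version: str
    device: str
    python: str
    sdk_version: str
    os: str


# Layout inferred from the one tag quoted in this issue; other tags may differ.
_TAG_RE = re.compile(
    r"^(?P<framework_version>\d+(?:\.\d+)*)"
    r"-(?P<device>[a-z]+)"
    r"-py(?P<py>\d+)"
    r"-sdk(?P<sdk_version>\d+(?:\.\d+)*)"
    r"-(?P<os>[a-z]+[\d.]+)$"
)


def parse_image_tag(image: str) -> ImageTag:
    """Split 'registry/repo:tag' (or a bare tag) into its version fields."""
    tag = image.rsplit(":", 1)[-1]
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognised tag layout: {tag!r}")
    py = match["py"]
    # 'py310' means Python 3.10: first digit is the major version.
    python = f"{py[0]}.{py[1:]}"
    return ImageTag(
        match["framework_version"],
        match["device"],
        python,
        match["sdk_version"],
        match["os"],
    )


if __name__ == "__main__":
    print(parse_image_tag(
        "public.ecr.aws/neuron/pytorch-inference-neuron:"
        "1.13.1-neuron-py310-sdk2.20.0-ubuntu20.04"
    ))
```

Under the same assumed layout, an image built for Python 3.12 would carry `py312` in that position.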

rantoniuk commented 5 hours ago

I've spent some time working out and updating the dependencies. Here is what I have so far:

Dockerfile.neuron:

FROM public.ecr.aws/docker/library/ubuntu:20.04

LABEL dlc_major_version="1"
LABEL maintainer="Amazon AI"
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

# Neuron SDK components version numbers
ARG NEURON_FRAMEWORK_VERSION=1.13.1.2.11.7.0
ARG NEURON_CC_VERSION=1.24.0.0
ARG NEURONX_TOOLS_VERSION=2.19.0.0

ARG PYTHON=python3.12
ARG PYTHON_VERSION=3.12.6
ARG TORCHSERVE_VERSION=0.11.0
ARG SM_TOOLKIT_VERSION=2.0.21
ARG MAMBA_VERSION=24.7.1-2

# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
ENV LD_LIBRARY_PATH=/lib/x86_64-linux-gnu:/opt/conda/lib/:$LD_LIBRARY_PATH
ENV PATH=/opt/conda/bin:/opt/aws/neuron/bin:$PATH
ENV SAGEMAKER_SERVING_MODULE=sagemaker_pytorch_serving_container.serving:main
ENV TEMP=/home/model-server/tmp

RUN apt-get update \
 && apt-get upgrade -y \
 && apt-get install -y --no-install-recommends software-properties-common \
 && add-apt-repository ppa:openjdk-r/ppa \
 && apt-get update \
 && apt-get install -y --no-install-recommends \
    build-essential \
    apt-transport-https \
    ca-certificates \
    cmake \
    curl \
    emacs \
    git \
    jq \
    libgl1-mesa-glx \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender-dev \
    openjdk-11-jdk \
    vim \
    wget \
    unzip \
    zlib1g-dev \
    libcap-dev \
    gpg-agent \
 && rm -rf /var/lib/apt/lists/* \
 && rm -rf /tmp/tmp* \
 && apt-get clean

RUN echo "deb https://apt.repos.neuron.amazonaws.com focal main" > /etc/apt/sources.list.d/neuron.list
RUN wget -qO - https://apt.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB | apt-key add -

RUN apt-get update \
 && apt-get install -y aws-neuronx-tools=$NEURONX_TOOLS_VERSION \
 && rm -rf /var/lib/apt/lists/* \
 && rm -rf /tmp/tmp* \
 && apt-get clean

# https://github.com/docker-library/openjdk/issues/261 https://github.com/docker-library/openjdk/pull/263/files
RUN keytool -importkeystore -srckeystore /etc/ssl/certs/java/cacerts -destkeystore /etc/ssl/certs/java/cacerts.jks -deststoretype JKS -srcstorepass changeit -deststorepass changeit -noprompt; \
    mv /etc/ssl/certs/java/cacerts.jks /etc/ssl/certs/java/cacerts; \
    /var/lib/dpkg/info/ca-certificates-java.postinst configure;

RUN curl -L -o ~/mambaforge.sh https://github.com/conda-forge/miniforge/releases/download/${MAMBA_VERSION}/Mambaforge-${MAMBA_VERSION}-Linux-x86_64.sh \
 && chmod +x ~/mambaforge.sh \
 && ~/mambaforge.sh -b -p /opt/conda \
 && rm ~/mambaforge.sh \
 && /opt/conda/bin/conda update -y conda \
 && /opt/conda/bin/conda install -c conda-forge -y \
    python=$PYTHON_VERSION \
    pyopenssl \
    cython \
    mkl-include \
    mkl \
    parso \
    typing \
    # Below 2 are included in miniconda base, but not mamba so need to install
    conda-content-trust \
    charset-normalizer \
 && /opt/conda/bin/conda clean -ya

RUN conda install -c conda-forge -y \
    scikit-learn \
    h5py \
    requests \
 && conda clean -ya \
 && pip install --upgrade pip --trusted-host pypi.org --trusted-host files.pythonhosted.org \
 && ln -s /opt/conda/bin/pip /usr/local/bin/pip3 \
 && pip install packaging \
    enum-compat \
    ipython

RUN pip install --no-cache-dir -U \
    opencv-python \
    "numpy>=1.12.6,<1.27.0" \
    "scipy>=1.11.0" \
    six \
    "pillow>=10.0.1" \
    "awscli<2" \
    "pandas==1.*" \
    boto3 \
    cryptography

RUN pip install neuron-cc --extra-index-url https://pip.repos.neuron.amazonaws.com \
    torch-neuron \
 && pip install -U protobuf \
    torchserve==${TORCHSERVE_VERSION} \
    torch-model-archiver==${TORCHSERVE_VERSION} \
 && pip install --no-deps --no-cache-dir -U torchvision

RUN useradd -m model-server \
 && mkdir -p /home/model-server/tmp /opt/ml/model \
 && chown -R model-server /home/model-server /opt/ml/model

COPY neuron-entrypoint.py /usr/local/bin/dockerd-entrypoint.py
COPY neuron-monitor.sh /usr/local/bin/neuron-monitor.sh
COPY torchserve-neuron.sh /usr/local/bin/entrypoint.sh
COPY config.properties /home/model-server

RUN chmod +x /usr/local/bin/dockerd-entrypoint.py \
 && chmod +x /usr/local/bin/neuron-monitor.sh \
 && chmod +x /usr/local/bin/entrypoint.sh

ADD https://raw.githubusercontent.com/aws/deep-learning-containers/master/src/deep_learning_container.py /usr/local/bin/deep_learning_container.py

RUN chmod +x /usr/local/bin/deep_learning_container.py

RUN pip install --no-cache-dir "sagemaker-pytorch-inference"

RUN HOME_DIR=/root \
 && curl -o ${HOME_DIR}/oss_compliance.zip https://aws-dlinfra-utilities.s3.amazonaws.com/oss_compliance.zip \
 && unzip ${HOME_DIR}/oss_compliance.zip -d ${HOME_DIR}/ \
 && cp ${HOME_DIR}/oss_compliance/test/testOSSCompliance /usr/local/bin/testOSSCompliance \
 && chmod +x /usr/local/bin/testOSSCompliance \
 && chmod +x ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh \
 && ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh ${HOME_DIR} ${PYTHON} \
 && rm -rf ${HOME_DIR}/oss_compliance* \
 # conda leaves an empty /root/.cache/conda/notices.cache file which is not removed by conda clean -ya
 && rm -rf ${HOME_DIR}/.cache/conda

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/pytorch-1.13/license.txt -o /license.txt

EXPOSE 8080 8081

ENTRYPOINT ["python", "/usr/local/bin/dockerd-entrypoint.py"]
CMD ["/usr/local/bin/entrypoint.sh"]

root@099ee3c2db0a:/# pip list
Package                     Version
--------------------------- -----------
archspec                    0.2.3
asttokens                   2.4.1
awscli                      1.35.9
boltons                     24.0.0
boto3                       1.35.43
botocore                    1.35.43
Brotli                      1.1.0
cached-property             1.5.2
certifi                     2024.8.30
cffi                        1.17.1
charset-normalizer          3.4.0
colorama                    0.4.6
conda                       24.9.2
conda-content-trust         0.2.0
conda-libmamba-solver       24.7.0
conda-package-handling      2.3.0
conda_package_streaming     0.10.0
cryptography                43.0.1
Cython                      3.0.11
decorator                   5.1.1
distro                      1.9.0
docutils                    0.16
enum-compat                 0.0.3
executing                   2.1.0
frozendict                  2.4.4
h2                          4.1.0
h5py                        3.12.1
hpack                       4.0.0
hyperframe                  6.0.1
idna                        3.10
ipython                     8.28.0
jedi                        0.19.1
jmespath                    1.0.1
joblib                      1.4.2
jsonpatch                   1.33
jsonpointer                 3.0.0
libmambapy                  1.5.9
mamba                       1.5.9
matplotlib-inline           0.1.7
menuinst                    2.1.2
neuron-cc                   1.0.post1
numpy                       1.26.4
opencv-python               4.10.0.84
packaging                   24.1
pandas                      1.5.3
parso                       0.8.4
pexpect                     4.9.0
pillow                      11.0.0
pip                         24.2
platformdirs                4.3.6
pluggy                      1.5.0
prompt_toolkit              3.0.48
protobuf                    5.28.2
psutil                      6.1.0
ptyprocess                  0.7.0
pure_eval                   0.2.3
pyasn1                      0.6.1
pycosat                     0.6.6
pycparser                   2.22
Pygments                    2.18.0
pyOpenSSL                   24.2.1
PySocks                     1.7.1
python-dateutil             2.9.0.post0
pytz                        2024.2
PyYAML                      6.0.2
requests                    2.32.3
retrying                    1.3.4
rsa                         4.7.2
ruamel.yaml                 0.18.6
ruamel.yaml.clib            0.2.8
s3transfer                  0.10.3
sagemaker_pytorch_inference 2.0.25
scikit-learn                1.5.2
scipy                       1.14.1
setuptools                  74.1.2
six                         1.16.0
stack-data                  0.6.3
threadpoolctl               3.5.0
torch-model-archiver        0.11.0
torch-neuron                1.0.1522.0
torch-neuron-base           1.0
torchserve                  0.11.0
torchvision                 0.20.0
tqdm                        4.66.5
traitlets                   5.14.3
truststore                  0.9.2
urllib3                     2.2.3
wcwidth                     0.2.13
wheel                       0.44.0
zstandard                   0.23.0
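
One thing worth checking in a listing like this is whether every framework package the image is meant to serve is actually present. The listing above has no `torch` entry, while `torchvision` was installed with `--no-deps` in the Dockerfile. The sketch below parses `pip list` output and reports which expected packages are absent. The function names and the list of expected packages are illustrative assumptions, and the sample text is a subset of the rows shown above.

```python
from typing import Dict, Iterable, List


def _normalize(name: str) -> str:
    # pip treats '-' and '_' as equivalent and names as case-insensitive.
    return name.lower().replace("_", "-")


def parse_pip_list(text: str) -> Dict[str, str]:
    """Map normalized package name -> version from `pip list` column output."""
    packages: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the header row, the dashed separator, shell prompts, blank lines.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        packages[_normalize(parts[0])] = parts[1]
    return packages


def missing_packages(installed: Dict[str, str], expected: Iterable[str]) -> List[str]:
    """Return the expected package names that do not appear in `installed`."""
    return [name for name in expected if _normalize(name) not in installed]


# Subset of the rows from the listing above.
SAMPLE = """\
Package                     Version
--------------------------- -----------
neuron-cc                   1.0.post1
numpy                       1.26.4
torch-neuron                1.0.1522.0
torch-neuron-base           1.0
torchserve                  0.11.0
torchvision                 0.20.0
"""

if __name__ == "__main__":
    installed = parse_pip_list(SAMPLE)
    # Hypothetical expectation list for a PyTorch inference image.
    print(missing_packages(installed, ["torch", "torch-neuron", "torchvision"]))
```

Inside the container, the same check could be fed the real output, for example by piping `pip list` into a script that reads standard input.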