Hi @Petopp,
you are right, NVIDIA has changed their Docker containers. I am not 100% sure this matches their documentation; anyway, I have fixed it like this:
# Copyright (c) 2020-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
FROM docker.io/arm64v8/ubuntu:18.04
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -qq -y --no-install-recommends \
bc \
bzip2 \
language-pack-en-base \
python3-distutils && \
rm -rf /var/lib/apt/lists/* && apt-get clean
ARG JETPACK_VERSION_BASE="r32.4"
ARG JETPACK_VERSION="${JETPACK_VERSION_BASE}.3"
ARG BASE_IMAGE="nvcr.io/nvidia/l4t-base:${JETPACK_VERSION}"
ARG SOC="t186"
ADD --chown=root:root https://repo.download.nvidia.com/jetson/jetson-ota-public.asc /etc/apt/trusted.gpg.d/jetson-ota-public.asc
RUN chmod 644 /etc/apt/trusted.gpg.d/jetson-ota-public.asc \
&& apt-get update && apt-get install -y --no-install-recommends \
ca-certificates \
&& echo "deb https://repo.download.nvidia.com/jetson/common ${JETPACK_VERSION_BASE} main" > /etc/apt/sources.list.d/nvidia-l4t-apt-source.list \
&& echo "deb https://repo.download.nvidia.com/jetson/${SOC} ${JETPACK_VERSION_BASE} main" >> /etc/apt/sources.list.d/nvidia-l4t-apt-source.list \
&& cat /etc/apt/sources.list.d/nvidia-l4t-apt-source.list \
&& apt-get update \
&& rm -rf /var/lib/apt/lists/*
# the last two lines are just to test it works. Leaving the ca-certificates
# package in is intentional, since nvidia uses https sources and without that,
# apt will complain about "Certificate verification failed: The certificate is
# NOT trusted. The certificate issuer is unknown. Could not handshake: Error
# in the certificate verification. [IP: 23.221.236.160 443]"
# You will probably want to update ca-certificates in each apt stanza in each
# derived image since certificates can be revoked periodically and that package
# should always be up to date.
ARG CUDA=invalid
ENV CUDA=${CUDA}
ENV PATH /usr/local/cuda-$CUDA/bin:/usr/local/cuda/bin:${PATH}
ENV LD_LIBRARY_PATH /usr/local/cuda-$CUDA/targets/aarch64-linux/lib:${LD_LIBRARY_PATH}
RUN ldconfig
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES all
CMD ["/bin/bash"]
You may have noticed that I build my own base container in the Dockerfile above. This is because I do not need any of the graphical UI libraries and could therefore slim down my container a lot. The parts you will most likely need for your use case are:
ARG JETPACK_VERSION_BASE="r32.4"
ARG JETPACK_VERSION="${JETPACK_VERSION_BASE}.3"
ARG BASE_IMAGE="nvcr.io/nvidia/l4t-base:${JETPACK_VERSION}"
ARG SOC="t186"
ADD --chown=root:root https://repo.download.nvidia.com/jetson/jetson-ota-public.asc /etc/apt/trusted.gpg.d/jetson-ota-public.asc
RUN chmod 644 /etc/apt/trusted.gpg.d/jetson-ota-public.asc \
&& apt-get update && apt-get install -y --no-install-recommends \
ca-certificates \
&& echo "deb https://repo.download.nvidia.com/jetson/common ${JETPACK_VERSION_BASE} main" > /etc/apt/sources.list.d/nvidia-l4t-apt-source.list \
&& echo "deb https://repo.download.nvidia.com/jetson/${SOC} ${JETPACK_VERSION_BASE} main" >> /etc/apt/sources.list.d/nvidia-l4t-apt-source.list \
&& cat /etc/apt/sources.list.d/nvidia-l4t-apt-source.list \
&& apt-get update \
&& rm -rf /var/lib/apt/lists/*
When it comes to building things in the container, be warned that the emulation slows things down a lot. You may also want to use multistage Docker builds to bring the image size down.
ARG O3R_BASE="o3r-l4t-base:r32.4"
FROM nexus03.dev.ifm:18443/${O3R_BASE} as base
FROM base as builder
# Install essential build packages
RUN apt-get update \
&& apt-get -y upgrade \
&& apt-get install -y -qq --no-install-recommends \
ca-certificates \
build-essential \
libnvinfer-samples \
cuda-libraries-dev-${CUDA} \
cuda-cudart-dev-${CUDA} \
cuda-compiler-${CUDA} \
&& rm -rf /var/lib/apt/lists/*
# Build the TensorRT examples
RUN cd /usr/src/tensorrt/samples && CUDA_INSTALL_DIR=/usr/local/cuda-${CUDA}/targets/aarch64-linux \
TRT_LIB_DIR=/usr/local/cuda-${CUDA}/targets/aarch64-linux/lib \
CUDNN_INSTALL_DIR=/usr/local/cuda-${CUDA}/targets/aarch64-linux \
TARGET=aarch64 \
make BUILD_TYPE="release"
FROM base AS deploy
# The small runtime image to be used
# We do copy the files to our own destination due to the fact the NVIDIA docker runtime does
# overwrite /usr/src/tensorrt in the container with the O3R directory
COPY --from=builder /usr/src/tensorrt/bin/sample_algorithm_selector /opt/ifm/tensorrt/bin/
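On the target device, the resulting image can then be started with the NVIDIA container runtime so that the L4T libraries get mounted in. A sketch, with a hypothetical image tag:
docker run --rm --runtime nvidia my-tensorrt-samples /opt/ifm/tensorrt/bin/sample_algorithm_selector
Depending on the sample, you may still need to point it at its data files via its command-line options.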
A general remark
RUN apt-get update
RUN apt-get install -y python-pip python3-pip
RUN apt-get install -y build-essential python-dev
RUN python3 -m pip install --upgrade pip
RUN apt-get install -y libopenblas-base libopenmpi-dev libomp-dev
Splitting commands across multiple RUN instructions like this is not optimal in several regards. The first is that each RUN creates a new layer, and any data produced in that layer sticks to the image, which can increase your image size significantly. This is why I put everything into one RUN:
RUN apt-get update \
&& apt-get -y upgrade \
&& apt-get install -y -qq --no-install-recommends \
ca-certificates \
build-essential \
libnvinfer-samples \
cuda-libraries-dev-${CUDA} \
cuda-cudart-dev-${CUDA} \
cuda-compiler-${CUDA} \
&& rm -rf /var/lib/apt/lists/*
Having the command
apt-get update \
&& apt-get -y upgrade \
in one go also helps you avoid caching issues. Docker caches layers, so if you leave your image untouched for a couple of days and then change the packages you want to install, the Debian repository may have been updated in the meantime and you will end up with fetch errors, because your locally cached apt index no longer matches the remote one.
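If you do end up with such fetch errors from a stale cached layer, a rebuild without the layer cache gets you back to a clean state, for example:
docker build --no-cache -t o3r-l4t-base:r32.4 .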
Hello graugans,
Thanks for the help and feedback, I am just testing this at the moment. I am a little bit confused right now, the installation of ifm3dpy now always fails with this message:
ERROR: Could not find a version that satisfies the requirement imf3dpy (from versions: none).
Even with the version from yesterday (see above) this now happens. Do you have any idea what this could be?
There is a typo:
imf3dpy -> ifm3dpy
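For reference, the corrected install command would then be (assuming the package is pulled straight from PyPI):
python3 -m pip install ifm3dpy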
Okay, thanks... time for a holiday ;-) I'll test the other one and let you know, hopefully without mistyping this time.
Thanks for the help, that solved my problems so far! But I still have to optimise the storage space; at the moment I am exceeding the limits of the system.
Thanks a lot for your help
Multistage to the rescue :-)
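A minimal sketch of how a multistage build could keep pip build dependencies out of the final image, assuming a requirements.txt as mentioned below; the base image reference and stage names are illustrative:
ARG O3R_BASE="o3r-l4t-base:r32.4"
FROM ${O3R_BASE} as base
FROM base as builder
# build tooling only lives in the builder stage
RUN apt-get update \
    && apt-get install -y -qq --no-install-recommends \
    build-essential \
    python3-pip \
    python3-dev \
    && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
# --user places the packages under /root/.local, which is easy to copy as one tree
RUN python3 -m pip install --user -r requirements.txt
FROM base as deploy
COPY --from=builder /root/.local /root/.local
ENV PATH=/root/.local/bin:${PATH}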
Hello, with the l4t (Docker) version the cublas_v2.h file is not found when compiling e.g. darknet (from git or via pip yolo34py-gpu). I could not fix this so far; NVIDIA's suggested solution seems to be to use a newer l4t base version, but that in turn is incompatible with ifm3dpy, for example. Do you have a solution for this?
Here is what my Dockerfile looks like:
requirements.txt: