grimoire / amirstan_plugin

Useful TensorRT plugins for PyTorch and mmdetection model conversion.
MIT License
156 stars 38 forks

Build docker image with amirstan plugin on actions #15

Open mmeendez8 opened 3 years ago

mmeendez8 commented 3 years ago

Hi, this is somewhat related to #14.

I am trying to build an image using the build-push action for the NVIDIA Jetson architecture (linux/arm64/v8).

I have a very simple Dockerfile:

# syntax=docker/dockerfile:experimental

FROM nvcr.io/nvidia/deepstream-l4t:5.0.1-20.09-samples as build

WORKDIR / 

RUN --mount=type=cache,id=apt-build,target=/var/cache/apt \
    apt update && apt install -y \
        git \
        wget \
        cmake \
        g++ && \
    rm -rf /var/lib/apt/lists/*

RUN git clone --depth=1 --single-branch --branch patch-1 https://github.com/mmeendez8/amirstan_plugin.git && \
    cd /amirstan_plugin && \
    git submodule update --init --progress --depth=1 && \
    mkdir build && \
    cd build && \
    cmake .. -DWITH_DEEPSTREAM=true && \
    make -j10

This works perfectly when it is built on the Jetson... but it fails when using a Docker multi-arch build on GitHub Actions or on my local computer.

I have set up a simple pipeline that builds the image using platforms: linux/arm64/v8 (see here).

The error is related to TensorRT, whose headers are located in /usr/include/aarch64-linux-gnu but are not found by the compiler:

#11 38.05 -- Found TensorRT headers at TENSORRT_INCLUDE_DIR-NOTFOUND
#11 38.06 -- Find TensorRT libs at TENSORRT_LIBRARY_INFER-NOTFOUND;TENSORRT_LIBRARY_PARSERS-NOTFOUND;TENSORRT_LIBRARY_INFER_PLUGIN-NOTFOUND
#11 38.07 -- Could NOT find TENSORRT (missing: TENSORRT_INCLUDE_DIR TENSORRT_LIBRARY) 
#11 38.07 ERRORCannot find TensorRT library.

I have tried to compile while passing the path explicitly, but with no success: cmake .. -DWITH_DEEPSTREAM=true -DTENSORRT_DIR=/usr/include/aarch64-linux-gnu

Here is the complete log for the action: https://github.com/mmeendez8/amirstan_plugin/runs/2557972243?check_suite_focus=true

You can also test this on your local computer (if you have previously set up QEMU) using docker buildx with the following command: docker buildx build --platform linux/arm64 -f docker/Dockerfile -t amirstan_image_jetson:latest .
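For reference, a minimal sketch of the QEMU/buildx setup on an x86 host before running the command above (the multiarch/qemu-user-static image and the builder name are common defaults, not something this repo prescribes):

# register QEMU binfmt handlers so the x86 host can emulate aarch64 binaries
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

# create and select a buildx builder that can target other platforms
docker buildx create --name multiarch --use
docker buildx inspect --bootstrap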

grimoire commented 3 years ago

Sounds like a good way to add unit tests. Thanks, I will try this weekend.

mmeendez8 commented 3 years ago

Yes, that's exactly what I am trying to do on my repo. I will be happy to help if you need it!

mmeendez8 commented 3 years ago

Here is an updated Dockerfile that installs the desired CMake version, but the same failure appears:

# syntax=docker/dockerfile:experimental

FROM nvcr.io/nvidia/deepstream-l4t:5.0.1-20.09-samples as build

WORKDIR / 

RUN --mount=type=cache,id=apt-build,target=/var/cache/apt \
    apt update && apt install -y \
        build-essential \
        git \
        wget && \
    rm -rf /var/lib/apt/lists/*

# install CMake into /opt/cmake and expose it on PATH
RUN wget https://cmake.org/files/v3.20/cmake-3.20.2-linux-aarch64.sh && \
    mkdir /opt/cmake && \
    sh cmake-3.20.2-linux-aarch64.sh --skip-license --prefix=/opt/cmake && \
    ln -s /opt/cmake/bin/cmake /usr/local/bin/cmake

RUN git clone --depth=1 --single-branch --branch patch-1 https://github.com/mmeendez8/amirstan_plugin.git && \
    cd /amirstan_plugin && \
    git submodule update --init --progress --depth=1 && \
    mkdir build && \
    cd build && \
    cmake .. -DWITH_DEEPSTREAM=true -DTENSORRT_INCLUDE_DIR=/usr/include/aarch64-linux-gnu && \
    make -j10

grimoire commented 3 years ago

Hi, I cannot find any TensorRT-related headers or libs (NvInfer or libnvinfer) in nvcr.io/nvidia/deepstream-l4t:5.0.1-20.09-samples

mmeendez8 commented 3 years ago

Sorry for the late reply. I found this in the NVIDIA documentation about the Jetson Docker images:

The platform specific libraries and select device nodes for a particular device are mounted by the NVIDIA container runtime into the l4t-base container from the underlying host, thereby providing necessary dependencies for l4t applications to execute within the container. This approach enables the l4t-base container to be shared between various Jetson devices.

grimoire commented 3 years ago

Does that mean the container will use TensorRT and other dependent libraries from the host?

mmeendez8 commented 3 years ago

That is what I understand. Anyway, I am not able to build the image on the Jetson device either (without cross-compilation). I opened a thread in the DeepStream forum too, but no answer so far: https://forums.developer.nvidia.com/t/build-jetson-deepstream-on-x86/177935
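A quick way to confirm that on a Jetson host is to list the TensorRT libraries inside the container with and without the NVIDIA runtime; the library path below is just the usual L4T location, so treat it as an assumption:

# with the NVIDIA runtime, the host's TensorRT libraries are mounted into the container
docker run --rm --runtime=nvidia nvcr.io/nvidia/deepstream-l4t:5.0.1-20.09-samples ls /usr/lib/aarch64-linux-gnu/ | grep nvinfer

# with the default runtime, the same command should find nothing
docker run --rm nvcr.io/nvidia/deepstream-l4t:5.0.1-20.09-samples ls /usr/lib/aarch64-linux-gnu/ | grep nvinfer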

mmeendez8 commented 3 years ago

I just found a solution. I can't believe how much time I spent on this... but I finally stumbled upon this thread.

Summarizing: TensorRT is installed on the host system, so at run time you get it inside the container by specifying --runtime=nvidia on your command. But I needed those libraries from the host at build time. The only way to solve this is to edit the Docker daemon config file (/etc/docker/daemon.json) and add "default-runtime": "nvidia".
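For reference, a minimal sketch of what /etc/docker/daemon.json ends up looking like; the "runtimes" block is normally already present on a JetPack host after the NVIDIA container runtime is installed, so adding "default-runtime" is the only change, and the daemon has to be restarted afterwards (e.g. sudo systemctl restart docker):

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}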

That should work, but... I spent some more time figuring out why it was not working for me. It was related to Docker BuildKit, which is not able to use the NVIDIA runtime during the build stage. Once I disabled BuildKit, everything worked like a charm!
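For reference, a minimal sketch of the build invocation with BuildKit disabled (the tag matches the example command earlier in this thread; note that the # syntax=docker/dockerfile:experimental header and the RUN --mount=type=cache lines in the Dockerfiles above are BuildKit features, so they would have to be removed when falling back to the classic builder):

# force the classic builder instead of BuildKit so the nvidia default runtime is used at build time
DOCKER_BUILDKIT=0 docker build -f docker/Dockerfile -t amirstan_image_jetson:latest .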

I also found this repo along the way: https://github.com/dusty-nv/jetson-containers, which seems to have everything installed in the Docker images, so it might be possible to use it to build images with cross-compilation.