NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Cuda 11.4 Dockerfile aarch64 packages not found #2257

Closed frankvp11 closed 4 months ago

frankvp11 commented 2 years ago

Description

When I try to build the Ubuntu 20.04 aarch64 Dockerfile (the CUDA 11.4 one), it fails with the following errors:

E: Version '8.4.2-1+cuda11.4' for 'libnvinfer8' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvonnxparsers8' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvparsers8' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvinfer-plugin8' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvinfer-dev' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvonnxparsers-dev' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvparsers-dev' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvinfer-plugin-dev' was not found
E: Version '8.4.2-1+cuda11.4' for 'python3-libnvinfer' was not found

Environment

TensorRT Version: N/A
NVIDIA GPU: Jetson TX2
NVIDIA Driver Version: unknown
CUDA Version: N/A
CUDNN Version: N/A
Operating System: N/A
Python Version (if applicable): N/A
TensorFlow Version (if applicable): N/A
PyTorch Version (if applicable): N/A
Baremetal or Container (if so, version): Container - ubuntu20.04-aarch64

Relevant Files

The dockerfile(s)

Steps To Reproduce

sudo ./docker/build.sh --file docker/ubuntu-20.04-aarch64.Dockerfile --tag tensorrt-aarch64-ubuntu20.04-cuda11.4

Total output:

Building container:

docker build -f docker/ubuntu-20.04-aarch64.Dockerfile --build-arg CUDA_VERSION=11.6.2 --build-arg uid=0 --build-arg gid=0 --tag=tensorrt-aarch64-ubuntu20.04-cuda11.4 .
Sending build context to Docker daemon 155.4MB
Step 1/31 : FROM nvidia/cuda:11.4.2-devel-ubuntu20.04
 ---> c42079ab752d
Step 2/31 : ENV TRT_VERSION 8.4.2.4
 ---> Using cache ---> dddbebc9900c
Step 3/31 : SHELL ["/bin/bash", "-c"]
 ---> Using cache ---> fb646650b075
Step 4/31 : ARG uid=1000
 ---> Using cache ---> c082a510bd4e
Step 5/31 : ARG gid=1000
 ---> Using cache ---> a058c6139c12
Step 6/31 : RUN groupadd -r -f -g ${gid} trtuser && useradd -o -r -l -u ${uid} -g ${gid} -ms /bin/bash trtuser
 ---> Using cache ---> a957ad5134da
Step 7/31 : RUN usermod -aG sudo trtuser
 ---> Using cache ---> 23a68127f249
Step 8/31 : RUN echo 'trtuser:nvidia' | chpasswd
 ---> Using cache ---> 3706d4abd2af
Step 9/31 : RUN mkdir -p /workspace && chown trtuser /workspace
 ---> Using cache ---> 9f8a86c2f72d
Step 10/31 : ENV DEBIAN_FRONTEND=noninteractive
 ---> Using cache ---> 9e0798ae3813
Step 11/31 : RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/3bf863cc.pub
 ---> Using cache ---> 80eeb24939f2
Step 12/31 : RUN apt-get update && apt-get install -y software-properties-common
 ---> Using cache ---> 79da7f8aeb4b
Step 13/31 : RUN add-apt-repository ppa:ubuntu-toolchain-r/test
 ---> Using cache ---> 3ff57a5f0998
Step 14/31 : RUN apt-get update && apt-get install -y --no-install-recommends libcurl4-openssl-dev wget git pkg-config sudo ssh libssl-dev pbzip2 pv bzip2 unzip devscripts lintian fakeroot dh-make build-essential
 ---> Using cache ---> 421f5c6d54ae
Step 15/31 : RUN apt-get install -y --no-install-recommends python3 python3-pip python3-dev python3-wheel && cd /usr/local/bin && ln -s /usr/bin/python3 python && ln -s /usr/bin/pip3 pip;
 ---> Using cache ---> 8a4f9fcdc48d
Step 16/31 : RUN v="${TRT_VERSION%.*}-1+cuda${CUDA_VERSION%.*}" && apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/3bf863cc.pub && apt-get update && sudo apt-get -y install libnvinfer8=${v} libnvonnxparsers8=${v} libnvparsers8=${v} libnvinfer-plugin8=${v} libnvinfer-dev=${v} libnvonnxparsers-dev=${v} libnvparsers-dev=${v} libnvinfer-plugin-dev=${v} python3-libnvinfer=${v};
 ---> Running in 8b2e322cb452
Warning: apt-key output should not be parsed (stdout is not a terminal)
Executing: /tmp/apt-key-gpghome.uHasvP3MDq/gpg.1.sh --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/3bf863cc.pub
gpg: requesting key from 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/3bf863cc.pub'
gpg: key A4B469963BF863CC: "cudatools cudatools@nvidia.com" not changed
gpg: Total number processed: 1
gpg: unchanged: 1
Hit:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa InRelease
Hit:2 http://ports.ubuntu.com/ubuntu-ports focal InRelease
Hit:3 http://ppa.launchpad.net/ubuntu-toolchain-r/test/ubuntu focal InRelease
Get:4 http://ports.ubuntu.com/ubuntu-ports focal-updates InRelease [114 kB]
Hit:5 http://ports.ubuntu.com/ubuntu-ports focal-backports InRelease
Hit:6 http://ports.ubuntu.com/ubuntu-ports focal-security InRelease
Fetched 114 kB in 2s (46.0 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
E: Version '8.4.2-1+cuda11.4' for 'libnvinfer8' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvonnxparsers8' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvparsers8' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvinfer-plugin8' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvinfer-dev' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvonnxparsers-dev' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvparsers-dev' was not found
E: Version '8.4.2-1+cuda11.4' for 'libnvinfer-plugin-dev' was not found
E: Version '8.4.2-1+cuda11.4' for 'python3-libnvinfer' was not found
The command '/bin/bash -c v="${TRT_VERSION%.*}-1+cuda${CUDA_VERSION%.*}" && apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/3bf863cc.pub && apt-get update && sudo apt-get -y install libnvinfer8=${v} libnvonnxparsers8=${v} libnvparsers8=${v} libnvinfer-plugin8=${v} libnvinfer-dev=${v} libnvonnxparsers-dev=${v} libnvparsers-dev=${v} libnvinfer-plugin-dev=${v} python3-libnvinfer=${v};' returned a non-zero code: 100
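The root cause appears to be that this TensorRT/CUDA pairing (8.4.2-1+cuda11.4) is not published in the Ubuntu 20.04 sbsa repository the Dockerfile points at. One way to check which versions that repository actually offers (a diagnostic sketch, not part of the original report) is:

# On any aarch64 Ubuntu 20.04 system with the sbsa CUDA repo configured, list every
# libnvinfer8 version apt can see, then pick a TRT_VERSION/CUDA_VERSION pair that appears.
apt-get update
apt-cache madison libnvinfer8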

frankvp11 commented 2 years ago

Sorry about changing the issue so many times - I was indecisive and had changed things from the defaults; after changing them all back I got the error above. Also, why is it showing up with "Misplaced &"? That is not in my original error - is that a bug in the Dockerfile? I've tried a multitude of the Dockerfiles and they mostly give that error or something similar. I also tried going back a release and still got the error - could it be an issue on my part?

frankvp11 commented 2 years ago

Update: I ended up getting the Dockerfile working on the latest commit of TensorRT. However, upon entering the container and trying to run /usr/src/tensorrt/bin/trtexec, I get the following error:

/usr/src/tensorrt/bin/trtexec: error while loading shared libraries: libcudart.so.10.2: cannot open shared object file: No such file or directory

Upon running sudo find . -iname libcudart.so.10.2 I get the following output:

./usr/local/cuda-10.2/targets/aarch64-linux/lib/libcudart.so.10.2

Also, gentle ping: @kevinch-nv

frankvp11 commented 2 years ago

Things I have tried:

- adding the directory mentioned above to PATH - failed (see the library-path sketch below)
- trying different Dockerfiles - failed
- reinstalling a different version of PyTorch - didn't do anything
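For reference, the dynamic loader resolves shared libraries through LD_LIBRARY_PATH (or ldconfig), not PATH, so a minimal, untested sketch of that first workaround would be:

# Hypothetical workaround: point the loader at the CUDA 10.2 runtime found above.
# Note that mixing a CUDA 10.2 libcudart with a container built for CUDA 11.x may still fail.
export LD_LIBRARY_PATH=/usr/local/cuda-10.2/targets/aarch64-linux/lib:${LD_LIBRARY_PATH}
/usr/src/tensorrt/bin/trtexec --help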

frankvp11 commented 2 years ago

To give you more information: I'd like this to work so that I can build an engine and then run it. I've already got the model weights and everything else I need; I just need the TensorRT OSS components to be upgraded.

zerollzeng commented 2 years ago

@kevinch-nv ^ ^

frankvp11 commented 2 years ago

Any updates??? @kevinch-nv

kevinch-nv commented 2 years ago

Looking into it - it seems like all we have to do is update the base image here https://github.com/NVIDIA/TensorRT/blob/main/docker/ubuntu-20.04-aarch64.Dockerfile#L19 to 11.6.2. I'm verifying it on my end - can you try to update this locally?
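For illustration, that local edit would look something like the following (the exact line number may differ between releases):

# docker/ubuntu-20.04-aarch64.Dockerfile - bump the base image from CUDA 11.4.2 to 11.6.2
FROM nvidia/cuda:11.6.2-devel-ubuntu20.04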

As an aside - I see you've filed this against a Jetson TX2. For Jetson platforms you should be following the cross-compilation instructions (https://github.com/NVIDIA/TensorRT#setting-up-the-build-environment), so you should be using the ubuntu-cross-aarch64.Dockerfile.
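A hypothetical invocation of that cross-compilation image build, reusing the build.sh pattern from above (the tag name here is only illustrative; check the repository README for the exact steps for your JetPack release):

./docker/build.sh --file docker/ubuntu-cross-aarch64.Dockerfile --tag tensorrt-ubuntu20.04-cross-aarch64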

frankvp11 commented 2 years ago

Yeah - thanks for the help. I ended up getting it to work. It turns out I didn't need TensorRT 8.4.x; all I had to do was follow the instructions from https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps/tree/master/TRT-OSS/Jetson and that worked for me. As for the Dockerfile issue, I could try it later after doing some more testing with my current solution.

frankvp11 commented 2 years ago

Also, the problem with the cross-aarch64 Dockerfile for me was that it would error out when copying the JetPack files into pdk_files, because it couldn't find the Jetson files.

ZengyuanYu commented 2 years ago

> Looking into it - it seems like all we have to do is update the base image here https://github.com/NVIDIA/TensorRT/blob/main/docker/ubuntu-20.04-aarch64.Dockerfile#L19 to 11.6.2. I'm verifying it on my end - can you try to update this locally?
>
> As an aside - I see you've filed this against a Jetson TX2. For Jetson platforms you should be following the cross-compilation instructions (https://github.com/NVIDIA/TensorRT#setting-up-the-build-environment), so you should be using the ubuntu-cross-aarch64.Dockerfile.

I don't understand why cross-compilation is required. TX2/Xavier/Orin run a small Ubuntu install natively, so can't the ubuntu-20.04-aarch64.Dockerfile be used to build the Docker image directly on the device?

frankvp11 commented 2 years ago

@kevinch-nv did you end up checking it on your end, and did it work?

andronat commented 1 year ago

JFYI, I'm facing the same issue. Replacing the CUDA version with 11.6.2 makes the container build fail with:

Building container:
> docker build -f docker/ubuntu-20.04-aarch64.Dockerfile --build-arg CUDA_VERSION=11.6.2 --build-arg uid=502 --build-arg gid=80 --tag=tensorrt-aarch64-ubuntu20.04-cuda11.4 .
[+] Building 426.8s (20/25)                                                                                                                                                                         
 => [internal] load build definition from ubuntu-20.04-aarch64.Dockerfile                                                                                                                      0.0s
 => => transferring dockerfile: 3.77kB                                                                                                                                                         0.0s
 => [internal] load .dockerignore                                                                                                                                                              0.0s
 => => transferring context: 34B                                                                                                                                                               0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.6.2-devel-ubuntu20.04                                                                                                                1.2s
 => [ 1/21] FROM docker.io/nvidia/cuda:11.6.2-devel-ubuntu20.04@sha256:f9785e8241290c2947fe89eb2b8540e7777337a01c5b6340533e0b6599cdfd83                                                      221.0s
 => => resolve docker.io/nvidia/cuda:11.6.2-devel-ubuntu20.04@sha256:f9785e8241290c2947fe89eb2b8540e7777337a01c5b6340533e0b6599cdfd83                                                          0.0s
 => => sha256:ec0b93578a6e38f0044abb6bc46a5a8ee93c5dc14323d5fcf35b272417c1e2d4 186B / 186B                                                                                                     0.6s
 => => sha256:f9785e8241290c2947fe89eb2b8540e7777337a01c5b6340533e0b6599cdfd83 743B / 743B                                                                                                     0.0s
 => => sha256:3101385b820f9f55ca61e5d2aaf4e912e3c6d5ea7f967e5272a210190ac547c4 2.21kB / 2.21kB                                                                                                 0.0s
 => => sha256:f5e423043b47f33682403b04d1400171e9f7e8d384dd0d4169e3690e2b6d61e6 361.71kB / 361.71kB                                                                                             0.4s
 => => sha256:9872d9aa75f7a2d051a4d3339dcb1af3616d6484f31577983d81f051668f1fb6 13.45kB / 13.45kB                                                                                               0.0s
 => => sha256:775bcf4925a33701c1dd9b7bf6ef598a26360e1ced1479fb49cfeb70990915cf 7.76MB / 7.76MB                                                                                                 0.9s
 => => sha256:8941157b58ada869bd12299ba8a56cdc5317923a5ce7df8158c5a3b44ff2fb67 6.43kB / 6.43kB                                                                                                 0.8s
 => => sha256:ff82155d267bda15b08af95fb76ae972b1c837c3c1c703975367e4cc586e2f9e 1.12GB / 1.12GB                                                                                               177.8s
 => => sha256:b8e91a073034a8ae8c9ab24f8ad39d82b39c0570d10f555d817c3d50c984f4e2 61.66kB / 61.66kB                                                                                               1.1s
 => => extracting sha256:775bcf4925a33701c1dd9b7bf6ef598a26360e1ced1479fb49cfeb70990915cf                                                                                                      0.2s
 => => sha256:b6e9be7fe06f43c048223a8ae71c3f5664c35ddd11aa686c0173becae8fc15b1 1.41GB / 1.41GB                                                                                               193.1s
 => => sha256:5087632c6bbd71828eef5d6e1b036264c7278d5f887eecaa6e25560f067e88f9 83.98kB / 83.98kB                                                                                               1.3s
 => => extracting sha256:f5e423043b47f33682403b04d1400171e9f7e8d384dd0d4169e3690e2b6d61e6                                                                                                      0.0s
 => => extracting sha256:ec0b93578a6e38f0044abb6bc46a5a8ee93c5dc14323d5fcf35b272417c1e2d4                                                                                                      0.0s
 => => extracting sha256:8941157b58ada869bd12299ba8a56cdc5317923a5ce7df8158c5a3b44ff2fb67                                                                                                      0.0s
 => => extracting sha256:ff82155d267bda15b08af95fb76ae972b1c837c3c1c703975367e4cc586e2f9e                                                                                                     18.9s
 => => extracting sha256:b8e91a073034a8ae8c9ab24f8ad39d82b39c0570d10f555d817c3d50c984f4e2                                                                                                      0.0s
 => => extracting sha256:b6e9be7fe06f43c048223a8ae71c3f5664c35ddd11aa686c0173becae8fc15b1                                                                                                     23.9s
 => => extracting sha256:5087632c6bbd71828eef5d6e1b036264c7278d5f887eecaa6e25560f067e88f9                                                                                                      0.0s
 => [internal] load build context                                                                                                                                                              0.0s
 => => transferring context: 38B                                                                                                                                                               0.0s
 => [ 2/21] RUN groupadd -r -f -g 80 trtuser && useradd -o -r -l -u 502 -g 80 -ms /bin/bash trtuser                                                                                            2.5s
 => [ 3/21] RUN usermod -aG sudo trtuser                                                                                                                                                       0.2s
 => [ 4/21] RUN echo 'trtuser:nvidia' | chpasswd                                                                                                                                               0.2s
 => [ 5/21] RUN mkdir -p /workspace && chown trtuser /workspace                                                                                                                                0.3s
 => [ 6/21] RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/3bf863cc.pub                                                                 0.5s
 => [ 7/21] RUN apt-get update && apt-get install -y software-properties-common                                                                                                               14.1s 
 => [ 8/21] RUN add-apt-repository ppa:ubuntu-toolchain-r/test                                                                                                                                 2.6s 
 => [ 9/21] RUN apt-get update && apt-get install -y --no-install-recommends     libcurl4-openssl-dev     wget     git     pkg-config     sudo     ssh     libssl-dev     pbzip2     pv       10.7s 
 => [10/21] RUN apt-get install -y --no-install-recommends       python3       python3-pip       python3-dev       python3-wheel &&    cd /usr/local/bin &&    ln -s /usr/bin/python3 python   3.1s 
 => [11/21] RUN v="${TRT_VERSION%.*}-1+cuda${CUDA_VERSION%.*}" &&    apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/3bf863cc.pub &&    a  156.7s 
 => [12/21] RUN cd /tmp &&     wget https://github.com/Kitware/CMake/releases/download/v3.21.4/cmake-3.21.4-linux-aarch64.sh &&     chmod +x cmake-3.21.4-linux-aarch64.sh &&     ./cmake-3.2  5.1s 
 => [13/21] RUN pip3 install --upgrade pip                                                                                                                                                     1.9s 
 => [14/21] RUN pip3 install setuptools>=41.0.0                                                                                                                                                0.7s
 => [15/21] COPY requirements.txt /tmp/requirements.txt                                                                                                                                        0.0s
 => ERROR [16/21] RUN pip3 install -r /tmp/requirements.txt                                                                                                                                    5.8s
------
 > [16/21] RUN pip3 install -r /tmp/requirements.txt:
#20 0.520 Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
#20 0.520 Looking in links: https://download.pytorch.org/whl/cu113/torch_stable.html
#20 0.520 Ignoring onnx: markers 'python_version == "3.10"' don't match your environment
#20 0.520 Ignoring tensorflow-gpu: markers 'platform_machine == "x86_64" and sys_platform == "linux"' don't match your environment
#20 0.520 Ignoring onnxruntime: markers 'python_version == "3.10"' don't match your environment
#20 0.520 Ignoring torch: markers 'python_version == "3.10"' don't match your environment
#20 0.521 Ignoring torchvision: markers 'python_version == "3.10"' don't match your environment
#20 1.484 Collecting onnx==1.10.2
#20 2.265   Downloading onnx-1.10.2-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (12.7 MB)
#20 3.056      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.7/12.7 MB 15.9 MB/s eta 0:00:00
#20 3.995 Collecting onnxruntime==1.8.1
#20 4.339   Downloading onnxruntime-1.8.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (7.4 MB)
#20 4.857      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.4/7.4 MB 14.3 MB/s eta 0:00:00
#20 5.698 ERROR: Could not find a version that satisfies the requirement torch==1.10.2+cu113 (from versions: 1.8.0, 1.8.1, 1.9.0, 1.10.0, 1.10.1, 1.10.2, 1.11.0, 1.12.0, 1.12.1, 1.13.0)
#20 5.698 ERROR: No matching distribution found for torch==1.10.2+cu113
------
executor failed running [/bin/bash -c pip3 install -r /tmp/requirements.txt]: exit code: 1
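The failing step seems to be that the +cu113 builds of torch are only published for x86_64, so the pin in requirements.txt can never resolve on aarch64 (note the plain 1.10.2 wheel does appear in the candidate list above). A rough, untested workaround sketch (the sed edit is hypothetical, not something the repository provides) would be to drop the CUDA-specific suffix before installing:

# Hypothetical: relax the torch pin so pip can fall back to a plain aarch64 wheel.
sed -i 's/torch==1.10.2+cu113/torch==1.10.2/' /tmp/requirements.txt
pip3 install -r /tmp/requirements.txt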

RajUpadhyay commented 1 year ago

Hi, if anyone is still facing this issue, there is one more thing you should try.

Just pull the following container image and work with TensorRT like you normally would.

https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-pytorch

This way, at least you won't have to worry about the torch, torchvision, and TensorRT versions.
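A hypothetical pull-and-run example (the tag is only illustrative; pick one from the NGC page that matches your JetPack release):

# Pull a prebuilt L4T PyTorch image and start it with the NVIDIA runtime on the Jetson.
docker pull nvcr.io/nvidia/l4t-pytorch:r32.7.1-pth1.10-py3
docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-pytorch:r32.7.1-pth1.10-py3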

ttyio commented 4 months ago

I will close issues that have been inactive for more than 3 weeks, per our policy. Thanks all!