NVIDIA / DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
Apache License 2.0
5.14k stars 620 forks

wheel for aarch64 with cuda10.2? #3430

Closed twmht closed 3 years ago

twmht commented 3 years ago

Hi,

I am using jetpack 4.5 now and it uses cuda10.2.

But the prebuilt only provides cuda11 for aarch64.

Any idea?

klecki commented 3 years ago

Hi @twmht, we do not provide official DALI builds for Jetson, but you can build one on your own by following these instructions: https://docs.nvidia.com/deeplearning/dali/main-user-guide/docs/compilation.html#cross-compiling-for-aarch64-jetson-linux-docker

klecki commented 3 years ago

We are currently updating the instructions; you can take a look at https://github.com/NVIDIA/DALI/pull/3432

twmht commented 3 years ago

It's good to remove the need for SDK Manager, because I am trying to use Docker on Windows.

twmht commented 3 years ago

@klecki

I got some errors after running code from #3422

[+] Building 485.3s (9/12)
 => [internal] load build definition from Dockerfile.build.aarch64-linux
#9 0.413 Cloning into '/tmp/dali_deps'...
#9 1.194 error: pathspec 'b03d946d2a1fdd7c49bfc42d5186c42c6bcec463?' did not match any file(s) known to git.
------
executor failed running [/bin/sh -c /bin/bash -c 'DALI_DEPS_VERSION_SHA=${DALI_DEPS_VERSION_SHA:-$(cat /tmp/DALI_DEPS_VERSION)} &&
  git clone ${DALI_DEPS_REPO} /tmp/dali_deps &&
  cd /tmp/dali_deps &&
  git checkout ${DALI_DEPS_VERSION_SHA} &&
  git submodule init &&
  git submodule update --depth 1 --recursive &&
  export CC_COMP=aarch64-linux-gnu-gcc &&
  export CXX_COMP=aarch64-linux-gnu-g++ &&
  export INSTALL_PREFIX="/usr/aarch64-linux-gnu/" &&
  export HOST_ARCH_OPTION="--host=aarch64-unknown-linux-gnu" &&
  export CMAKE_TARGET_ARCH=aarch64 &&
  export OPENCV_TOOLCHAIN_FILE="linux/aarch64-gnu.toolchain.cmake" &&
  export WITH_FFMPEG=0 &&
  /tmp/dali_deps/build_scripts/build_deps.sh && rm -rf /tmp/dali_deps && rm -rf /tmp/DALI_DEPS_VERSION']: exit code: 1

Where do you define DALI_DEPS_VERSION? (https://github.com/NVIDIA/DALI/blob/main/docker/Dockerfile.build.aarch64-linux#L57)
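
One possible reading of the trailing `?` in the pathspec error is that the SHA read from /tmp/DALI_DEPS_VERSION carries a carriage return, as happens when the file is checked out with Windows (CRLF) line endings. This is an assumption, not something the log confirms. The sketch below recreates such a file in a temporary directory (the directory and variable names are illustrative) and shows that `$(cat ...)` keeps the extra character:

```shell
# Illustrative only: recreate a version file saved with Windows (CRLF) line endings.
workdir=$(mktemp -d)
printf 'b03d946d2a1fdd7c49bfc42d5186c42c6bcec463\r\n' > "$workdir/DALI_DEPS_VERSION"

# $(cat ...) drops the trailing newline but keeps the carriage return.
sha_raw=$(cat "$workdir/DALI_DEPS_VERSION")
# Removing carriage returns leaves only the 40 hex characters of the SHA.
sha_clean=$(tr -d '\r' < "$workdir/DALI_DEPS_VERSION")

echo "raw length:   ${#sha_raw}"
echo "clean length: ${#sha_clean}"
```

If the raw value is one character longer than the clean one, `git checkout` receives a ref name that no commit matches.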

twmht commented 3 years ago

I found out that it can't find /dali/docker/build_helper.sh even though I have mounted the path:

docker run -v "C:\Users\Preddator Triton 500\Downloads\DALI":/dali nvidia/dali:builder_aarch64-linux
/bin/sh: 1: /dali/docker/build_helper.sh: not found

I checked the path in interactive mode, and the mount is OK:

docker run -it -v "C:\Users\Preddator Triton 500\Downloads\DALI":/dali nvidia/dali:builder_aarch64-linux bash
ls /dali/docker

Any idea?
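
A "not found" error for a script that is visibly present can occur when the script has CRLF line endings, because the interpreter line then ends in a carriage return. Whether that is the cause here is an assumption. A minimal check, using a hypothetical stand-in script rather than the real build_helper.sh:

```shell
# Hypothetical stand-in for a script checked out with CRLF line endings.
workdir=$(mktemp -d)
printf '#!/bin/sh\r\necho built\r\n' > "$workdir/helper.sh"

# Count the lines that contain a carriage return (non-zero means CRLF is present).
cr=$(printf '\r')
crlf_lines=$(grep -c "$cr" "$workdir/helper.sh")
echo "lines with CR: $crlf_lines"

# Write an LF-only copy and run that one instead.
tr -d '\r' < "$workdir/helper.sh" > "$workdir/helper_lf.sh"
fixed_output=$(sh "$workdir/helper_lf.sh")
echo "output after conversion: $fixed_output"
```

The same `grep -c` check can be pointed at /dali/docker/build_helper.sh inside the container to see whether the mounted copy is affected.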

twmht commented 3 years ago

Because of the mount problem, I tried compiling DALI in interactive mode by running the commands listed under CMD in the Dockerfile.

But I got:

/dali/third_party/libcudacxx/include/cuda/std/detail/__config:70:10: fatal error: libcxx/include/__config: No such file or directory

Any idea? log.txt

twmht commented 3 years ago

OK.

I finally built the wheel successfully with Docker on Windows.

For Windows users: the files cloned by git contain many ^M and ^r characters.

So the better way is to clone the DALI source inside the container in interactive mode; otherwise there will be many build problems.

After you build the image, the steps are as follows:

docker run -it nvidia/dali:builder_aarch64-linux bash
cd /
git clone https://github.com/NVIDIA/DALI.git --recursive
cd DALI
mkdir build
cd build

Then run the commands under CMD:

WERROR=ON           \
ARCH=aarch64-linux  \
BUILD_TEST=ON       \
BUILD_BENCHMARK=OFF \
BUILD_NVTX=OFF      \
BUILD_LMDB=ON       \
BUILD_JPEG_TURBO=ON \
BUILD_LIBTIFF=ON    \
BUILD_LIBSND=ON     \
BUILD_LIBTAR=ON     \
BUILD_FFTS=ON       \
BUILD_NVJPEG=OFF    \
BUILD_NVJPEG2K=OFF  \
BUILD_NVOF=OFF      \
BUILD_NVDEC=OFF     \
BUILD_NVML=OFF      \
VERBOSE_LOGS=OFF    \
BUILD_CUFILE=OFF    \
TEST_BUNDLED_LIBS=NO \
WHL_PLATFORM_NAME=manylinux2014_aarch64            \
BUNDLE_PATH_PREFIX="/usr/aarch64-linux-gnu"        \
EXTRA_CMAKE_OPTIONS="-DCMAKE_TOOLCHAIN_FILE:STRING=$PWD/../platforms/aarch64-linux/aarch64-linux.toolchain.cmake \
                        -DCMAKE_COLOR_MAKEFILE=ON                                 \
                        -DCMAKE_CUDA_COMPILER=/usr/local/cuda-10.2/bin/nvcc       \
                        -DCUDA_HOST=/usr/local/cuda-10.2                          \
                        -DCUDA_TARGET=/usr/local/cuda-10.2/targets/aarch64-linux" \
/DALI/docker/build_helper.sh                    && \
cd /DALI/dali_tf_plugin                         && \
bash /DALI/dali_tf_plugin/make_dali_tf_sdist.sh && \
mv /dali_tf_sdist/*.tar.gz /wheelhouse/         && \
 cp -r /wheelhouse /dali/

Alternatively, avoid these characters by following https://stackoverflow.com/questions/1889559/git-diff-to-ignore-m (I am not sure whether this works).
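
Another way to keep ^M characters out of a Windows checkout is to turn off git's line-ending conversion before the files are written. This is a sketch that has not been tried against the DALI build; it uses a throwaway repository only to show the settings:

```shell
# Sketch: a throwaway repository, used only to show the settings.
repo=$(mktemp -d)
git init -q "$repo"

# Do not convert LF to CRLF when files are checked out.
git -C "$repo" config core.autocrlf false
# Ask for LF line endings in the working tree for files git treats as text.
git -C "$repo" config core.eol lf

autocrlf=$(git -C "$repo" config --get core.autocrlf)
eol=$(git -C "$repo" config --get core.eol)
echo "core.autocrlf=$autocrlf core.eol=$eol"
```

For a fresh clone, the same key can be passed at clone time, for example `git clone -c core.autocrlf=false https://github.com/NVIDIA/DALI.git --recursive`. Whether the submodules pick up that setting should be checked separately.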

JanuszL commented 3 years ago

Hi @twmht,

I am happy it finally worked for you. We have never tried to build DALI on Windows so we cannot guarantee it works.

twmht commented 3 years ago

@JanuszL

Actually, it can be built with Docker on Windows. But many ops, such as decode_images or crop_mirror_normalize, do not support GPU mode on Jetson; I can only run them in CPU mode. Is this normal?

JanuszL commented 3 years ago

Hi @twmht,

On the Jetson platform there is no nvJPEG library, so decode_images with the mixed backend is not available. The Video API is also different, so there is no video reader or optical flow operator. However, other operators should work, including crop_mirror_normalize. Can you provide the error you see when you try to use crop_mirror_normalize?