davisking / dlib

A toolkit for making real world machine learning and data analysis applications in C++
http://dlib.net
Boost Software License 1.0

Trying to compile dlib 19.20 with cuda 11 and cudnn 8, or cuda 10.1 and cudnn 7.6.4 #2100

Closed spiderdab closed 4 years ago

spiderdab commented 4 years ago

Environment: Pop!_OS (Ubuntu 20.04), GCC 9 and 8 (tried both), CMake 3.16.3, dlib 19.20.99, Python 3.8.2, NVIDIA Quadro P600

Expected Behavior

dlib compiles with CUDA enabled.

Current Behavior

...
-- Found cuDNN: /usr/lib/x86_64-linux-gnu/libcudnn.so
-- Building a CUDA test project to see if your compiler is compatible with CUDA...
-- Checking if you have the right version of cuDNN installed.
-- Found cuDNN, but it looks like the wrong version so dlib will not use it.
-- *** Dlib requires cuDNN V5.0 OR GREATER. Since cuDNN is not found DLIB WILL NOT USE CUDA.
...

I first downloaded the latest (and suggested) version of CUDA from the NVIDIA site, which was CUDA 11, then downloaded cuDNN for CUDA 11, which was 8.0.0 (runtime and dev .deb packages), and installed them following NVIDIA's instructions.

Compiling "cudnn_samples_v8" works, so I think the installations went OK, but there is no way to get dlib compiled with CUDA.

I tried uninstalling CUDA 11 and cuDNN 8 and installing CUDA 10.1 and cuDNN 7.6 (the version suggested for CUDA 10.1), but the result is the same. Every time I erased the build directory and compiled in two ways: as per the dlib instructions, using `sudo python3 setup.py install`, or with `cmake .. -DDLIB_USE_CUDA=1 -DUSE_AVX_INSTRUCTIONS=1`.

Any suggestion? Thanks.

davisking commented 4 years ago

That test for cuDNN is done by building the dlib/cmake_utils/test_for_cudnn CMake project and checking whether the build succeeds. So if you build dlib/cmake_utils/test_for_cudnn directly you can see why it failed. I will note that CUDA 11 is still a release candidate, and Ubuntu 20.04 is not a distribution officially supported by NVIDIA, so that might be why it's not building. Most of the time I try to get CUDA working on an unsupported distribution it turns into an exercise in frustration.
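Concretely, the probe can be rebuilt by hand; a sketch, where the checkout location (`$HOME/dlib`) is an assumption you should adjust to your own clone:

```shell
# Build dlib's cuDNN probe project by hand to see the real compile error.
# The checkout path is an assumption; the probe is skipped quietly if absent.
probe="$HOME/dlib/dlib/cmake_utils/test_for_cudnn"
if [ -d "$probe" ]; then
    mkdir -p "$probe/build"
    # Any error printed here is the reason dlib decided to skip CUDA.
    (cd "$probe/build" && cmake .. && make) || true
fi
```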


spiderdab commented 4 years ago

Thanks for your time. Trying to compile "test_for_cudnn", I get this result:

dab@pop-os:~/dlib/dlib/cmake_utils/test_for_cudnn/build$ cmake ..
-- The C compiler identification is GNU 8.4.0
-- The CXX compiler identification is GNU 8.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found suitable version "10.1", minimum required is "7.5")
-- Looking for cuDNN install...
-- Found cuDNN: /usr/lib/x86_64-linux-gnu/libcudnn.so
-- C++11 activated.
-- Configuring done
-- Generating done
-- Build files have been written to: /home/dab/dlib/dlib/cmake_utils/test_for_cudnn/build
dab@pop-os:~/dlib/dlib/cmake_utils/test_for_cudnn/build$ ls
CMakeCache.txt  CMakeFiles  cmake_install.cmake  Makefile
dab@pop-os:~/dlib/dlib/cmake_utils/test_for_cudnn/build$ make
Scanning dependencies of target cudnn_test
[ 50%] Building CXX object CMakeFiles/cudnn_test.dir/home/dab/dlib/dlib/cuda/cudnn_dlibapi.cpp.o
In file included from /home/dab/dlib/dlib/cuda/cudnn_dlibapi.cpp:10:
/usr/local/include/cudnn.h:59:10: fatal error: cudnn_version.h: No such file or directory
   59 | #include "cudnn_version.h"
      |          ^~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [CMakeFiles/cudnn_test.dir/build.make:63: CMakeFiles/cudnn_test.dir/home/dab/dlib/dlib/cuda/cudnn_dlibapi.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:76: CMakeFiles/cudnn_test.dir/all] Error 2
make: *** [Makefile:84: all] Error 2

Should I find "cudnn_version.h" and copy it into the ".../dlib/cuda/" folder?

davisking commented 4 years ago

No, don't copy any cuDNN files into the dlib folder. Installing cuDNN should put that header file in a standard place where the compiler can find it. Something is wrong with your install if GCC can't find that header.

spiderdab commented 4 years ago

I've found that header here:

dab@pop-os:~$ sudo locate cudnn_version.h
[sudo] password for dab:
/usr/include/cudnn_version.h

It looks like a common folder. Can you point me to how to add that path to CMake, or tell me what else I can do?
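Not a definitive fix, but one way to find out which prefix to hand to CMake is to look for cudnn_version.h in the usual install locations and then pass that prefix via CMAKE_PREFIX_PATH (the variable dlib's own error text suggests later in this thread); a sketch, where the candidate path list is a guess:

```shell
# Look for cuDNN's version header in common install prefixes
# (candidate list is a guess; extend it as needed).
for dir in /usr/include /usr/local/cuda/include \
           /usr/local/cuda/targets/x86_64-linux/include; do
    if [ -e "$dir/cudnn_version.h" ]; then
        echo "cuDNN headers found in: $dir"
    fi
done
# Then reconfigure from a clean build directory, e.g. if the header
# was under /usr/include:
#   cmake .. -DDLIB_USE_CUDA=1 -DCMAKE_PREFIX_PATH=/usr
true  # finding nothing is not a script error
```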

thanks.

spiderdab commented 4 years ago

I could finally install dlib with CUDA and cuDNN on Pop!_OS 20.04 (Ubuntu) using CUDA 10.1 and cuDNN 7.6.4. From the NVIDIA download page you can download 3 .deb packages, but they are not enough, since, as written above, the correct headers are not found. So I also downloaded the archive simply named "cuDNN Library for Linux" and copied two folders (include and lib64) from this archive into the CUDA folder using these two commands:

dab@pop-os:~/cudnn-10.1-linux-x64-v7.6.4.38/cuda$ sudo cp -r lib64/ /usr/local/cuda-10.1/targets/x86_64-linux/lib/
dab@pop-os:~/cudnn-10.1-linux-x64-v7.6.4.38/cuda$ sudo cp -r include/* /usr/local/cuda-10.1/targets/x86_64-linux/include/

After that, the dlib compilation went as expected.

Thanks, Davide.

davisking commented 4 years ago

Huh, you definitely shouldn't need to do this stuff. But like I was saying, NVIDIA doesn't support CUDA on that version of Ubuntu yet, and maybe CMake hasn't been updated to make it work either. Inside dlib/CMakeLists.txt you can find the statements that find CUDA. You could add to the paths there and maybe that would help.

spiderdab commented 4 years ago

I had tried that before, but it didn't work (surely my fault). Thank you very much for your time and help.

alexmarkley commented 4 years ago

I'm also having a similar (but not identical) issue, apparently related to some API changes in cuDNN.

alex@ubsidian:~/Build/dlib/dlib_git/dlib/cmake_utils/test_for_cudnn/build$ cmake ..
-- The C compiler identification is GNU 9.3.0
-- The CXX compiler identification is GNU 9.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found CUDA: /usr/local/cuda (found suitable version "11.0", minimum required is "7.5") 
-- Looking for cuDNN install...
-- Found cuDNN: /usr/lib/x86_64-linux-gnu/libcudnn.so
-- C++11 activated.
-- Configuring done
-- Generating done
-- Build files have been written to: /home/alex/Build/dlib/dlib_git/dlib/cmake_utils/test_for_cudnn/build
alex@ubsidian:~/Build/dlib/dlib_git/dlib/cmake_utils/test_for_cudnn/build$ make
Scanning dependencies of target cudnn_test
[ 50%] Building CXX object CMakeFiles/cudnn_test.dir/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp.o
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp: In member function ‘void dlib::cuda::tensor_conv::setup(const dlib::tensor&, const dlib::tensor&, int, int, int, int)’:
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:850:57: error: ‘CUDNN_CONVOLUTION_FWD_PREFER_FASTEST’ was not declared in this scope; did you mean ‘CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3’?
  850 |                         dnn_prefer_fastest_algorithms()?CUDNN_CONVOLUTION_FWD_PREFER_FASTEST:CUDNN_CONVOLUTION_FWD_NO_WORKSPACE,
      |                                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:43:33: note: in definition of macro ‘CHECK_CUDNN’
   43 |     const cudnnStatus_t error = call;                                         \
      |                                 ^~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:850:94: error: ‘CUDNN_CONVOLUTION_FWD_NO_WORKSPACE’ was not declared in this scope; did you mean ‘CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD’?
  850 |                         dnn_prefer_fastest_algorithms()?CUDNN_CONVOLUTION_FWD_PREFER_FASTEST:CUDNN_CONVOLUTION_FWD_NO_WORKSPACE,
      |                                                                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:43:33: note: in definition of macro ‘CHECK_CUDNN’
   43 |     const cudnnStatus_t error = call;                                         \
      |                                 ^~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:844:29: error: ‘cudnnGetConvolutionForwardAlgorithm’ was not declared in this scope; did you mean ‘cudnnGetConvolutionForwardAlgorithm_v7’?
  844 |                 CHECK_CUDNN(cudnnGetConvolutionForwardAlgorithm(
      |                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:43:33: note: in definition of macro ‘CHECK_CUDNN’
   43 |     const cudnnStatus_t error = call;                                         \
      |                                 ^~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:872:57: error: ‘CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST’ was not declared in this scope; did you mean ‘CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT’?
  872 |                         dnn_prefer_fastest_algorithms()?CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST:CUDNN_CONVOLUTION_BWD_DATA_NO_WORKSPACE,
      |                                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:43:33: note: in definition of macro ‘CHECK_CUDNN’
   43 |     const cudnnStatus_t error = call;                                         \
      |                                 ^~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:872:99: error: ‘CUDNN_CONVOLUTION_BWD_DATA_NO_WORKSPACE’ was not declared in this scope; did you mean ‘CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD’?
  872 |                         dnn_prefer_fastest_algorithms()?CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST:CUDNN_CONVOLUTION_BWD_DATA_NO_WORKSPACE,
      |                                                                                                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:43:33: note: in definition of macro ‘CHECK_CUDNN’
   43 |     const cudnnStatus_t error = call;                                         \
      |                                 ^~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:866:29: error: ‘cudnnGetConvolutionBackwardDataAlgorithm’ was not declared in this scope; did you mean ‘cudnnGetConvolutionBackwardDataAlgorithm_v7’?
  866 |                 CHECK_CUDNN(cudnnGetConvolutionBackwardDataAlgorithm(
      |                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:43:33: note: in definition of macro ‘CHECK_CUDNN’
   43 |     const cudnnStatus_t error = call;                                         \
      |                                 ^~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:895:57: error: ‘CUDNN_CONVOLUTION_BWD_FILTER_PREFER_FASTEST’ was not declared in this scope; did you mean ‘CUDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT’?
  895 |                         dnn_prefer_fastest_algorithms()?CUDNN_CONVOLUTION_BWD_FILTER_PREFER_FASTEST:CUDNN_CONVOLUTION_BWD_FILTER_NO_WORKSPACE,
      |                                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:43:33: note: in definition of macro ‘CHECK_CUDNN’
   43 |     const cudnnStatus_t error = call;                                         \
      |                                 ^~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:895:101: error: ‘CUDNN_CONVOLUTION_BWD_FILTER_NO_WORKSPACE’ was not declared in this scope; did you mean ‘CUDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD’?
  895 |                         dnn_prefer_fastest_algorithms()?CUDNN_CONVOLUTION_BWD_FILTER_PREFER_FASTEST:CUDNN_CONVOLUTION_BWD_FILTER_NO_WORKSPACE,
      |                                                                                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:43:33: note: in definition of macro ‘CHECK_CUDNN’
   43 |     const cudnnStatus_t error = call;                                         \
      |                                 ^~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:889:29: error: ‘cudnnGetConvolutionBackwardFilterAlgorithm’ was not declared in this scope; did you mean ‘cudnnGetConvolutionBackwardFilterAlgorithm_v7’?
  889 |                 CHECK_CUDNN(cudnnGetConvolutionBackwardFilterAlgorithm(
      |                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp:43:33: note: in definition of macro ‘CHECK_CUDNN’
   43 |     const cudnnStatus_t error = call;                                         \
      |                                 ^~~~
make[2]: *** [CMakeFiles/cudnn_test.dir/build.make:63: CMakeFiles/cudnn_test.dir/home/alex/Build/dlib/dlib_git/dlib/cuda/cudnn_dlibapi.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:76: CMakeFiles/cudnn_test.dir/all] Error 2
make: *** [Makefile:84: all] Error 2
alex@ubsidian:~/Build/dlib/dlib_git/dlib/cmake_utils/test_for_cudnn/build$ 

Here's some information about my system:

alex@ubsidian:~$ uname -a
Linux ubsidian.malexmedia.net 5.4.0-37-generic #41-Ubuntu SMP Wed Jun 3 18:57:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
alex@ubsidian:~$ 
alex@ubsidian:~$ cat /etc/os-release 
NAME="Ubuntu"
VERSION="20.04 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
alex@ubsidian:~$ 
alex@ubsidian:~$ dpkg -l | grep -i cuda
ii  cuda                                       11.0.1-1                              amd64        CUDA meta-package
ii  cuda-11-0                                  11.0.1-1                              amd64        CUDA 11.0 meta-package
ii  cuda-command-line-tools-11-0               11.0.1-1                              amd64        CUDA command-line tools
ii  cuda-compiler-11-0                         11.0.1-1                              amd64        CUDA compiler
ii  cuda-cudart-11-0                           11.0.171-1                            amd64        CUDA Runtime native Libraries
ii  cuda-cudart-dev-11-0                       11.0.171-1                            amd64        CUDA Runtime native dev links, headers
ii  cuda-cuobjdump-11-0                        11.0.167-1                            amd64        CUDA cuobjdump
ii  cuda-cupti-11-0                            11.0.167-1                            amd64        CUDA profiling tools runtime libs.
ii  cuda-cupti-dev-11-0                        11.0.167-1                            amd64        CUDA profiling tools interface.
ii  cuda-demo-suite-11-0                       11.0.167-1                            amd64        Demo suite for CUDA
ii  cuda-documentation-11-0                    11.0.182-1                            amd64        CUDA documentation
ii  cuda-driver-dev-11-0                       11.0.171-1                            amd64        CUDA Driver native dev stub library
ii  cuda-drivers                               450.36.06-1                           amd64        CUDA Driver meta-package, branch-agnostic
ii  cuda-drivers-450                           450.36.06-1                           amd64        CUDA Driver meta-package, branch-specific
ii  cuda-gdb-11-0                              11.0.172-1                            amd64        CUDA-GDB
ii  cuda-libraries-11-0                        11.0.1-1                              amd64        CUDA Libraries 11.0 meta-package
ii  cuda-libraries-dev-11-0                    11.0.1-1                              amd64        CUDA Libraries 11.0 development meta-package
ii  cuda-memcheck-11-0                         11.0.167-1                            amd64        CUDA-MEMCHECK
ii  cuda-nsight-11-0                           11.0.167-1                            amd64        CUDA nsight
ii  cuda-nsight-compute-11-0                   11.0.1-1                              amd64        NVIDIA Nsight Compute
ii  cuda-nsight-systems-11-0                   11.0.1-1                              amd64        NVIDIA Nsight Systems
ii  cuda-nvcc-11-0                             11.0.167-1                            amd64        CUDA nvcc
ii  cuda-nvdisasm-11-0                         11.0.167-1                            amd64        CUDA disassembler
ii  cuda-nvml-dev-11-0                         11.0.167-1                            amd64        NVML native dev links, headers
ii  cuda-nvprof-11-0                           11.0.167-1                            amd64        CUDA Profiler tools
ii  cuda-nvprune-11-0                          11.0.167-1                            amd64        CUDA nvprune
ii  cuda-nvrtc-11-0                            11.0.167-1                            amd64        NVRTC native runtime libraries
ii  cuda-nvrtc-dev-11-0                        11.0.167-1                            amd64        NVRTC native dev links, headers
ii  cuda-nvtx-11-0                             11.0.167-1                            amd64        NVIDIA Tools Extension
ii  cuda-nvvp-11-0                             11.0.167-1                            amd64        CUDA Profiler tools
ii  cuda-runtime-11-0                          11.0.1-1                              amd64        CUDA Runtime 11.0 meta-package
ii  cuda-samples-11-0                          11.0.167-1                            amd64        CUDA example applications
ii  cuda-sanitizer-11-0                        11.0.167-1                            amd64        CUDA Sanitizer
ii  cuda-toolkit-11-0                          11.0.1-1                              amd64        CUDA Toolkit 11.0 meta-package
ii  cuda-tools-11-0                            11.0.1-1                              amd64        CUDA Tools meta-package
ii  cuda-visual-tools-11-0                     11.0.1-1                              amd64        CUDA visual tools
ii  libcudnn8                                  8.0.0.180-1+cuda11.0                  amd64        cuDNN runtime libraries
ii  libcudnn8-dev                              8.0.0.180-1+cuda11.0                  amd64        cuDNN development libraries and headers
ii  libcusolver-11-0                           10.4.0.191-1                          amd64        CUDA solver native runtime libraries
ii  libcusolver-dev-11-0                       10.4.0.191-1                          amd64        CUDA solver native dev links, headers
alex@ubsidian:~$ 

Note that although CUDA 11 is a release candidate and Ubuntu 20.04 is not technically supported, I have not had any other issues building CUDA projects with nvcc, and CUDA is definitely working.

alexmarkley commented 4 years ago

Also of note, there appears to be no 7.6.x build of cuDNN for CUDA 11.

rafale77 commented 4 years ago

Exact same problem here as @alexmarkley, on Ubuntu 18.04 LTS...

davisking commented 4 years ago

Pull the latest dlib from github and give it a try. I just updated it to support cuDNN 8.0.

rafale77 commented 4 years ago

Just tested... almost there: It found it but somehow still says it won't use CUDA.

-- Using CMake version: 3.10.2
-- Compiling dlib version: 19.20.99
-- Enabling AVX instructions
-- Found system copy of libpng: /usr/lib/x86_64-linux-gnu/libpng.so;/usr/lib/x86_64-linux-gnu/libz.so
-- Found system copy of libjpeg: /usr/lib/x86_64-linux-gnu/libjpeg.so
-- Searching for BLAS and LAPACK
-- Searching for BLAS and LAPACK
-- Checking for module 'cblas'
--   No package 'cblas' found
-- Found Intel MKL BLAS/LAPACK library
-- Looking for cuDNN install...
-- Found cuDNN: /usr/lib/x86_64-linux-gnu/libcudnn.so
-- Disabling CUDA support for dlib.  DLIB WILL NOT USE CUDA
-- C++11 activated.
-- Configuring done
-- Generating done
-- Build files have been written to: /home/anhman/source/dlib/build

davisking commented 4 years ago

@rafale77 Are you building from a clean build folder?

davisking commented 4 years ago

Once CMake has detected that cuDNN doesn't build, it won't check again. You have to clean out the build folder.
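In other words, the cached "won't use CUDA" answer lives in the build tree, so configuration has to restart from an empty one; a minimal sketch (the directory name `build` is the conventional one, adjust as needed):

```shell
# CMake caches the failed cuDNN probe result inside the build tree,
# so wipe it before reconfiguring; otherwise the old answer sticks.
rm -rf build
mkdir build
# cd build && cmake .. -DDLIB_USE_CUDA=1   # then configure from scratch
```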

rafale77 commented 4 years ago

Thanks! My bad...

It took me some time, but I successfully finished compiling, installed the Python API, and am running it on Home Assistant now. The CNN model is indeed much faster. Watching a single stream from my doorbell consumes 2.5 GB of GPU memory, though; I did not know it would consume this much. I wonder if I should have picked a 1080 Ti instead of a 2070...

sergiogut1805 commented 4 years ago

I have the same issue on Windows 10.

Using: dlib 19.20, CUDA 11 final (not RC), cuDNN 8, Visual Studio 2019 (latest release, 16.6.3)

dlib compiles OK with CUDA 10.2 and cuDNN 7.6.5, but not with CUDA 11.

davisking commented 4 years ago

Try using the latest dlib from github.


sergiogut1805 commented 4 years ago


Nothing Davis :(

Selecting Windows SDK version 10.0.17763.0 to target Windows 10.0.18363.
The C compiler identification is MSVC 19.26.28806.0
The CXX compiler identification is MSVC 19.26.28806.0
Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.26.28801/bin/Hostx64/x64/cl.exe
Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.26.28801/bin/Hostx64/x64/cl.exe - works
Detecting C compiler ABI info
Detecting C compiler ABI info - done
Detecting C compile features
Detecting C compile features - done
Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.26.28801/bin/Hostx64/x64/cl.exe
Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.26.28801/bin/Hostx64/x64/cl.exe - works
Detecting CXX compiler ABI info
Detecting CXX compiler ABI info - done
Detecting CXX compile features
Detecting CXX compile features - done
Using CMake version: 3.17.3
Compiling dlib version: 19.20.99
Looking for sys/types.h
Looking for sys/types.h - found
Looking for stdint.h
Looking for stdint.h - found
Looking for stddef.h
Looking for stddef.h - found
Check size of void*
Check size of void* - done
Enabling SSE2 instructions
Searching for BLAS and LAPACK
Searching for BLAS and LAPACK
Looking for pthread.h
Looking for pthread.h - not found
Found Threads: TRUE
Found CUDA: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.0 (found suitable version "11.0", minimum required is "7.5")
Looking for cuDNN install...
Found cuDNN: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.0/lib/x64/cudnn.lib
Building a CUDA test project to see if your compiler is compatible with CUDA...

CUDA was found but your compiler failed to compile a simple CUDA program so dlib isn't going to use CUDA. The output of the failed CUDA test compile is shown below:

*** Change Dir: E:/MLCNN/Dlib Compiled/dlib_build/cuda_test_build

Run Build Command(s): C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/MSBuild/Current/Bin/MSBuild.exe ALL_BUILD.vcxproj /p:Configuration=Debug /p:Platform=x64 /p:VisualStudioVersion=16.0 /v:m
Microsoft (R) Build Engine version 16.6.0+5ff7b0c9e for .NET Framework
Copyright (C) Microsoft Corporation. All rights reserved.

Checking Build System
Building NVCC (Device) object CMakeFiles/cuda_test.dir/Debug/cuda_test_generated_cuda_test.cu.obj
nvcc fatal : Value 'sm_30' is not defined for option 'gpu-architecture'
CMake Error at cuda_test_generated_cuda_test.cu.obj.Debug.cmake:216 (message):
  Error generating E:/MLCNN/Dlib Compiled/dlib_build/cuda_test_build/CMakeFiles/cuda_test.dir//Debug/cuda_test_generated_cuda_test.cu.obj

*** C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\Microsoft.CppCommon.targets(231,5): error MSB6006: "cmd.exe" exited with code 1. [E:\MLCNN\Dlib Compiled\dlib_build\cuda_test_build\cuda_test.vcxproj]

Disabling CUDA support for dlib. DLIB WILL NOT USE CUDA
C++11 activated.
Configuring done
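For context, the `nvcc fatal : Value 'sm_30' is not defined for option 'gpu-architecture'` failure happens because CUDA 11 dropped support for the Kepler sm_30 architecture, so any build script that still asks nvcc for it fails outright. A hedged sketch of the kind of adjustment involved; the variable names follow CMake's classic FindCUDA module, not dlib's actual script:

```cmake
# Illustrative only: CUDA 11 removed sm_30, so the oldest architecture
# requested from nvcc must be newer on CUDA >= 11 toolkits.
if(CUDA_VERSION VERSION_GREATER_EQUAL 11.0)
  list(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_50,code=sm_50)
else()
  list(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_30,code=sm_30)
endif()
```

dlib's own fix (mentioned below) adjusts this automatically; the fragment is only meant to explain what the error is about.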

davisking commented 4 years ago

I pushed a change that should fix this. Try pulling from github again and giving it a try.

sergiogut1805 commented 4 years ago


Davis,

Good news and bad news: dlib compiles OK with CUDA 11 and cuDNN 8 with the new version, but inference is extremely slow!

Training speed is normal, but inference time is a big problem (very, very slow). :(

For now I will go back to CUDA 10.2, because the inference time in my projects tanked...

davisking commented 4 years ago

Are you sure you compiled with CUDA enabled? CUDA 11 shouldn't be anything but faster.

sergiogut1805 commented 4 years ago


Of course, Davis; the GPU is in use the whole time (monitored with GPU-Z), and CMake compiles dlib 19.20.99 OK with CUDA 11, but inference is broken for some reason.

I tried again with the new NVIDIA driver (451.67), but the problem is still present: training is OK, but inference is very, very slow (using the GPU at 68%) and takes seconds... with CUDA 10.2 it takes milliseconds. :/

Selecting Windows SDK version 10.0.17763.0 to target Windows 10.0.18363.
Using CMake version: 3.17.3
Compiling dlib version: 19.20.99
Enabling AVX instructions
Looking for cuDNN install...
Found cuDNN: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.0/lib/x64/cudnn.lib
Enabling CUDA support for dlib. DLIB WILL USE CUDA
C++11 activated.
Configuring done
Generating done

davisking commented 4 years ago

Weird. I can't say what's happening.

Harry79 commented 4 years ago

With cuDNN 10.1-windows10-x64-v7.5.0.56, the header cudnn.h contains a definition for CUDNN_CONVOLUTION_FWD_PREFER_FASTEST, which you are using in dlib-19.20\dlib\cuda\cudnn_dlibapi.cpp.

With cuDNN 11.0-windows-x64-v8.0.1.13 there is no definition for CUDNN_CONVOLUTION_FWD_PREFER_FASTEST in the header files, so it cannot work when actually building against cuDNN 8.

davisking commented 4 years ago

Use the latest dlib from github.


Harry79 commented 4 years ago

Thanks! With the github version it really does work! 👍

sergiogut1805 commented 4 years ago


Hi Davis,

I updated my CUDA 10.2 setup from cuDNN 7.6.5 to cuDNN 8 and the problem appeared (with cuDNN 7.6.5 it works OK).

I found that the inference problem is caused by cuDNN and not by CUDA 11. After updating to cuDNN 8, dlib only compiles with the 19.20.99 version that you uploaded to GitHub. dlib 19.20 fails (in the past it worked OK with CUDA 10.2 and cuDNN 7.6.5) and shows this message:

Selecting Windows SDK version 10.0.17763.0 to target Windows 10.0.18363.
The C compiler identification is MSVC 19.26.28806.0
The CXX compiler identification is MSVC 19.26.28806.0
Detecting C compiler ABI info
Detecting C compiler ABI info - done
Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.26.28801/bin/Hostx64/x64/cl.exe - skipped
Detecting C compile features
Detecting C compile features - done
Detecting CXX compiler ABI info
Detecting CXX compiler ABI info - done
Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.26.28801/bin/Hostx64/x64/cl.exe - skipped
Detecting CXX compile features
Detecting CXX compile features - done
Using CMake version: 3.18.0
Compiling dlib version: 19.20.0
Looking for sys/types.h
Looking for sys/types.h - found
Looking for stdint.h
Looking for stdint.h - found
Looking for stddef.h
Looking for stddef.h - found
Check size of void*
Check size of void* - done
Enabling SSE2 instructions
Searching for BLAS and LAPACK
Searching for BLAS and LAPACK
Looking for pthread.h
Looking for pthread.h - not found
Found Threads: TRUE
Found CUDA: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.2 (found suitable version "10.2", minimum required is "7.5")
Looking for cuDNN install...
Found cuDNN: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.2/lib/x64/cudnn.lib
Building a CUDA test project to see if your compiler is compatible with CUDA...
Checking if you have the right version of cuDNN installed.
*** Found cuDNN, but it looks like the wrong version so dlib will not use it.
*** Dlib requires cuDNN V5.0 OR GREATER. Since cuDNN is not found DLIB WILL NOT USE CUDA.
*** If you have cuDNN then set CMAKE_PREFIX_PATH to include cuDNN's folder.
Disabling CUDA support for dlib. DLIB WILL NOT USE CUDA
C++11 activated.
OpenCV not found, so we won't build the webcam_face_pose_ex example.
Configuring done

I feel that the inference speed is like the "Debug" profile in VS 2019 instead of "Release"; in fact, switching profiles no longer changes the speed (in the past the "Release" profile boosted the speed dramatically).

Can you check this?

thanks.

haris-ahmed commented 4 years ago

Hi @davisking, I'm having the same issue with cuda_11.0.2_451.48_win10, cudnn-11.0-windows-x64-v8.0.2.39, and the latest dlib master branch (19.20.99).

Using the face_recognition library, which depends on dlib, the results are the following:

get-face-locations = 53.546 seconds
get-face-locations = 10.918 seconds
get-face-locations = 51.725 seconds
get-face-locations = 1.935 seconds
get-face-locations = 1.451 seconds

Everything was working perfectly with:

cuda_10.2.89_441
cudnn-10.2-windows10-x64-v7.6.5.32
dlib-19.19.99

@sergiogut1805 can you please confirm whether you downgraded just cuDNN, or both CUDA and cuDNN?

Thank you.

sergiogut1805 commented 4 years ago

> Hi @davisking having the same issue with cuda_11.0.2_451.48_win10 cudnn-11.0-windows-x64-v8.0.2.39 latest dlib master branch 19.20.99
>
> using face_recognition library which depends on dlib results are following:
>
> get-face-locations = 53.546 seconds
> get-face-locations = 10.918 seconds
> get-face-locations = 51.725 seconds
> get-face-locations = 1.935 seconds
> get-face-locations = 1.451 seconds
>
> everything was working perfect with
>
> cuda_10.2.89_441 cudnn-10.2-windows10-x64-v7.6.5.32 dlib-19.19.99
>
> @sergiogut1805 can you please confirm have you downgraded just cudnn or cuda and cudnn both
>
> Thank you.

Hi,

Cuda 11 with Cudnn 8: BROKEN.
Cuda 10.2 with Cudnn 8: BROKEN.
Cuda 10.2 with Cudnn 7.6.5: WORKS PERFECT.

rafale77 commented 4 years ago

> Hi @davisking having the same issue with cuda_11.0.2_451.48_win10 cudnn-11.0-windows-x64-v8.0.2.39 latest dlib master branch 19.20.99
>
> using face_recognition library which depends on dlib results are following:
>
> get-face-locations = 53.546 seconds
> get-face-locations = 10.918 seconds
> get-face-locations = 51.725 seconds
> get-face-locations = 1.935 seconds
> get-face-locations = 1.451 seconds
>
> everything was working perfect with cuda_10.2.89_441 cudnn-10.2-windows10-x64-v7.6.5.32 dlib-19.19.99
>
> @sergiogut1805 can you please confirm have you downgraded just cudnn or cuda and cudnn both Thank you.

> Hi,
>
> Cuda 11 with Cudnn 8 BROKEN. Cuda 10.2 with Cudnn 8 BROKEN. Cuda 10.2 with Cudnn 7.6.5 WORKS PERFECT.

Don't know about this... I have been compiling with cuda 11.0.194 and cudnn8.0.2 with no problem...

davisking commented 4 years ago

I suspect what was happening was cuDNN was being asked "what is the best algorithm to run?". That's a slow operation in cuDNN 8.0, and in particular it was being called whenever input tensor sizes changed. I'm not sure what exactly you are running, but maybe it was this problem. I just pushed a change to github that should fix this issue. So let me know if it's all good now or not.

facug91 commented 4 years ago
I was having the same problem with performance. Some tests with MMOD were running extremely slow with CUDA 11 and cuDNN 8. It improved a lot with those modifications you added Davis, but it is still slower than before.

|                           | dlib w/ CUDA 10.0 & cuDNN 7.6 & gcc 7.5 | dlib 23b9abd w/ CUDA 11.0 & cuDNN 8 & gcc 9.3 | dlib c90362d w/ CUDA 11.0 & cuDNN 8 & gcc 9.3 |
|---------------------------|------------|-------------|------------|
| Images w/ different sizes | 10.9562 ms | 76.5453 ms  | 14.1931 ms |
| Video HD                  | 10.4630 ms | 446.5271 ms | 13.4084 ms |

I know those numbers are quite weird to read, because you can't know what I'm running exactly, but they give some idea of how much worse the new versions of CUDA and cuDNN are performing at the moment.

haris-ahmed commented 4 years ago

> Hi,
>
> Cuda 11 with Cudnn 8 BROKEN. Cuda 10.2 with Cudnn 8 BROKEN. Cuda 10.2 with Cudnn 7.6.5 WORKS PERFECT.

I also tried the above combinations and got the same results; the last combination works perfectly.

davisking commented 4 years ago

@facug91 I wouldn't expect the recent change to make running on video any different, since each video frame is the same size and so the cuDNN algorithm selection code wouldn't run then either. Or are your video frames of non-uniform size? Are you timing the construction of the network object too (or calling clear())? That would explain it. Although that wouldn't explain why it's suddenly faster with the new changes.

Try looking at these fields

            int forward_algo;
            int backward_data_algo;
            int backward_filters_algo;

that are impacted by that recent change. What's happening is we ask cuDNN what algorithm for running convolutions is fastest given the current hardware and settings, and we use that one. cuDNN 8.0 has updated how they make that selection. Maybe it sucks for your case and we should select the algorithm another way. For instance, you can try hard coding the selections cuDNN 7 would have used and run them with v8 and see if it's better.
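As an illustration of the "hard code the selections cuDNN 7 would have used" idea, here is a minimal C++ sketch. The enum values mirror cuDNN's `cudnnConvolutionFwdAlgo_t` names and numeric values, but they are defined locally so the sketch stays self-contained; real code would use the types from `<cudnn.h>` and pass the choice straight to the convolution call instead of querying the v8 heuristics:

```cpp
#include <cassert>

// Local stand-ins for cuDNN's cudnnConvolutionFwdAlgo_t values, so this
// sketch compiles without <cudnn.h>. The numeric values match cuDNN's.
enum conv_fwd_algo
{
    CONV_FWD_ALGO_IMPLICIT_GEMM         = 0,
    CONV_FWD_ALGO_IMPLICIT_PRECOMP_GEMM = 1,
    CONV_FWD_ALGO_WINOGRAD              = 6
};

// Skip the (potentially slow) cuDNN 8 selection routine entirely and
// return a fixed algorithm. IMPLICIT_PRECOMP_GEMM is a common choice for
// forward convolutions, but which algorithm is actually best depends on
// your layer shapes and hardware.
conv_fwd_algo forced_forward_algorithm()
{
    return CONV_FWD_ALGO_IMPLICIT_PRECOMP_GEMM;
}
```

If the fixed choice matches what cuDNN 7's heuristic used to pick for your layers, this removes the selection overhead at the cost of possibly missing a faster algorithm on some shapes.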

sergiogut1805 commented 4 years ago

Hi Davis,

With the latest version on GitHub the inference speed is very close to what it was before, and that's OK, good work!

But there is still an issue: the first inference is broken (absolutely slow). For example, using the "dnn_semantic_segmentation_ex" code, the times are these:

Found 57 images, processing...
inference time: 0.64 Min
1(1).jpg : aeroplane - hit enter to process the next image
inference time: 0.00441667 Min
1(2).jpg : aeroplane - hit enter to process the next image
inference time: 0.0039 Min
1.jpg : aeroplane - hit enter to process the next image
inference time: 0.0052 Min
10(1).jpg : aeroplane - hit enter to process the next image
inference time: 0.00416667 Min
10(2).jpg : aeroplane - hit enter to process the next image
inference time: 0.00391667 Min
10.jpg : aeroplane - hit enter to process the next image
inference time: 0.00521667 Min
100(1).jpg : aeroplane - hit enter to process the next image
inference time: 0.00391667 Min
100(2).jpg : aeroplane - hit enter to process the next image
inference time: 0.0039 Min
100.jpg : aeroplane - hit enter to process the next image
inference time: 0.00495 Min
101(1).jpg : aeroplane - hit enter to process the next image
inference time: 0.00441667 Min
101(2).jpg : aeroplane - hit enter to process the next image
inference time: 0.00391667 Min
101.jpg : aeroplane - hit enter to process the next image
inference time: 0.00521667 Min
102(1).jpg : aeroplane - hit enter to process the next image
inference time: 0.0039 Min
102(2).jpg : aeroplane - hit enter to process the next image

Processing the first image is very, very slow! And in a services environment the first image is often the only one, so this is still a big problem.

The processing times for the other images (after the first) are fast and OK.

With cuDNN 7.6.5, the first image is processed just as fast as the others...

can you fix this?

Thank you!

davisking commented 4 years ago

@sergiogut1805 Try it now. I just pushed a change that more aggressively caches calls to the cuDNN v8 "which algorithm should we use" methods. So dlib will now call them much less often, and should only make calls when it's really worthwhile to do so.
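The caching idea can be sketched in plain C++ (hypothetical names; this is not dlib's actual internal code): memoize the expensive "which algorithm" query on the tensor shape, so repeated calls with the same dimensions skip the slow cuDNN lookup and only the first inference per shape pays the selection cost.

```cpp
#include <cassert>
#include <map>
#include <tuple>

// Hypothetical stand-in for the expensive cuDNN v8 heuristic/benchmark
// query. Here we just count invocations to show the caching effect.
static int expensive_queries = 0;

int query_best_algorithm(int n, int c, int h, int w)
{
    ++expensive_queries;  // models the slow cuDNN selection call
    return (h * w) % 3;   // dummy "algorithm id"
}

// Memoized wrapper: the query runs only once per distinct tensor shape.
int cached_best_algorithm(int n, int c, int h, int w)
{
    using key_t = std::tuple<int, int, int, int>;
    static std::map<key_t, int> cache;
    const key_t key{n, c, h, w};
    auto it = cache.find(key);
    if (it == cache.end())
        it = cache.emplace(key, query_best_algorithm(n, c, h, w)).first;
    return it->second;
}
```

With this pattern, video frames of a uniform size trigger exactly one slow query, while streams of varying sizes still pay once per distinct shape.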

facug91 commented 4 years ago

Hi Davis, I process the video in a way that almost every frame differs in size from the previous one; that's why it was affecting me. I've been running various tests today, and that cache you pushed did it. Now it is running faster with CUDA 11 and cuDNN 8. The video, for example, is running below 10 ms. Thank you!

sergiogut1805 commented 4 years ago

> @sergiogut1805 Try it now. I just pushed a change that more aggressively caches calls to the cuDNN v8 "which algorithm should we use" methods. So dlib will now call them much less often, and should only make calls when it's really worth while to do so now.

Works perfect!

Thanks Davis.

dlib-issue-bot commented 4 years ago

Warning: this issue has been inactive for 35 days and will be automatically closed on 2020-09-25 if there is no further activity.

If you are waiting for a response but haven't received one it's possible your question is somehow inappropriate. E.g. it is off topic, you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's official compilation instructions, dlib's API documentation, or a Google search.

dlib-issue-bot commented 4 years ago

Warning: this issue has been inactive for 42 days and will be automatically closed on 2020-09-25 if there is no further activity.

If you are waiting for a response but haven't received one it's possible your question is somehow inappropriate. E.g. it is off topic, you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's official compilation instructions, dlib's API documentation, or a Google search.

dlib-issue-bot commented 4 years ago

Notice: this issue has been closed because it has been inactive for 45 days. You may reopen this issue if it has been closed in error.

linus-jansson commented 3 years ago

> I pushed a change that should fix this. Try pulling from GitHub again and giving it a try.

Maybe a dumb question, but is this still a fix that works in dlib 19.22? Also getting


-- Building a CUDA test project to see if your compiler is compatible with CUDA...
-- *****************************************************************************************************************
-- *** CUDA was found but your compiler failed to compile a simple CUDA program so dlib isn't going to use CUDA.
-- *** The output of the failed CUDA test compile is shown below:
-- ***   Change Dir: C:/Users/limpan/Downloads/dlib-19.19/build/temp.win-amd64-3.9/Release/dlib_build/cuda_test_build
   ***
   ***   Run Build Command(s):C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/MSBuild/Current/Bin/MSBuild.exe ALL_BUILD.vcxproj /p:Configuration=Debug /p:Platform=x64 /p:VisualStudioVersion=16.0 /v:m && Microsoft (R) Build Engine version 16.11.1+3e40a09f8 for .NET Framework
   ***   Copyright (C) Microsoft Corporation. All rights reserved.
   ***
   ***     Checking Build System
   ***     Building NVCC (Device) object CMakeFiles/cuda_test.dir/Debug/cuda_test_generated_cuda_test.cu.obj
   ***     nvcc fatal   : Value 'sm_30' is not defined for option 'gpu-architecture'
   ***     CMake Error at cuda_test_generated_cuda_test.cu.obj.Debug.cmake:216 (message):
   ***       Error generating
   ***       C:/Users/limpan/Downloads/dlib-19.19/build/temp.win-amd64-3.9/Release/dlib_build/cuda_test_build/CMakeFiles/cuda_test.dir//Debug/cuda_test_generated_cuda_test.cu.obj
   ***
   ***
   ***   C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\Microsoft.CppCommon.targets(241,5): error MSB8066: Custom build for 'C:\Users\limpan\Downloads\dlib-19.19\dlib\cmake_utils\test_for_cuda\cuda_test.cu;C:\Users\limpan\Downloads\dlib-19.19\dlib\cmake_utils\test_for_cuda\CMakeLists.txt' exited with code 1. [C:\Users\limpan\Downloads\dlib-19.19\build\temp.win-amd64-3.9\Release\dlib_build\cuda_test_build\cuda_test.vcxproj]
   ***
   ***
-- *****************************************************************************************************************
-- Disabling CUDA support for dlib.  DLIB WILL NOT USE CUDA

No clue what it means or how to fix it.

setup:

Trying to build for Python 3.9.7 with py setup.py install

linus-jansson commented 3 years ago

No idea how I fixed it, but I went to grab something to eat, came back, recompiled, and everything compiled as it should.

(screenshot of the successful build output)

Thanks :)

alejandrosatis commented 2 years ago

Working in Ubuntu 18 with CUDA 11.1 and cudnn 8.2.1

facug91 commented 2 years ago

I have a new problem with the latest versions of cuDNN, and I think it may be related to this. Even though, for a given run, the algorithm no longer changes during execution, it does vary from run to run, and it varies so much that execution time can differ by up to 35% from one run to another using MMOD (which causes my CI pipelines to fail, since they check that times don't differ too much from previous runs). Currently I'm using the Docker image nvidia/cuda:11.4.2-cudnn8-devel-ubuntu20.04 as a base, and it seems to be getting worse with every new version. I know it might not be a dlib problem, but maybe we could do something to stabilize it, or manually set which algorithm we want to use.

davisking commented 2 years ago

Try calling set_dnn_prefer_smallest_algorithms()

facug91 commented 2 years ago

Well, that'd make it more stable, but always slower, which is not a great solution. I've been looking at how other libraries solve it. From what I understand, OpenCV and TensorFlow still use the old cuDNN 7 function, and PyTorch has a parameter called "exhaustive search", which defines whether it will use the new cuDNN 8 function or not.
Maybe we could do something similar to PyTorch here in dlib. I could make a PR that adds a macro, which could be configured via CMake.

davisking commented 2 years ago

@facug91 a macro isn't a super great option. I can especially see python users complaining since macros are essentially unavailable to them (without a rebuild). If you want to make a PR that adds something like set_dnn_prefer_smallest_algorithms() that sets some specific setting that would be cool. Then someone can just call that new function at runtime to ask for it.
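A runtime setting along those lines could look like the following plain C++ sketch (hypothetical names, not dlib's actual API): a global toggle, consulted at runtime before deciding whether to run the exhaustive cuDNN 8 algorithm search, so Python users can flip it without rebuilding.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical runtime switch, in the spirit of PyTorch's cudnn benchmark
// flag: when true, the library would run the (slow, variable) cuDNN 8
// exhaustive search; when false, it would fall back to a fixed or
// heuristic algorithm choice. Defaults to true, i.e. current behaviour.
static std::atomic<bool> use_exhaustive_search{true};

void set_exhaustive_algorithm_search(bool enabled)
{
    use_exhaustive_search.store(enabled);
}

bool exhaustive_algorithm_search_enabled()
{
    return use_exhaustive_search.load();
}
```

Because it is an ordinary function rather than a macro, the setting can be exposed through the Python bindings and changed per process at startup.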