CMU-Perceptual-Computing-Lab / openpose

OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
https://cmu-perceptual-computing-lab.github.io/openpose

cuDNN still has a compatibility problem with NVIDIA GeForce RTX 3080 #1757

Closed yingxuanshi closed 3 years ago

yingxuanshi commented 3 years ago

Issue Summary

Hi, we got an "out of memory" error. We found that this is usually related to cuDNN not being used. Could you please double-check that the VS project was built with "enable cuDNN" in CMake? We have reinstalled cuDNN, but nothing changed. We then ran "Dependencies"/"DependencyWalker" on "caffe.dll" in the bin folder and found that, in the OpenPose installation on our older computer (RTX 2080), it depends on cudnn64_7.dll. However, in this newest OpenPose 1.7.0.1 installation, caffe.dll does not depend on any "cudnn*.dll", although there is indeed a "cudnn64_8.dll" in OpenPose's bin folder.
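For anyone without a dependency viewer at hand, the same check can be roughly approximated in Python. This is only a sketch under stated assumptions: it searches the raw bytes of a binary for ASCII names that look like cuDNN DLLs and does not parse the PE import table, so a hit or a miss is a hint rather than proof. The function names `find_cudnn_references` and `scan_file` are made up for this illustration.

```python
import re
from pathlib import Path
from typing import List

# Matches ASCII names such as "cudnn64_7.dll" or "cudnn_ops_infer64_8.dll".
CUDNN_DLL_PATTERN = re.compile(rb"cudnn[0-9A-Za-z_]*\.dll", re.IGNORECASE)


def find_cudnn_references(data: bytes) -> List[str]:
    """Return sorted, de-duplicated cuDNN DLL names found in raw bytes."""
    names = {
        match.group(0).decode("ascii").lower()
        for match in CUDNN_DLL_PATTERN.finditer(data)
    }
    return sorted(names)


def scan_file(path: str) -> List[str]:
    """Read a binary (for example caffe.dll) and list cuDNN DLL names in it."""
    return find_cudnn_references(Path(path).read_bytes())
```

Calling `scan_file` on the `caffe.dll` from each installation and comparing the two results should mirror what the dependency viewer showed: a `cudnn64_7.dll` entry for the older build and an empty list for the build that does not link cuDNN.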

Thank you so much!

OpenPose Output (if any)


It only ran 5 images and then stopped.


Your System Configuration

  1. Whole console output (if errors appeared), paste the error to PasteBin and then paste the link here: LINK "OpenPose only processed 5 pictures in this media folder and then stopped with the error below."

D:\study\openpose\openpose-1.7.0>build\x64\Release\OpenPoseDemo.exe -image_dir examples\media
Starting OpenPose demo...
Configuring OpenPose...
Starting thread(s)...
Auto-detecting all available GPUs...
Detected 1 GPU(s), using 1 of them starting at GPU 0.
F1118 07:47:46.885848 3300 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0) out of memory
Check failure stack trace:

  2. OpenPose version: Latest GitHub code? Or specific commit (e.g., d52878f)? Or specific version from Release section (e.g., 1.2.0)?

We downloaded the latest version, "OpenPose v1.7.0" (8ca5c1d).

  3. General configuration:

    • Installation mode: CMake, sh script, manual Makefile installation, ... (Ubuntu); CMake, ... (Windows); ...? I downloaded CMake on Windows 10, and it compiled with no errors.
    • Operating system: Windows 10
    • Operating system version: Windows 10
    • Release or Debug mode? Release mode (x64)
    • Compiler (gcc --version in Ubuntu or VS version in Windows): Windows 10; VS2019 Community
  4. If GPU mode issue:

    • CUDA version (cat /usr/local/cuda/version.txt in most cases): CUDA 11.1.1
    • cuDNN version: cuDNN v8.0.5 (November 9th, 2020) for CUDA 11.1
    • GPU model (nvidia-smi in Ubuntu):

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 456.81       Driver Version: 457.09       CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3080   WDDM  | 00000000:01:00.0  On |                  N/A |
|  0%   51C    P8    34W / 340W |    654MiB / 10240MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

gineshidalgo99 commented 3 years ago

Where is your cuDNN in your OS? This sounds like VS is picking the CUDA from your computer rather than the one in the OpenPose DLLs, thus you'll have to make sure you have cuDNN in that CUDA installation?

linjia03 commented 3 years ago

I have the same problem.

F1129 21:21:10.444409 9840 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0) out of memory
Check failure stack trace:

My program works fine with OpenPose 1.6 on an RTX 2080 Ti, but it stopped working when I switched to an RTX 3070. After updating to the latest versions of the display driver, CUDA, cuDNN, and OpenPose, this error appears.
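The fatal lines quoted throughout this thread follow the same glog "Check failed" layout, so they can be pulled apart mechanically when comparing reports. The snippet below is a small sketch, assuming the layout seen in these logs (severity letter, month and day, time, thread id, file:line, then the failed expression with its two values); `parse_check_failure` is a hypothetical helper, not part of OpenPose or Caffe.

```python
import re

# Layout assumed from the log lines in this thread, e.g.
# "F1118 07:47:46.885848 3300 syncedmem.cpp:71] Check failed: ... (2 vs. 0) out of memory"
GLOG_CHECK_FAILED = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>\d+)\s+"
    r"(?P<file>[^:\s]+):(?P<line>\d+)\]\s+"
    r"Check failed: (?P<expr>.+?) \((?P<actual>-?\d+) vs\. (?P<expected>-?\d+)\)"
    r"\s*(?P<message>.*)$"
)


def parse_check_failure(line):
    """Return the fields of a glog 'Check failed' line as a dict, or None."""
    match = GLOG_CHECK_FAILED.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields so they can be compared directly.
    for key in ("thread", "line", "actual", "expected"):
        fields[key] = int(fields[key])
    return fields
```

Grouping reports by the `file`, `line`, and `actual` fields makes it easier to see which posts hit the allocation failure in syncedmem.cpp and which hit a cuDNN status error instead.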

hakanjansson commented 3 years ago

FWIW, I also noticed cuDNN does not seem to be used when I'm running the latest portable Windows version (1.7.0) from here: https://github.com/CMU-Perceptual-Computing-Lab/openpose/releases/download/v1.7.0/openpose-1.7.0-binaries-win64-gpu-python-flir-3d_recommended.zip

It does seem to be used when using the 1.6.0 version from here, though: https://github.com/CMU-Perceptual-Computing-Lab/openpose/releases/download/v1.6.0/openpose-1.6.0-binaries-win64-gpu-flir-3d_recommended.zip

I checked the OpenPoseDemo.exe's dependencies and also used Process Explorer to see what dlls were used:

Version 1.7.0 used dlls included:
openpose\bin\cublas64_11.dll
openpose\bin\cublasLt64_11.dll
openpose\bin\cudart64_110.dll
openpose\bin\curand64_10.dll

Version 1.6.0 used dlls included:
openpose\bin\cublas64_100.dll
openpose\bin\cudart64_100.dll
openpose\bin\cudnn64_7.dll
openpose\bin\curand64_100.dll

Let me know if there's anything I can do to help!
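The two DLL lists above can be compared by library family rather than by exact file name, which makes the missing cuDNN entry stand out regardless of version suffixes. This is an illustrative sketch only: the `families` helper and the naming pattern (letters, then `64_`, then a version number) are assumptions derived from the file names listed in this comment.

```python
import re
from typing import Iterable, Set

# Assumed naming scheme, taken from the lists above: <family>64_<version>.dll
DLL_NAME = re.compile(r"^([A-Za-z]+)64_\d+\.dll$")


def families(paths: Iterable[str]) -> Set[str]:
    """Reduce DLL paths to library families such as 'cudnn' or 'cublas'."""
    result = set()
    for path in paths:
        name = path.replace("\\", "/").rsplit("/", 1)[-1]
        match = DLL_NAME.match(name)
        if match:
            result.add(match.group(1))
    return result


V170 = [
    r"openpose\bin\cublas64_11.dll",
    r"openpose\bin\cublasLt64_11.dll",
    r"openpose\bin\cudart64_110.dll",
    r"openpose\bin\curand64_10.dll",
]
V160 = [
    r"openpose\bin\cublas64_100.dll",
    r"openpose\bin\cudart64_100.dll",
    r"openpose\bin\cudnn64_7.dll",
    r"openpose\bin\curand64_100.dll",
]

dropped = sorted(families(V160) - families(V170))
added = sorted(families(V170) - families(V160))
print("only loaded by 1.6.0:", dropped)
print("only loaded by 1.7.0:", added)
```

With the lists as reported, the only family loaded by 1.6.0 but not by 1.7.0 is `cudnn`, which matches the observation that the 1.7.0 portable build does not appear to use cuDNN.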

drostifrosti commented 3 years ago

... and if one wants to build Caffe on Windows by ticking BUILD_CAFFE, it has no effect: https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/a255747af22116ad76004437456bb531dc5d0b23/CMakeLists.txt#L581-L585

liuwei0812 commented 3 years ago

I have the same problem. How can I use cuDNN with version 1.7.0?

JmyL commented 3 years ago

Me too. An "out of memory" error pops up. OpenPose consumes more than 7 GB of GPU memory before it crashes. This suggests that cuDNN isn't activated, even though cuDNN is installed... I don't know why. Should I buy an expensive RTX 3090 for more memory?

openpose 1.7.0, rtx 3070, ubuntu 20.04, with nvidia-driver-460 driver, cuda 11.1.1, cudnn 8.0.5. I set CUDA_ARCH_BIN and CUDA_ARCH_RTX both to '86' and set CUDA_ARCH to 'Manual' by CMake configuration.

JmyL commented 3 years ago

It's probably related to Caffe, not to OpenPose itself, isn't it? I think I should test nvcaffe or other custom Caffe builds. If it works, I'll let you all know.

benrubin commented 3 years ago

I am having the same problem when running compiled examples: syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0) out of memory

I assume that the error is because cuDNN is not being activated correctly, and I have been trying to resolve this for several weeks now without success. I would greatly appreciate any advice on how to resolve this.

My setup:

I'm compiling OpenPose from source so that I can use the Python API. The build completes successfully, but I get some warnings during the process. Some of the warnings look as though they might explain the problem with cuDNN, but I'm out of my depth here and I don't know how to resolve the underlying issues. Here are the warnings I get from make -j`nproc` that look most pertinent:

I get a bunch of these while generating:

CMake Warning at tools/CMakeLists.txt:14 (add_executable):
  Cannot generate a safe runtime search path for target extract_features
  because files in some directories may conflict with libraries in implicit
  directories:

    runtime library [libcudnn.so.8] in /usr/lib/x86_64-linux-gnu may be hidden by files in:
      /usr/local/cuda-11.1/lib64

  Some of these libraries may not be found correctly.

After generating is done, I get this:

CMake Warning:
  Manually-specified variables were not used by the project:

    CUDA_ARCH_BIN

and then while building and linking, I get lots of these:

/home/rubinb/openpose/3rdparty/caffe/include/caffe/util/cudnn.hpp:21:10: warning: enumeration value ‘CUDNN_STATUS_VERSION_MISMATCH’ not handled in switch [-Wswitch]

I'll be grateful for any assistance with this -- thanks!

benrubin commented 3 years ago

openpose 1.7.0, rtx 3070, ubuntu 20.04, with nvidia-driver-460 driver, cuda 11.1.1, cudnn 8.0.5. I set CUDA_ARCH_BIN and CUDA_ARCH_RTX both to '86' and set CUDA_ARCH to 'Manual' by CMake configuration.

Hello JmyL -- it sounds like we have very similar configurations and we are experiencing similar issues. I also suspect that the issue is related to Caffe -- perhaps that somehow Caffe is not getting compiled correctly with cuDNN. I will be very interested to hear what happens with your Caffe tests!

JmyL commented 3 years ago

Hello benrubin, I agree with you. Before testing nvcaffe, I tried several other things first. I installed CUDA 11.0 and soon realized that CUDA 11.1 is the oldest CUDA version that can handle the RTX 3000 series...

I'll test nvcaffe following this article - http://peter-uhrig.de/openpose-with-nvcaffe-in-a-singularity-container-with-support-for-multiple-architectures/

I have some questions for you, to see how similar our setups are.

  1. I use my RTX 3070 as an eGPU, through Thunderbolt. Do you?
  2. Have you tested the cuDNN sample? I've just tested it and it runs without problems. You can find it in the /usr/src/cudnn_samples_v8/ folder, if you installed the libcudnn-samples... package.
benrubin commented 3 years ago

Hi JmyL,

I use my rtx3070 as a egpu, through thunderbolt. Do you?

No -- I actually have an older 8GB GPU: GTX 1080, and mine is internal (PCIe bus in a desktop PC). It appears to be still supported for CUDA 11.1, but I wonder if I'm running into complications because it is an older architecture (Pascal).

Have you tested cudnn sample?

No, and that's a great idea. I will do that and let you know what happens. I will be away from my studio for a few days, so I will probably do this early next week.

JmyL commented 3 years ago

Yep, thank you benrubin. Here is the link you can follow when you test the cuDNN sample: https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#verify
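Besides NVIDIA's sample, it can help to ask the dynamic loader which libcudnn it actually resolves, since the CMake warning earlier in the thread mentions copies in two directories. The following is a minimal sketch, assuming a Linux system where `ctypes.util.find_library("cudnn")` can locate the library; `cudnnGetVersion` is the cuDNN C function that returns the version as a single integer, and the decoding below assumes the cuDNN 8.x scheme (major*1000 + minor*100 + patch). Both Python function names are hypothetical.

```python
import ctypes
import ctypes.util
from typing import Optional, Tuple


def decode_cudnn8_version(value: int) -> Tuple[int, int, int]:
    """Split a cuDNN 8.x version integer into (major, minor, patch)."""
    major, rest = divmod(value, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def loaded_cudnn_version() -> Optional[int]:
    """Return cudnnGetVersion() from the libcudnn the loader finds, or None."""
    name = ctypes.util.find_library("cudnn")
    if name is None:
        return None  # no libcudnn visible to the loader
    try:
        lib = ctypes.CDLL(name)
        func = lib.cudnnGetVersion
    except (OSError, AttributeError):
        return None  # library could not be loaded or lacks the symbol
    func.restype = ctypes.c_size_t
    return int(func())


version = loaded_cudnn_version()
if version is None:
    print("libcudnn was not found by the dynamic loader")
else:
    print("cuDNN version:", decode_cudnn8_version(version))
```

For the cuDNN 8.0.5 install mentioned in this thread, the integer would be 8005, which decodes to (8, 0, 5). A different result would point to another copy of the library being picked up first.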

moncio commented 3 years ago

I am having the same problem when trying to compile Caffe (running make runtest) prior to building OpenPose:

F0130 12:18:01.974212 19426 cudnn_deconv_layer.cu:21] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0)  CUDNN_STATUS_BAD_PARAM
*** Check failure stack trace: ***
    @     0x7f55a6f1b1c3  google::LogMessage::Fail()
    @     0x7f55a6f2025b  google::LogMessage::SendToLog()
    @     0x7f55a6f1aebf  google::LogMessage::Flush()
    @     0x7f55a6f1b6ef  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f55a72cf21a  caffe::CuDNNDeconvolutionLayer<>::Forward_gpu()
    @     0x5654dde1312c  caffe::Layer<>::Forward()
    @     0x5654ddf1e045  caffe::CuDNNDeconvolutionLayerTest_TestSimpleCuDNNDeconvolution_Test<>::TestBody()
    @     0x5654de36cfe1  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @     0x5654de364fed  testing::Test::Run()
    @     0x5654de365128  testing::TestInfo::Run()
    @     0x5654de365235  testing::TestCase::Run()
    @     0x5654de36578c  testing::internal::UnitTestImpl::RunAllTests()
    @     0x5654de365867  testing::UnitTest::Run()
    @     0x5654ddd9e947  main
    @     0x7f559e5d40b3  __libc_start_main
    @     0x5654ddda5e9e  _start

I get the following warning during compilation:

/env/libs/openpose/3rdparty/caffe/include/caffe/util/cudnn.hpp: In function 'const char* cudnnGetErrorString(cudnnStatus_t)':
/env/libs/openpose/3rdparty/caffe/include/caffe/util/cudnn.hpp:21:10: warning: enumeration value 'CUDNN_STATUS_VERSION_MISMATCH' not handled in switch [-Wswitch]
   21 |   switch (status) {
      |          ^

My setup is for a Docker image:

Host Nvidia graphics card: GeForce RTX 2080 Ti
Host Nvidia driver version: 450.102.04

Then for the Docker build:
OS: Ubuntu 20.04
CUDA: 11.0
cuDNN: 8.0.5
OpenPose: latest - 1.7.0

Can someone help me or tell me whether this environment is feasible?

Thanks in advance

gineshidalgo99 commented 3 years ago

cuda/cudnn issues are usually solved with reinstalling the Nvidia drivers, then CUDA, then cuDNN, then reinstalling/recompiling OpenPose.

moncio commented 3 years ago

Yes, @gineshidalgo99, but what about this case, where everything runs through a Docker image? My base image is taken from here: https://hub.docker.com/r/nvidia/cuda/tags?page=1&ordering=last_updated (nvidia/cuda:11.0.3-cudnn8-devel-ubuntu20.04). So, how can I "reinstall" or "recompile" in this particular case? Is there any example of compiling OpenPose for a Docker image?

benrubin commented 3 years ago

cuda/cudnn issues are usually solved with reinstalling the Nvidia drivers, then CUDA, then cuDNN, then reinstalling/recompiling OpenPose.

Thanks, @gineshidalgo99 -- I tried that sequence a few times, but some Nvidia version conflict had embedded itself deeply somewhere, so I finally reinstalled Ubuntu 20.04 and installed everything fresh, and OpenPose is finally working -- whew!

Have you had any luck, @JmyL?

gineshidalgo99 commented 3 years ago

For Windows, this is my Caffe repo to compile for Windows: https://github.com/gineshidalgo99/caffeCompilerForWindowsAndCUDA It is based on the Windows Caffe one.

I have not been able to get it working with cuDNN on Windows; it keeps giving me this error:

[Many other logs]
'OpenPoseDemo.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\cudnn_ops_infer64_8.dll'. Module was built without symbols.
'OpenPoseDemo.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\cudnn_cnn_infer64_8.dll'. Module was built without symbols.
F0207 11:36:55.959534  5612 cudnn_conv_layer.cpp:53] Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0)  CUDNN_STATUS_NOT_INITIALIZED
Unhandled exception at 0x00007FFFA5DD286E (ucrtbase.dll) in OpenPoseDemo.exe: Fatal program exit requested.

The program '[7240] OpenPoseDemo.exe' has exited with code 0 (0x0).

If anybody is able to get it to work without giving the CUDNN_STATUS_NOT_INITIALIZED error, I'd very highly appreciate some hints of the exact CUDA/cuDNN version and/or instructions to get it to work! :)

Please, continue this discussion in #1845, to centralize messages and hopefully focus efforts to fix the issue. Thanks!

PS: For Ubuntu users with memory issues, v1.7.0 was modified to allow cuDNN 8, which was a pain. I am not an expert on this, so I am sure there must be a better way to run the cuDNN convolutions using less memory. I am very open to suggestions about the cudnn_conv implementation to minimize memory: https://github.com/CMU-Perceptual-Computing-Lab/caffe/blob/master/src/caffe/layers/cudnn_conv_layer.cpp

Please, continue this discussion in #1864, to centralize messages and hopefully focus efforts to fix the issue. Thanks!