naisy / realtime_object_detection

Plug and Play Real-Time Object Detection App with Tensorflow and OpenCV. No Bugs No Worries. Enjoy!
MIT License

CuDNN Version Mismatch #54

Closed: anandcu3 closed this issue 5 years ago

anandcu3 commented 5 years ago

Hi Naisy,

I am trying to run your code on a Jetson TX2 flashed with JetPack 3.2, with CUDA 9.0, Python 3.5, OpenCV 3.4.1, and TensorFlow 1.6.0. I am running run_video.py with the default config file. The only things I changed are:

force_gpu_compatible set to True (because it is a TX2) and save_to_file set to True

[DEBUG] time:1537443326.53708196 pid:4708 pn:Process-2  tid:548145340416 tn:MainThread fn:process_fps_console enter
Building Graph
Loading label map
Loading...
2018-09-20 17:05:50.486647: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:865] ARM64 does not support NUMA - returning NUMA node zero
2018-09-20 17:05:50.486817: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties: 
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.67GiB freeMemory: 5.12GiB
2018-09-20 17:05:50.486881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-09-20 17:05:55.583450: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-20 17:05:55.583531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-09-20 17:05:55.583560: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-09-20 17:05:55.583737: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4128 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2018-09-20 17:05:55.585498: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-09-20 17:05:55.585661: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-20 17:05:55.585736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-09-20 17:05:55.585770: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-09-20 17:05:55.585891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4128 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2018-09-20 17:06:32.697848: E tensorflow/stream_executor/cuda/cuda_dnn.cc:396] Loaded runtime CuDNN library: 7105 (compatibility version 7100) but source was compiled with 7005 (compatibility version 7000).  If using a binary install, upgrade your CuDNN library to match.  If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2018-09-20 17:06:32.699230: F tensorflow/core/kernels/conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms) 
Aborted (core dumped)

Can you help me debug this issue? Which binary is the message referring to? Should I reflash the Jetson with an older version of CUDA, or try to recompile the binary in question against the cuDNN version on the device?

Thanks

naisy commented 5 years ago

Hi @anandcu3,

This seems to occur because the cuDNN version that TensorFlow was compiled against differs from the cuDNN version in the execution environment. You need a TensorFlow build compiled against the same cuDNN version that is installed in the execution environment.
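
To make the mismatch in the log concrete, here is a minimal Python sketch that pulls the two version integers out of the error line and decodes them. It assumes the cuDNN 7.x integer encoding (major * 1000 + minor * 100 + patch), and it infers the "compatibility version" rounding from the numbers printed in the log (7105 -> 7100, 7005 -> 7000) rather than from the TensorFlow source. The function names are hypothetical helpers, not part of TensorFlow or this repository.

```python
import re

# Error line copied (shortened) from the log in this issue.
LOG_LINE = (
    "Loaded runtime CuDNN library: 7105 (compatibility version 7100) "
    "but source was compiled with 7005 (compatibility version 7000)."
)


def decode_cudnn_version(version):
    """Split a cuDNN 7.x version integer into (major, minor, patch).

    Assumes the encoding major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(version, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def compat_version(version):
    """Drop the patch level, matching the rounding shown in the log."""
    return (version // 100) * 100


def parse_mismatch(line):
    """Return (loaded, compiled) version integers, or None if no match."""
    m = re.search(
        r"Loaded runtime CuDNN library: (\d+).*?compiled with (\d+)", line
    )
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


loaded, compiled = parse_mismatch(LOG_LINE)
print("runtime cuDNN :", decode_cudnn_version(loaded))
print("compiled with :", decode_cudnn_version(compiled))
print("same compatibility version:",
      compat_version(loaded) == compat_version(compiled))
```

Read this way, the runtime library on the device is cuDNN 7.1.5, while the TensorFlow binary was built against 7.0.5. The two differ in minor version, which is why the compatibility versions in the log do not match.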

anandcu3 commented 5 years ago

@naisy Yes, thank you for your help. I recompiled TensorFlow and that fixed the issue. It is running at around 23 fps on the TX2 with Visualize set to "on", so everything seems to be OK.
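
For anyone hitting the same error: before rebuilding, it can help to confirm which cuDNN version is installed on the device. The sketch below parses the `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` defines from the text of a cuDNN 7.x header. The sample header text is illustrative (its values are chosen to match the runtime version in the log above), the function name is a hypothetical helper, and the location of `cudnn.h` on a given JetPack install is an assumption you would need to check yourself.

```python
import re

# Illustrative excerpt of a cuDNN 7.x header; on a real device you would
# read the installed cudnn.h instead (its path varies by install).
SAMPLE_HEADER = """
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 5
"""


def cudnn_version_from_header(text):
    """Return the version integer (major*1000 + minor*100 + patch), or None."""
    fields = {}
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        m = re.search(
            r"^\s*#define\s+CUDNN_%s\s+(\d+)" % name, text, re.MULTILINE
        )
        if m is None:
            return None  # header text does not contain the expected define
        fields[name] = int(m.group(1))
    return (fields["MAJOR"] * 1000
            + fields["MINOR"] * 100
            + fields["PATCHLEVEL"])


print(cudnn_version_from_header(SAMPLE_HEADER))
```

If the integer this produces differs from the "compiled with" number in the TensorFlow error message, the TensorFlow binary needs to be rebuilt against the installed cuDNN, as was done here.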