naisy / realtime_object_detection

Plug and Play Real-Time Object Detection App with Tensorflow and OpenCV. No Bugs No Worries. Enjoy!
MIT License

cudnn handle: CUDNN_STATUS_INTERNAL_ERROR #64

Closed araza91 closed 5 years ago

araza91 commented 5 years ago

Building Graph
Loading label map
2019-01-15 17:48:29.407913: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loading...
2019-01-15 17:48:29.485773: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-01-15 17:48:29.486196: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.771
pciBusID: 0000:01:00.0
totalMemory: 7.93GiB freeMemory: 7.41GiB
2019-01-15 17:48:29.486227: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2019-01-15 17:48:29.687857: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-15 17:48:29.687901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0
2019-01-15 17:48:29.687906: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0: N
2019-01-15 17:48:29.688098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7146 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-01-15 17:48:29.688565: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2019-01-15 17:48:29.688602: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-15 17:48:29.688629: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0
2019-01-15 17:48:29.688647: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0: N
2019-01-15 17:48:29.689036: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7146 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-01-15 17:48:30.293604: E tensorflow/stream_executor/cuda/cuda_dnn.cc:352] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Segmentation fault (core dumped)

TF version: 1.10.0
CUDA version: 9.0
I'm trying to run this on a PC with a GTX 1080.

@naisy

naisy commented 5 years ago

Hi @araza91,

If you encounter this error when trying to run Mask R-CNN, it is due to insufficient memory. In that case, please change config.yml as follows.

model_type: 'mask_v1'
model_path: 'models/mask_rcnn_inception_v2_coco_2018_01_28/frozen_inference_graph.pb'
worker_threads: 1

If there are other causes, please tell me what you tried to do.

araza91 commented 5 years ago

@naisy thanks for looking.

I was trying SSD. I got it to work by removing ~/.nv. SSD ran at ~45 FPS. I also tested faster rcnn resnet 50 and am getting ~10 FPS. Does that sound right to you?

naisy commented 5 years ago

Hi @araza91,

I looked at the ~/.nv directory. It seems to be related to the NVIDIA graphics driver, so please consider updating the driver. Running ssd_mobilenet_v1 on a GTX 1080, I would expect at least 150 FPS.
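For anyone repeating the ~/.nv workaround mentioned above, here is a minimal sketch that renames the directory to a timestamped backup instead of deleting it, so it can be restored if it turns out not to be the cause. The function name `move_aside_nv_cache` and the `.nv.bak-` backup naming are hypothetical and not part of this repository. The sketch assumes ~/.nv only holds driver-side caches that are recreated on the next run.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def move_aside_nv_cache(home: Optional[Path] = None) -> Optional[Path]:
    """Rename <home>/.nv to a timestamped backup.

    Returns the backup path, or None if <home>/.nv does not exist.
    Assumption: ~/.nv contains only caches that are regenerated on demand.
    """
    home = Path.home() if home is None else home
    cache = home / ".nv"
    if not cache.exists():
        return None

    # Pick a backup name that does not collide with an earlier backup.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = home / f".nv.bak-{stamp}"
    suffix = 1
    while backup.exists():
        backup = home / f".nv.bak-{stamp}-{suffix}"
        suffix += 1

    shutil.move(str(cache), str(backup))
    return backup

# Example (hypothetical usage): call move_aside_nv_cache() with no arguments
# to act on the current user's home directory, then rerun the app.
```

If the error comes back after the cache is recreated, the backup can simply be deleted, and the driver update suggested above is the next thing to try.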

araza91 commented 5 years ago

@naisy

I installed the latest CUDA 9.0. I can look into it further to debug. Have you ever profiled faster rcnn on a GTX 1080, or do you have any idea what kind of FPS to expect? Thanks.

naisy commented 5 years ago

Hi @araza91,

faster_rcnn_resnet50_coco_2018_01_28 runs at around 9.1 FPS on a GTX 1060, so I think 10 FPS is reasonable on a GTX 1080.
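To compare FPS figures like the ones in this thread on equal terms, a rough sketch of one way to time an inference step is shown below. It is not how this app counts FPS (that is not shown in this thread). `measure_fps`, `frame_time_ms`, and the `run_once` callable are hypothetical names; `run_once` stands in for whatever performs a single detection pass, for example one session run on one frame.

```python
import time
from typing import Callable


def measure_fps(run_once: Callable[[], object],
                warmup: int = 10,
                iterations: int = 100) -> float:
    """Return average calls per second of run_once over `iterations` calls."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")

    # Warm-up calls are excluded from timing: the first few GPU runs
    # usually include one-time initialization costs.
    for _ in range(warmup):
        run_once()

    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def frame_time_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps
```

In these terms, 10 FPS corresponds to 100 ms per frame, 45 FPS to about 22 ms, and 150 FPS to about 6.7 ms. Whether capture and visualization are included in the timed call changes the result, so numbers are only comparable when they are measured the same way.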