gustavz / realtime_object_detection

Plug and Play Real-Time Object Detection App with Tensorflow and OpenCV
MIT License
281 stars, 148 forks

Hi Gustav I have questions! #35

Closed: sankim90 closed this issue 5 years ago

sankim90 commented 6 years ago

First, I was amazed at your work. It fits perfectly in my work.

I am working on the Jetson TX2 and Drive PX2, and as you know, there is a speed issue.

I found information in various works and GitHub threads:

  1. https://github.com/tensorflow/models/issues/3270
  2. https://devtalk.nvidia.com/default/topic/1028234/jetson-tx2/low-gpu-usage-with-tensorflow-inference-on-jetson-tx2/
  3. https://devtalk.nvidia.com/default/topic/1027819/jetson-tx2/object-detection-performance-jetson-tx2-slower-than-expected/

Q1. How do you achieve 30 fps with SSD MobileNet on the Jetson TX2? As mentioned in (1), are you manually assigning the CNN-related nodes to the GPU and the remaining nodes to the CPU in TensorFlow? If so, how?

Q2. Have you experimented with other frameworks? I have experimented with OpenCV DNN (SSD MobileNet), Caffe (SSD MobileNet), darknet (YOLO v2, v3) and TensorFlow (SSD MobileNet).

However, I only got up to 9 fps.

Do you think the above frameworks lack the ability to optimize GPU/CPU allocation?

Thank you

gustavz commented 5 years ago

Q1: The problem is that the TensorFlow NMS implementation does not run fast on the GPU. Therefore I go through all layers/nodes and place the ones connected to the NMS on the CPU, which runs them much faster.
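
The node-by-node split described above could be sketched roughly as follows. This is not the code from the repository: it uses plain Python dicts in place of TensorFlow `NodeDef` objects (mirroring their `name`, `op`, and `input` fields), and it assumes that "connected to the NMS" means the NMS nodes themselves plus everything downstream of them. The function name `assign_devices`, the predicate `is_nms`, and the toy graph are hypothetical.

```python
from collections import defaultdict, deque


def assign_devices(nodes, is_nms, cpu="/device:CPU:0", gpu="/device:GPU:0"):
    """Map each node name to a device string.

    nodes:  list of dicts with 'name', 'op' and 'input' keys
            (a stand-in for NodeDef entries of a frozen graph).
    is_nms: predicate marking the nodes that implement NMS (assumption).
    """
    # Build a producer -> consumers index from the input lists.
    consumers = defaultdict(list)
    for node in nodes:
        for inp in node["input"]:
            # Inputs may look like '^name' (control edge) or 'name:0'.
            producer = inp.lstrip("^").split(":")[0]
            consumers[producer].append(node["name"])

    # Seed with the NMS nodes, then walk downstream breadth-first.
    on_cpu = {n["name"] for n in nodes if is_nms(n)}
    queue = deque(on_cpu)
    while queue:
        current = queue.popleft()
        for consumer in consumers[current]:
            if consumer not in on_cpu:
                on_cpu.add(consumer)
                queue.append(consumer)

    # Everything not reached stays on the GPU.
    return {n["name"]: (cpu if n["name"] in on_cpu else gpu) for n in nodes}


# Hypothetical toy graph, only for illustration.
toy_graph = [
    {"name": "image", "op": "Placeholder", "input": []},
    {"name": "conv1", "op": "Conv2D", "input": ["image"]},
    {"name": "box_predictor", "op": "Conv2D", "input": ["conv1"]},
    {"name": "nms", "op": "NonMaxSuppressionV2", "input": ["box_predictor:0"]},
    {"name": "detections", "op": "Gather", "input": ["box_predictor", "nms"]},
]

placement = assign_devices(
    toy_graph, lambda n: n["op"].startswith("NonMaxSuppression")
)
for name, device in placement.items():
    print(name, "->", device)
```

With a real TensorFlow 1.x `GraphDef`, the same mapping could be written into each node's `device` field before the graph is imported; that step is left out here, and the right set of CPU nodes for a given model would have to be checked by profiling.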

Q2: No, only darknet, which also performs well. But it is still slower than my approach.