tensorflow / models

Models and examples built with TensorFlow

Low GPU and CPU Usage while Inference / realtime detection #3136

Open gustavz opened 6 years ago

gustavz commented 6 years ago

System information

Describe the problem

I am using SSD MobileNet for real-time inference with a webcam as input via OpenCV, and I get the following performance:

Laptop: ~25 fps at ~40% GPU and ~25% CPU usage

Jetson: ~5 fps at ~5-10% GPU and 10-40% CPU usage

Any hints on why the Object Detection API is so slow at inference? Training may be easy and fast, but inference, that is, actually using the models for real-time object detection, is very slow and does not fully use the GPU. (For comparison, YOLO with darknet runs at 90-100% GPU usage with 3x higher fps.)

Here is a screenshot of what nvidia-smi and top report while running inference on the laptop: [screenshot from 2018-01-10 15-40-12]
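One way to narrow down a report like this is to time each stage of the per-frame loop separately, since low GPU usage often means the GPU is waiting on something else. The following is only a minimal sketch under stated assumptions: the loop is assumed to be capture, then inference, then display, and `grab_frame`, `run_detector`, and `show` are hypothetical stand-ins (simulated with `time.sleep`) for whatever the real script does, for example reading from the webcam, running the frozen graph, and drawing the result. None of these names come from the original post.

```python
import time
from collections import defaultdict


def grab_frame():
    """Hypothetical stand-in for reading one frame from the webcam."""
    time.sleep(0.010)
    return "frame"


def run_detector(frame):
    """Hypothetical stand-in for one forward pass of the detector."""
    time.sleep(0.005)
    return []


def show(frame, detections):
    """Hypothetical stand-in for drawing boxes and displaying the frame."""
    time.sleep(0.010)


def profile_loop(num_frames=20):
    """Return the share of loop time spent in each stage (shares sum to 1)."""
    totals = defaultdict(float)
    for _ in range(num_frames):
        t0 = time.perf_counter()
        frame = grab_frame()
        t1 = time.perf_counter()
        detections = run_detector(frame)
        t2 = time.perf_counter()
        show(frame, detections)
        t3 = time.perf_counter()
        totals["capture"] += t1 - t0
        totals["inference"] += t2 - t1
        totals["display"] += t3 - t2
    wall = sum(totals.values())
    return {stage: spent / wall for stage, spent in totals.items()}


shares = profile_loop()
for stage, share in shares.items():
    print(f"{stage:10s} {share:6.1%} of loop time")
```

If the inference share turns out to be small in the real script, the detector is probably not the bottleneck, and the capture and display stages are the first places to look.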

cy89 commented 6 years ago

@jch1 @tombstone is the performance at expected levels?

gustavz commented 6 years ago

It would also be nice if someone could tell me how to properly call optimize_for_inference.py on the pre-trained ssd_mobilenet_v1_coco frozen model. I chose image_tensor as the input node and detection_boxes,detection_scores,num_detections,detection_classes as the output nodes. The script ran, but using the optimized graph failed. See this question for more details: https://stackoverflow.com/questions/48212068/error-using-model-after-using-optimize-for-inference-py-on-frozen-graph

This would certainly increase my inference performance :) !

ghost commented 6 years ago

I have a similar issue. I am trying to run a Mask RCNN model on an OpenCV webcam feed, but only 10% of the GPU is being utilized. Any tips on how to increase GPU utilization?

rocking5566 commented 6 years ago

I also have a similar issue. In TensorFlow 1.5, GPU utilization is very low and the model runs slower than on the CPU. [screenshot from 2018-03-22 10 46 38]

However, in TensorFlow 1.4 the GPU utilization is slightly higher than in 1.5, which makes the FPS the same as when running on the CPU. [screenshot from 2018-03-22 10 49 58]

This is my code: https://gist.github.com/rocking5566/a284bebf5f39640d6eae6f744f74c2d2

heethesh commented 6 years ago

Similar issue on GTX1050, GPU usage is around 10~15%.

When I run the SSD detector continuously in a loop (with no other processes or additional delays), GPU-Util is around 40-42% and FPS is around 20.

However, when I run the SSD detector with some delay between calls (around 100-200 ms; in my real use case multiple threads access the detect function, hence the small delay), GPU-Util drops to 15% and FPS is only around 10.

Please suggest how to increase GPU usage.
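The two measurements in this comment are roughly what a simple duty-cycle model would predict. The sketch below is a back-of-the-envelope calculation, not a diagnosis. It assumes that GPU-Util approximates the fraction of wall time during which the GPU is busy, and that the detector needs the same amount of GPU time per frame in both runs. The 0.41 value is the midpoint of the reported 40-42%, and the function names are illustrative.

```python
def gpu_busy_ms_per_frame(fps, gpu_util):
    """GPU-busy time per frame implied by a frame rate and a utilization fraction."""
    frame_period_ms = 1000.0 / fps
    return frame_period_ms * gpu_util


def predicted_util(busy_ms, fps):
    """Utilization if the same GPU work is spread over the frame period at `fps`."""
    frame_period_ms = 1000.0 / fps
    return busy_ms / frame_period_ms


# Tight loop: about 20 FPS at about 41% GPU-Util.
busy_ms = gpu_busy_ms_per_frame(fps=20, gpu_util=0.41)

# Same per-frame GPU work, but the loop now runs at about 10 FPS.
util_with_delay = predicted_util(busy_ms, fps=10)

print(f"GPU busy per frame: {busy_ms:.1f} ms")
print(f"Predicted GPU-Util at 10 FPS: {util_with_delay:.1%}")
```

Under these assumptions the GPU is busy for only about 20 ms of each 50 ms frame in the tight loop, and halving the frame rate with added delay roughly halves the utilization, which is in the same range as the reported 15%. In this model, lower GPU-Util with a delay between calls reflects idle time between calls rather than a slower detector.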

CT83 commented 6 years ago

@heethesh Same problem here on 1050 Ti. 9-10% GPU usage. What is happening? 🤷‍♂️

getchhan commented 5 years ago

I have faced the same problem. How can I solve it? Any help is appreciated.

lf-openthos commented 5 years ago

I have the same problem. When using the CPU only, I get about 15% CPU usage and low FPS.

ashwini-git commented 4 years ago

How could the response time be improved if the model is hosted on a k8s cluster and is accessed through request.post()? There is post-processing in my scenario, but even when only the model response is collected, the ssd_inception_v2 model still takes 2-3 seconds.

prabh27 commented 3 years ago

any updates on the issue? I face the same problem. GPU utilization is 4% - 9%

MaloM-CVision commented 3 years ago

any updates ? same problem here, utilization is between 1 and 3%

dariazhuk357 commented 2 years ago

Same problem. Any fixes are appreciated.

Nepomoceno commented 8 months ago

Same problem here

ipa-anm-sy commented 3 months ago

same problem here, any solution?