jkjung-avt / tensorrt_demos

TensorRT MODNet, YOLOv4, YOLOv3, SSD, MTCNN, and GoogLeNet
https://jkjung-avt.github.io/
MIT License

slow inference on jetson tx2 #23

Closed IbrahimBond closed 4 years ago

IbrahimBond commented 4 years ago

I have tested this demo on a Jetson TX2 and the inference speed is 22 FPS. I expected better performance on a TX2 than on a Jetson Nano. Do you have any insights for achieving better results? And what speeds should I expect on a TX2?

Has anyone tried it on a TX2, and what were the results?

Thanks

jkjung-avt commented 4 years ago

Which demo are you referring to? Is it the SSD one?

In addition, please also specify which version of JetPack and TensorFlow you are using.

IbrahimBond commented 4 years ago

I am referring to the SSD one. I am using TensorFlow 1.14 and JetPack 4.2 with TensorRT 5.

I have tried trt_ssd_async.py for inference and managed to get 25 FPS, but I think this is slow for an optimized model on a Jetson TX2.
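For reference, here is a minimal timing sketch that separates pure inference from capture/display overhead. It assumes the TrtSSD wrapper from this repo's utils/ssd.py with a detect(img, conf_th) method (the way trt_ssd.py appears to use it); adjust the names if the API differs:

```python
import time

import cv2
from utils.ssd import TrtSSD  # wrapper class from this repo (assumed API)

MODEL = 'ssd_mobilenet_v2_coco'
INPUT_HW = (300, 300)  # SSD input resolution

trt_ssd = TrtSSD(MODEL, INPUT_HW)
img = cv2.imread('test.jpg')  # any representative frame

# Warm up so one-time CUDA/initialization costs don't skew the numbers.
for _ in range(10):
    trt_ssd.detect(img, conf_th=0.3)

# Time inference only: no capture, no drawing, no display.
n_runs = 200
t0 = time.time()
for _ in range(n_runs):
    trt_ssd.detect(img, conf_th=0.3)
print('pure inference FPS: %.1f' % (n_runs / (time.time() - t0)))
```

If this number comes out much higher than 25, the bottleneck is in the capture/display pipeline rather than in the TensorRT engine itself.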

jkjung-avt commented 4 years ago

That does seem too slow. Unfortunately, I don't currently have a TX2 to verify this on.

Did you notice any suspicious warnings or errors when you built the TensorRT engine and ran inference?

IbrahimBond commented 4 years ago

The conversion seems to have gone smoothly, and it does improve performance considerably: without TensorRT optimization the model ran at only 12-14 FPS, versus 25 FPS after optimization.

I am really disappointed with the MobileNetV2 model; I thought it would do better than YOLOv3-tiny. I am currently getting 30 FPS with YOLOv3-tiny (416x416), which is better than the optimized MobileNetV2 (300x300) model.

Do you have any idea why the MobileNetV2 model does not come close to the numbers published by TensorFlow?

I have also tested MobileNetV3, and it performs similarly to MobileNetV2 (14 FPS).
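One way to see where the time actually goes inside the engine is TensorRT's built-in per-layer profiler. Below is a minimal sketch; the engine file path is an assumption based on how this repo serializes its SSD engines, and note that per-layer profiling only works with the synchronous execute(), not execute_async():

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

# Engine file name/path is an assumption; adjust to match your build.
ENGINE_PATH = 'ssd/TRT_ssd_mobilenet_v2_coco.bin'

logger = trt.Logger(trt.Logger.INFO)
with open(ENGINE_PATH, 'rb') as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()
# The built-in Profiler prints per-layer execution times to stdout
# after every synchronous execute() call.
context.profiler = trt.Profiler()

# Allocate a device buffer for each binding (inputs and outputs).
bindings = []
for binding in engine:
    size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
    dtype = np.dtype(trt.nptype(engine.get_binding_dtype(binding)))
    bindings.append(int(cuda.mem_alloc(size * dtype.itemsize)))

context.execute(batch_size=1, bindings=bindings)  # per-layer times printed
```

If one or two layers dominate the total (for example, plugin layers such as NMS), that would help explain why the theoretical MobileNetV2 speedups don't show up end to end.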

jkjung-avt commented 4 years ago

Please check out this discussion on StackOverflow: https://stackoverflow.com/questions/50385735/why-the-mobilenetv2-is-faster-than-mobilenetv1-only-at-mobile-device

IbrahimBond commented 4 years ago

Thank you, this is very informative. Then maybe in my case I am better off using the SSD Inception model. What do you think?

jkjung-avt commented 4 years ago

I think it's worth digging into why the FPS on your TX2 is no better than my test result on the Nano.

  1. Have you tested both USB webcam input and image/video file input? Do you observe a big difference in FPS between these two cases?

  2. If it doesn't trouble you too much, could you profile the code on the TX2 and send me the log? I might be able to spot problems by looking at the profiler output. Reference: https://jkjung-avt.github.io/optimize-mtcnn/ For example, run trt_ssd.py with the following command for, say, 60 seconds, then copy and paste the profiler output:

     $ python3 -m cProfile -s cumtime trt_ssd.py --model ssd_mobilenet_v1_coco \
         --image \
         --filename test.jpg
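In case it is easier to inspect offline, the profile can also be written to a file with cProfile's -o option (the file name here is arbitrary) and sorted afterwards with the standard pstats module:

     $ python3 -m cProfile -o trt_ssd.prof trt_ssd.py --model ssd_mobilenet_v1_coco \
         --image \
         --filename test.jpg

```python
import pstats

# Sort by cumulative time and show the 30 most expensive entries.
stats = pstats.Stats('trt_ssd.prof')
stats.sort_stats('cumtime').print_stats(30)
```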

IbrahimBond commented 4 years ago

I am off until Monday; I'll get back to you then.

jkjung-avt commented 4 years ago

Any update? Otherwise, I'll close this issue due to inactivity.