NVIDIA-AI-IOT / jetson_benchmarks

Jetson Benchmark
MIT License
372 stars 71 forks

Confused about the different results between benchmarks and real video inference #15

Open wangchangquan opened 3 years ago

wangchangquan commented 3 years ago
The benchmark shows the NX can reach >800 FPS with the ssd-mobilenet-v1 model,

but when I use this model to process real video, it only gets 60 FPS, and even with a multithreaded approach it only reaches about 100 FPS. I also used TensorRT in the test. So why is the speed so slow in real video processing?
And how can I get higher processing speed on real video?

ak-nv commented 3 years ago

We benchmark using the TensorRT tool trtexec, which measures latency of the DL model only. In video processing, FPS will vary depending on your pipeline.
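For reference, a trtexec run of this kind looks roughly like the fragment below. This is a sketch, not the exact command used for the published numbers; the model and engine filenames are placeholders for your own files.

```shell
# Build an FP16 TensorRT engine from an ONNX model and report latency/throughput.
# ssd_mobilenet_v1.onnx is a placeholder filename; adjust paths for your setup.
trtexec --onnx=ssd_mobilenet_v1.onnx \
        --fp16 \
        --saveEngine=ssd_mobilenet_v1.engine

# Later runs can benchmark the serialized engine without rebuilding it.
trtexec --loadEngine=ssd_mobilenet_v1.engine --fp16
```

Note that the reported throughput covers only the engine's forward pass, with input buffers already resident in memory.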

wangchangquan commented 3 years ago

Hi ak-nv, my pipeline is very simple: just capture a camera frame, then process it. But the speed is too slow. We use a high-frame-rate camera (up to 120 FPS) to capture moving objects, and we want to process this capture in real time. From the benchmark results we believed the NX would be perfectly capable, but it wasn't. So do you have any suggestions to improve the speed of the NX in realistic scenarios?
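A capture-then-process loop like the one described is often structured as a producer/consumer pair so that a slow stage never stalls the other. A minimal sketch, with a dummy frame source and a dummy inference step standing in for the real camera and model:

```python
import queue
import threading

def run_pipeline(frame_source, infer, num_frames, max_queue=8):
    """Decouple capture from inference with a bounded queue.

    frame_source: callable returning the next frame (stands in for the camera).
    infer: callable run on each frame (stands in for the TensorRT model).
    """
    frames = queue.Queue(maxsize=max_queue)
    results = []

    def producer():
        for _ in range(num_frames):
            frames.put(frame_source())
        frames.put(None)  # sentinel: no more frames

    def consumer():
        while True:
            frame = frames.get()
            if frame is None:
                break
            results.append(infer(frame))

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

If the queue is always full, capture/decoding outpaces inference; if it is always empty, inference is waiting on the camera. Watching the queue depth is a quick way to see which stage caps your FPS.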

imneonizer commented 3 years ago

@wangchangquan,

You should also benchmark your video processing pipeline to see how much FPS you can actually get. Video decoding is usually the bottleneck in these scenarios; try using a GStreamer pipeline. If you are using OpenCV, you can refer to this link.
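As an illustration of the GStreamer approach, the helper below builds a capture pipeline string for a Jetson CSI camera that keeps colour conversion on the hardware blocks (`nvarguscamerasrc`, `nvvidconv`) and only hands final BGR frames to OpenCV. The resolution, frame rate, and flip values are placeholders for your own camera setup:

```python
def jetson_csi_pipeline(width=1280, height=720, fps=120, flip=0):
    """Build a GStreamer string for cv2.VideoCapture on Jetson.

    nvarguscamerasrc and nvvidconv run on the ISP/VIC rather than the CPU;
    the final videoconvert produces the BGR format OpenCV expects.
    """
    return (
        f"nvarguscamerasrc ! "
        f"video/x-raw(memory:NVMM), width={width}, height={height}, "
        f"framerate={fps}/1 ! "
        f"nvvidconv flip-method={flip} ! "
        f"video/x-raw, format=BGRx ! videoconvert ! "
        f"video/x-raw, format=BGR ! appsink drop=1"
    )

# Usage (requires a Jetson with a CSI camera and OpenCV built with GStreamer):
# import cv2
# cap = cv2.VideoCapture(jetson_csi_pipeline(), cv2.CAP_GSTREAMER)
```

`drop=1` on the appsink discards stale frames instead of queueing them, which matters when inference runs slower than the 120 FPS capture.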

The trtexec tool only benchmarks the latency of the forward pass through the model, which does not include pre-processing / post-processing.

You can actually load a single frame into memory and pass it through the pipeline repeatedly to see the actual performance of your model with the pipeline. With this you can easily figure out the bottlenecks in your application.
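The single-frame trick above can be sketched as a small timing loop. Reusing one in-memory frame removes camera capture and video decoding from the measurement, so the result reflects only pre-processing, inference, and post-processing; `process_frame` is a stand-in for your real pipeline stage:

```python
import time

def measure_fps(process_frame, frame, iterations=200, warmup=10):
    """Time a fixed in-memory frame through the pipeline stage.

    process_frame: callable standing in for preprocess + inference + postprocess.
    frame: a single frame loaded once and reused for every iteration.
    """
    for _ in range(warmup):  # untimed runs to let caches/clocks settle
        process_frame(frame)
    start = time.perf_counter()
    for _ in range(iterations):
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return iterations / elapsed
```

Comparing this number against trtexec's throughput shows how much of the gap comes from pre/post-processing rather than the model itself; comparing it against the full live-camera FPS isolates the capture/decoding cost.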

If you want the best performance out of Jetson devices, make sure your application utilizes the GPU rather than the CPU where possible. As a handy tool, you can use jetson-stats to monitor resource utilization.
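For completeness, jetson-stats is installed with pip and monitored with its `jtop` command; this fragment assumes pip3 is available on the device:

```shell
# Install jetson-stats on the Jetson (assumption: pip3 is available).
sudo -H pip3 install -U jetson-stats

# Interactive monitor for CPU, GPU, and memory utilization.
jtop
```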

Finally, I suggest you look into DeepStream if you want to utilize the full potential of the Jetsons.