NVIDIA / tensorrt-laboratory

Explore the Capabilities of the TensorRT Platform
https://developer.nvidia.com/tensorrt
BSD 3-Clause "New" or "Revised" License

Using TensorRT for inference takes more time than using TensorFlow directly on the GPU #24

Closed jlygit closed 5 years ago

jlygit commented 5 years ago

Hi, I found that using TensorRT for inference takes more time than using TensorFlow directly on the GPU.

When I use TensorRT to run inference on a 720p video, it takes 600 ms per frame. The memory usage is as follows:

(screenshot of GPU memory usage)

When I use TensorFlow directly on the GPU to run inference on the same 720p video, it takes 236 ms per frame. The memory usage is as follows:

(screenshot of GPU memory usage)

It seems TensorRT did not make full use of the GPU memory.

How can I configure that and speed it up when using TRT?

ryanolson commented 5 years ago

The amount of memory is not likely to be the problem. TF will gobble up all the memory on the GPU and own it; it uses internal allocators for its work. TensorRT is actually very efficient in its memory usage and gives the user very explicit control.
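Just to illustrate that difference (this is a minimal sketch, not your setup, and it assumes a TF 1.x session API plus a pre-8.x TensorRT Python API where `builder.max_workspace_size` still exists):

```python
import tensorflow as tf
import tensorrt as trt

# TensorFlow 1.x grabs (nearly) all GPU memory by default unless told otherwise.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True   # opt out of the "grab everything" behaviour
sess = tf.Session(config=config)

# TensorRT only uses what it is explicitly given: scratch space is capped at build time.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
builder.max_workspace_size = 1 << 30     # allow up to 1 GiB of workspace
```

So a lower number in nvidia-smi for the TensorRT process is expected and says nothing about speed.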

The likely problem is one of:

1) data loading,
2) synchronous execution of the TRT engine, or
3) using only one IExecutionContext instead of multiple.

It's hard to know without seeing the details of your code.
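As a rough, hypothetical sketch of points 2) and 3) only: something along these lines overlaps copies and inference across frames. It assumes an already-deserialized explicit-batch `engine`, the TensorRT 7+ Python API (`execute_async_v2`), and pycuda for streams and device buffers; binding 0 as input and binding 1 as output are also assumptions.

```python
import numpy as np
import pycuda.autoinit          # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

NUM_CONTEXTS = 2                # overlap work from consecutive frames

contexts, streams, buffers = [], [], []
for _ in range(NUM_CONTEXTS):
    ctx = engine.create_execution_context()   # `engine` assumed to exist
    stream = cuda.Stream()
    # one device buffer per binding, sized from the engine
    bufs = [cuda.mem_alloc(trt.volume(engine.get_binding_shape(i)) *
                           np.dtype(trt.nptype(engine.get_binding_dtype(i))).itemsize)
            for i in range(engine.num_bindings)]
    contexts.append(ctx)
    streams.append(stream)
    buffers.append(bufs)

def infer(frame_idx, host_in, host_out):
    """Enqueue one frame asynchronously; caller syncs the stream when needed."""
    slot = frame_idx % NUM_CONTEXTS
    ctx, stream, bufs = contexts[slot], streams[slot], buffers[slot]
    cuda.memcpy_htod_async(bufs[0], host_in, stream)              # H2D copy on the stream
    ctx.execute_async_v2([int(b) for b in bufs], stream.handle)   # enqueue inference
    cuda.memcpy_dtoh_async(host_out, bufs[1], stream)             # D2H copy on the stream
    return stream
```

With page-locked host buffers (e.g. `cuda.pagelocked_empty`) the copies and compute for consecutive frames can actually overlap instead of serializing, which is usually where the extra per-frame latency comes from.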

I'd also advise you to check out these resources:

NVIDIA's DeepStream is an integrated video -> TensorRT workflow, and it seems best suited to your problem: https://developer.nvidia.com/deepstream-sdk

Additionally, the best place to get support for TensorRT questions is the Developer Forum: https://devtalk.nvidia.com/default/board/304/tensorrt/