Closed vvabi-sabi closed 1 year ago
It's a NumPy issue, and it also looks like this is running on the CPU. The sample demonstrates GPU inference with TensorRT.
That's right, inference takes place on the GPU. Nevertheless, selecting objects in the image (bounding boxes) is done with NumPy and OpenCV, so if the network finds many objects in the image, postprocessing takes longer.
Looks like a NumPy performance issue, not a TensorRT performance issue.
Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!
Description
YOLOv3 postprocessing. np.vectorize() reduces image processing performance
Environment
TensorRT Version: 8.0.1-1
NVIDIA GPU: GV10B (Jetson Xavier NX)
NVIDIA Driver Version: 32.6.1
CUDA Version: 10.2
CUDNN Version: 8
Operating System: Ubuntu 18.04.6 LTS
Python Version (if applicable): 3.6.9
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
NVIDIA/TensorRT/samples/python/yolov3_onnx/data_processing.py
It looks odd, but np.vectorize reduces inference performance.
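The overhead is easy to demonstrate in isolation: np.vectorize calls the wrapped Python function once per element, while the equivalent NumPy ufunc expression runs in C. A minimal sketch (the scalar sigmoid here is illustrative; the exact helpers in data_processing.py may differ):

```python
import timeit

import numpy as np


def sigmoid(x):
    # Works on scalars and on arrays, since np.exp is an elementwise ufunc.
    return 1.0 / (1.0 + np.exp(-x))


# np.vectorize wraps the function in a Python-level per-element loop;
# it is a convenience, not a performance tool.
sigmoid_v = np.vectorize(sigmoid)

data = np.random.rand(10_000).astype(np.float32)

t_vec = timeit.timeit(lambda: sigmoid_v(data), number=20)
t_ufunc = timeit.timeit(lambda: sigmoid(data), number=20)
print(f"np.vectorize: {t_vec:.4f}s  ufunc: {t_ufunc:.4f}s")
```

On typical hardware the ufunc path is one to two orders of magnitude faster, which is why the penalty grows with the number of detected objects.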
Steps To Reproduce
Replace the np.vectorize calls with:
This adds 1 to 5 FPS in image processing.
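The replacement itself is not shown in the issue. As a hedged sketch of the kind of change implied, assuming data_processing.py wraps scalar sigmoid/exponential helpers with np.vectorize (the names sigmoid_v and exponential_v are taken from the sample, but verify against your copy):

```python
import numpy as np

# Before (as in the sample): scalar helpers wrapped with np.vectorize,
# which loops over elements in Python.
#   sigmoid_v = np.vectorize(sigmoid)
#   exponential_v = np.vectorize(exponential)

# After: express the same math directly with NumPy ufuncs, which operate
# on whole arrays in compiled code.
def sigmoid_v(x):
    return 1.0 / (1.0 + np.exp(-x))

def exponential_v(x):
    return np.exp(x)
```

Because both functions keep the same names and accept arrays, the rest of the postprocessing code should not need to change.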