aditya-dl / RetinaFace-TensorRT-Python

Deploy RetinaFace algorithm using TensorRT in Python
MIT License

The TensorRT version is slower than the PyTorch version #1

Open TulipDi opened 3 years ago

TulipDi commented 3 years ago

Thanks for your work. I built the RetinaFace-ResNet50 TensorRT engine following your code, but it is slower than Pytorch_Retinaface. Have you done this comparison?

aditya-dl commented 3 years ago

Hi TulipDi, could you share details about the GPU you are testing on? Could you also share your code snippet for the comparison?

TulipDi commented 3 years ago

@aditya-dl Thanks for your reply. Here are the details when running the TensorRT version (screenshot attached) and when running the PyTorch version (screenshot attached). There are two versions of the inference code.

aditya-dl commented 3 years ago

I tried it again on a Jetson Nano and the TensorRT code is giving me better performance than PyTorch. What FPS do you get for each inference on your end? I will also try removing the PyTorch dependencies for confirmation.
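For context, dropping the PyTorch dependency would mean driving the engine through the TensorRT Python API and PyCUDA directly. A minimal sketch of that pattern, assuming a pre-serialized engine file and the TensorRT 8.x bindings API (the filename `retinaface.engine` and the single-input layout are illustrative, not from this repo):

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize a pre-built engine (the path is illustrative).
with open("retinaface.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate page-locked host buffers and device buffers for every binding.
bindings, buffers = [], []
for i in range(engine.num_bindings):
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(i)), dtype)
    dev = cuda.mem_alloc(host.nbytes)
    bindings.append(int(dev))
    buffers.append((host, dev, engine.binding_is_input(i)))

stream = cuda.Stream()

def infer(image_chw: np.ndarray):
    """Run one forward pass; image_chw must match the input binding's shape."""
    in_host, in_dev, _ = buffers[0]
    np.copyto(in_host, image_chw.ravel())
    cuda.memcpy_htod_async(in_dev, in_host, stream)
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    outputs = []
    for host, dev, is_input in buffers:
        if not is_input:
            cuda.memcpy_dtoh_async(host, dev, stream)
            outputs.append(host)
    stream.synchronize()  # wait for the copies and the inference to finish
    return outputs
```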

TulipDi commented 3 years ago

These are my details:

  1. image: 1920x1080
  2. backbone: resnet50
  3. device: Jetson Xavier NX

The performance of the TensorRT version:

```
net forward time: 1.0154 (1, 3, 1080, 1920)
net forward time: 0.9984 (1, 3, 1080, 1920)
net forward time: 0.9738 (1, 3, 1080, 1920)
net forward time: 0.9889 (1, 3, 1080, 1920)
net forward time: 0.9580 (1, 3, 1080, 1920)
net forward time: 0.9814 (1, 3, 1080, 1920)
net forward time: 0.9493 (1, 3, 1080, 1920)
net forward time: 0.9565
```

The performance of the PyTorch version:

```
torch.Size([1, 3, 1080, 1920]) net forward time: 0.6907.
torch.Size([1, 3, 1080, 1920]) net forward time: 0.6863.
torch.Size([1, 3, 1080, 1920]) net forward time: 0.6913.
torch.Size([1, 3, 1080, 1920]) net forward time: 0.6856.
torch.Size([1, 3, 1080, 1920]) net forward time: 0.6922.
torch.Size([1, 3, 1080, 1920]) net forward time: 0.6856.
torch.Size([1, 3, 1080, 1920]) net forward time: 0.6854.
torch.Size([1, 3, 1080, 1920]) net forward time: 0.6920.
torch.Size([1, 3, 1080, 1920])
```
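Worth double-checking when reading numbers like these: CUDA kernels launch asynchronously, so a forward pass timed without synchronizing the device can under- or over-report the real GPU time depending on where the clock calls land. A minimal timing sketch with explicit synchronization (the `time_forward` helper and its arguments are illustrative, not from this repo):

```python
import time
import torch

def time_forward(model, x, warmup=10, iters=50):
    """Average forward-pass latency in seconds, with proper GPU sync."""
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):       # warm up kernels / cuDNN autotuning
            model(x)
        torch.cuda.synchronize()      # drain all queued GPU work
        start = time.time()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()      # wait for the last kernel to finish
    return (time.time() - start) / iters
```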

aditya-dl commented 3 years ago

I think I have found the problem. I was running PyTorch on the CPU instead of the GPU; setting it to the GPU will reduce its inference time. I am also planning to reduce the code's dependency on PyTorch. I will update the code base. Thanks for pointing this out.
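For anyone hitting the same issue, the fix is to place both the weights and the input on the GPU before timing. A minimal sketch (variable names are illustrative; the three outputs follow the usual Pytorch_Retinaface head layout, i.e. box regressions, class scores, and landmarks):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = net.to(device).eval()   # move the weights to the GPU once, up front
img = img.to(device)          # the input batch must live on the same device

with torch.no_grad():
    loc, conf, landms = net(img)  # boxes, scores, landmarks (RetinaFace heads)
```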