isl-org / DPT

Dense Prediction Transformers

Inference time after conversion and RAM usage #57

Open romil611 opened 2 years ago

romil611 commented 2 years ago

Nvidia's TensorRT is a handy tool to improve inference time. However, after converting DPT to TensorRT, the inference time actually went up by almost 750%. The ONNX model itself was also slow at inference. To export the ONNX file, I had changed all the unflatten calls to view. If you have any leads on how to improve the inference time or the conversion to ONNX, please share. Also, the RAM usage is quite high; if you have suggestions on alternate functions to reduce RAM usage, please do share those as well.
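For reference, a minimal sketch of the kind of change I mean (this is not the DPT export script; `TinyHead` and its shapes are made up for illustration): an unflatten on the token dimension is replaced with an explicit reshape so the graph exports with opset 11.

```python
import torch
import torch.nn as nn


class TinyHead(nn.Module):
    """Hypothetical stand-in for a block that reshapes patch tokens to a feature map."""

    def __init__(self, h=24, w=24, c=256):
        super().__init__()
        self.h, self.w, self.c = h, w, c

    def forward(self, tokens):
        # tokens: (batch, h*w, c) sequence of patch tokens.
        b = tokens.shape[0]
        # Original style: tokens.unflatten(1, (self.h, self.w))
        # ONNX-friendly replacement: an explicit reshape.
        x = tokens.reshape(b, self.h, self.w, self.c)
        return x.permute(0, 3, 1, 2)  # (batch, c, h, w)


model = TinyHead().eval()
dummy = torch.randn(1, 24 * 24, 256)
torch.onnx.export(
    model, dummy, "tiny_head.onnx",
    opset_version=11,
    input_names=["tokens"], output_names=["features"],
)
```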

shivin101 commented 2 years ago

Hi, can you let me know which TensorRT version and ONNX opset you used for the conversion to TensorRT?
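If it helps, here is one quick way to confirm both values (assuming the `onnx` and `tensorrt` Python packages are installed; the model path is a placeholder):

```python
import onnx
import tensorrt as trt

model = onnx.load("dpt.onnx")  # placeholder path to the exported model
print("ONNX opset:", [op.version for op in model.opset_import])
print("TensorRT:", trt.__version__)
```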

romil611 commented 2 years ago

NVIDIA Jetson Xavier NX (Developer Kit Version)
L4T 32.6.1 [JetPack 4.6]
Ubuntu 18.04.6 LTS
Kernel Version: 4.9.253-tegra
CUDA: 10.2.300
CUDA Architecture: 7.2
OpenCV version: 4.4.0
OpenCV CUDA: YES
cuDNN: 8.2.1.32
TensorRT: 8.0.1.6
ONNX opset version: 11
VisionWorks: 1.6.0.501
VPI: ii libnvvpi1 1.1.12 arm64 NVIDIA Vision Programming Interface library
Vulkan: 1.2.70