faizan1234567 opened 3 weeks ago
Thank you so much.
I have one more question. I have built the engine file with half precision and I am getting about 8.5 FPS on a Jetson Orin Nano. How can I optimize it further? I am thinking of building an INT8 engine, but I am not sure how to calibrate it and build it successfully. Do you have any relevant documents/examples for this? My image fusion model has two inputs.
@lix19937 should I deploy with DeepStream or the Python runtime? I am confused and don't know which will be more efficient.
@faizan1234567
You can use FP16, then apply PTQ (INT8) to further improve inference speed.
The following is a PTQ sample including C++/Python implementations: https://github.com/lix19937/trt-samples-for-hackathon-cn/tree/master/cookbook/03-BuildEngineByTensorRTAPI/MNISTExample-pyTorch
For more optimization methods, see https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#optimize-performance
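Since the model has two inputs, the calibrator's `get_batch` must return one device pointer per input. Below is a minimal sketch of the batch-streaming side of a PTQ calibrator; the class name, input shapes, and buffer names are hypothetical, not from the linked sample, and the TensorRT/PyCUDA wiring is shown only in comments because it requires a GPU to run.

```python
import numpy as np

class CalibrationStream:
    """Yields paired, contiguous FP32 batches for a two-input fusion model."""

    def __init__(self, images_a, images_b, batch_size=8):
        # Both calibration sets must be aligned sample-by-sample.
        assert len(images_a) == len(images_b)
        self.a = images_a
        self.b = images_b
        self.batch_size = batch_size
        self.idx = 0

    def next_batch(self):
        # Returns None once the calibration set is exhausted, which is the
        # signal TensorRT expects to stop calibration.
        if self.idx >= len(self.a):
            return None
        end = self.idx + self.batch_size
        batch = (
            np.ascontiguousarray(self.a[self.idx:end], dtype=np.float32),
            np.ascontiguousarray(self.b[self.idx:end], dtype=np.float32),
        )
        self.idx = end
        return batch

# The stream plugs into a trt.IInt8EntropyCalibrator2 subclass roughly like
# this (sketch only; buffer sizes and cache handling are assumptions):
#
#   import tensorrt as trt
#   import pycuda.driver as cuda
#
#   class FusionCalibrator(trt.IInt8EntropyCalibrator2):
#       def __init__(self, stream):
#           super().__init__()
#           self.stream = stream
#           self.d_a = cuda.mem_alloc(...)  # one device buffer per input
#           self.d_b = cuda.mem_alloc(...)
#       def get_batch_size(self):
#           return self.stream.batch_size
#       def get_batch(self, names):  # `names` lists the two network inputs
#           batch = self.stream.next_batch()
#           if batch is None:
#               return None
#           cuda.memcpy_htod(self.d_a, batch[0])
#           cuda.memcpy_htod(self.d_b, batch[1])
#           return [int(self.d_a), int(self.d_b)]
#       def read_calibration_cache(self): ...
#       def write_calibration_cache(self, cache): ...
#
# Then enable INT8 on the builder config before building the engine:
#   config.set_flag(trt.BuilderFlag.INT8)
#   config.int8_calibrator = FusionCalibrator(stream)

# Small self-contained check with synthetic data.
a = np.zeros((10, 3, 64, 64), dtype=np.uint8)
b = np.zeros((10, 3, 64, 64), dtype=np.uint8)
stream = CalibrationStream(a, b, batch_size=4)
first = stream.next_batch()
print(first[0].shape, first[0].dtype)
```

Use a few hundred representative samples (covering the conditions the fused inputs see in deployment) so the entropy calibrator picks reasonable per-tensor scales; the calibration cache lets you rebuild the engine without rerunning calibration.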
@lix19937 thank you again. Any guidelines on building a video analytics system with NVIDIA DeepStream?
@lix19937 thank you so much :)
Description
Environment
TensorRT Version: 8.5
CUDA Version: 11.4
CUDNN Version: 8.6
Operating System:
Python Version (if applicable): 3.8.10
PyTorch Version (if applicable): 2.1.0a0+41361538.nv23.6
My implementation