microsoft / onnxruntime-inference-examples

Examples for using ONNX Runtime for machine learning inferencing.
MIT License

Issues with Running 'e2e_tensorrt_resnet_example.py' Example #407

Open TouqeerAhmad opened 3 months ago

TouqeerAhmad commented 3 months ago

Hello, I am trying to run the end-to-end quantization example for ResNet50 using the TensorRT EP and am facing a couple of issues. The first problem is that the 'compute_range' attribute is not available on 'MinMaxCalibrater'.

[Screenshot 2024-03-29 at 8:50:04 AM]

When I comment out the 'write_calibration_table(calibrator.compute_range())' line, I get another error, which is probably a consequence of the first: an exception saying "Failed to read INT8 calibration table calibration.flatbuffers".

[Screenshot 2024-03-29 at 8:50:59 AM]

I was wondering if somebody has an insight. Thanks!
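The missing attribute suggests an API change between onnxruntime releases: recent versions expose a compute_data() method on the calibrator rather than the older compute_range(). A small compatibility helper, sketched under that assumption, would let the example run against either API:

```python
def get_calibration_data(calibrator):
    """Return calibration ranges from whichever API the installed
    onnxruntime exposes (compute_range in older releases,
    compute_data in newer ones)."""
    if hasattr(calibrator, "compute_range"):
        # Older onnxruntime releases (roughly <= 1.15)
        return calibrator.compute_range()
    if hasattr(calibrator, "compute_data"):
        # Newer releases renamed/replaced the method
        return calibrator.compute_data()
    raise AttributeError("calibrator exposes no known calibration-range API")
```

The result could then be passed to write_calibration_table() as in the example script, instead of calling compute_range() directly.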

TouqeerAhmad commented 3 months ago

I was finally able to run it by setting up an environment with an older version of onnxruntime-gpu, specifically 1.13.1.

TouqeerAhmad commented 3 months ago

I have tried running this example with onnxruntime-gpu==1.14.0 and tensorrt==8.5.1.7. I see a reduced top-1 accuracy of 0.710 when using TensorrtExecutionProvider, compared to 0.739 when I use CUDAExecutionProvider. Moreover, the inference time for TensorrtExecutionProvider is higher than for CUDAExecutionProvider, which I was not expecting.

Any advice?

PS: The documentation for the ResNet50 TensorRT tutorial lags significantly behind the code. It needs to be updated for the latest onnxruntime and TensorRT versions, or at least state which versions were used at the time of writing.
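One possible contributor to the slower TensorRT timings above is that the TRT EP builds an engine on first use, which can dominate short benchmarks. A hedged sketch of provider options that enable engine caching and point the EP at the INT8 calibration table (option names follow the ONNX Runtime TensorRT EP documentation; "resnet50_quant.onnx" is a placeholder model path):

```python
# Provider list for an InferenceSession: TensorRT first, CUDA as fallback.
providers = [
    ("TensorrtExecutionProvider", {
        "trt_int8_enable": True,                                   # use INT8 precision
        "trt_int8_calibration_table_name": "calibration.flatbuffers",
        "trt_engine_cache_enable": True,                           # avoid rebuilding the engine every run
        "trt_engine_cache_path": "./trt_cache",
    }),
    "CUDAExecutionProvider",
]

# import onnxruntime as ort
# session = ort.InferenceSession("resnet50_quant.onnx", providers=providers)
```

With caching enabled, only the first session creation pays the engine-build cost, so timing should be measured on warmed-up runs.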