Closed: pycoco closed this issue 2 years ago.
@yufenglee could you give me some suggestions?
Can you provide more details about how you are enabling the int8/quantized model with onnxruntime + TensorRT EP? Reference: https://github.com/microsoft/onnxruntime/issues/11873#issuecomment-1160677578. Are you using a calibration table or a QDQ model?
I use a calibration table, and this problem has already been fixed. Thanks.
I generate an engine cache with onnxruntime + TensorRT EP, but the int8 engine and the fp16 engine end up the same size. When I use trtexec to generate the int8 engine instead, the engine size looks correct. What do I need to change when using onnxruntime + TensorRT EP to generate the engine cache?
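For context, the setup described above roughly corresponds to the following sketch, assuming ONNX Runtime's Python API with the TensorRT EP and provider options such as `trt_int8_enable`, `trt_int8_calibration_table_name`, and `trt_engine_cache_enable`; exact option names and defaults may differ between onnxruntime versions, and the file paths are hypothetical:

```python
import onnxruntime as ort

# Sketch: request int8 kernels plus engine caching on the TensorRT EP.
# Option names below are assumptions based on the TensorRT EP provider
# options; check the onnxruntime version in use for the exact spelling.
trt_options = {
    "trt_fp16_enable": True,                       # allow fp16 for layers without int8 support
    "trt_int8_enable": True,                       # request int8 kernels
    "trt_int8_calibration_table_name": "calibration.flatbuffers",  # hypothetical calibration table
    "trt_engine_cache_enable": True,               # serialize the built engine to disk
    "trt_engine_cache_path": "./trt_cache",        # hypothetical cache directory
}

sess = ort.InferenceSession(
    "model.onnx",                                  # hypothetical model path
    providers=[
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",                   # fallback for nodes TensorRT cannot take
    ],
)
```

With a setup like this, the cached engine written to `trt_engine_cache_path` is what should shrink when int8 actually takes effect; if it stays the size of the fp16 engine, the int8 path likely was not applied.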
System information