Closed XXXVincent closed 2 years ago
@XXXVincent how did you generate erfnet_quantized_int8.cache? thanks
Using the code from this repo: https://github.com/Wulingtian/yolov5_tensorrt_int8_tools. It's a TensorRT INT8 quantization tool written in Python, and the cache file it produces looks normal.
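For anyone wondering what "looks normal" means: TensorRT entropy-calibration caches are plain text, with a header line (e.g. `TRT-8001-EntropyCalibration2`) followed by one `tensor_name: hexscale` line per tensor, where the hex value is the tensor's float32 scale in big-endian encoding. A minimal sketch for inspecting one (the tensor names and scale values below are illustrative, not from the actual erfnet cache):

```python
import struct

def read_calib_cache(text):
    """Parse a TensorRT calibration cache: header line, then name: hex-float32 scales."""
    lines = text.strip().splitlines()
    header, scales = lines[0], {}
    for line in lines[1:]:
        name, hexval = line.rsplit(":", 1)
        # Scales are stored as big-endian IEEE 754 float32 in hex.
        scales[name.strip()] = struct.unpack(">f", bytes.fromhex(hexval.strip()))[0]
    return header, scales

# Synthetic example cache (values chosen for illustration only).
example = """TRT-8001-EntropyCalibration2
input.1: 3c010a14
output: 3d4ccccd
"""

header, scales = read_calib_cache(example)
print(header)
for name, scale in scales.items():
    # dynamic range = scale * 127 for symmetric INT8 quantization
    print(f"{name}: scale={scale:.6f}, dynamic range={scale * 127:.4f}")
```

A quick way to sanity-check a cache before handing it to trtexec: the header should parse and every scale should decode to a sensible positive float.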
How is it possible that I specified a non-existent calib file and still got a decent result, while without specifying any calib file at all, the output inferred by the exported INT8 model is completely wrong?
_&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=lannet_20220308.onnx --calib=a_not_exists_file.notexist --int8 --explicitBatch --saveEngine=test_not_exists.engine_
Also, I found that the results inferred by the INT8 model differ considerably between Linux (RTX 2060) and the QNX platform. Very strange.
The command I used for exporting the INT8 model is: _trtexec --onnx=lannet_20220308.onnx --calib=calibint8.bin --int8 --explicitBatch --saveEngine=int8.engine_
And the command I used for INT8 model inference is:
trtexec --loadEngine=int8.engine --exportOutput=result.json --duration=0.005 --iterations=1 --avgRuns=1 --loadInputs='input.1':img.bin --int8
@ttyio
@XXXVincent When no calib file is provided, trtexec simply uses random dynamic ranges for all tensors. That's why you got wrong outputs.
Could you share the ONNX file, and let us know which TRT version, OS, and GPU(s) you used? Thanks
Closing due to >14 days without activity. Please feel free to reopen if the issue still exists. Thanks
trtexec works fine without specifying an INT8 cache file, but throws an error when loading the INT8 cache file.
/usr/src/tensorrt/bin/trtexec --onnx=erfnet.onnx --int8 --saveEngine=erf_int8.engine --calib=erfnet_quantized_int8.cache --verbose
```
----- Parsing of ONNX model erfnet.onnx is Done ----
[01/04/2022-16:10:16] [V] [TRT] Original: 83 layers
[01/04/2022-16:10:16] [V] [TRT] After dead-layer removal: 83 layers
[01/04/2022-16:10:16] [V] [TRT] After scale fusion: 83 layers
[01/04/2022-16:10:16] [V] [TRT] After vertical fusions: 83 layers
[01/04/2022-16:10:16] [V] [TRT] After final dead-layer removal: 83 layers
[01/04/2022-16:10:16] [V] [TRT] After concat removal: 83 layers
[01/04/2022-16:10:16] [V] [TRT] After tensor merging: 83 layers
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)
```