Open darrenzhang1007 opened 1 year ago
I also have some questions to ask you.
In Guidance_of_QAT_performance_optimization.md, how was yolov7_qat.onnx obtained for the "Run QAT benchmark" step? My first guess was that yolov7_qat.onnx is produced by that process. I then ran the QAT code and got qat.onnx. Is the qat.onnx I got already the best model?
Does this process include this step?
The ONNX exported by this process has input 1*3*672*672.
The ONNX exported by this process has input 1*3*640*640.
This infer_eval code explicitly requires the engine's input dimensions to be 672*672.
@darrenzhang1007 How yolov7_qat.onnx was obtained during the Run QAT benchmark process.
After you finish QAT training with qat.py, you get qat.onnx. The guidance mainly explains the following: with the current version of TensorRT, users always get the best performance with PTQ (an ONNX model without QDQ nodes), but not necessarily with QAT. So, if we want a QAT model (a model with QDQ nodes) to reach the same performance as PTQ, we should adjust the QDQ placement so that TensorRT behaves the same way as it does for PTQ (they export the same graph).
The input dimensions of the ONNX exported in the two tutorials are different. This confuses me!
You can see in https://github.com/WongKinYiu/yolov7.git that evaluation runs at 1*3*672*672 (that gets the best accuracy, and it seems all the YOLO models do this). We just keep aligned with them.
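For what it's worth, the 672 figure appears to fall out of the padding arithmetic used for rectangular evaluation. Here is a minimal sketch, assuming a model stride of 32 and a pad of half a stride. Those two values are my reading of the yolov7 evaluation dataloader and are not confirmed in this thread, and `rect_eval_side` is a hypothetical helper name, not a function from either repo.

```python
import math


def rect_eval_side(img_size: int, stride: int = 32, pad: float = 0.5) -> int:
    """Round a nominal image size up to a padded, stride-aligned size.

    Assumption: evaluation pads by `pad` strides and then rounds up to a
    multiple of `stride`. The defaults are assumed, not taken from this thread.
    """
    return math.ceil(img_size / stride + pad) * stride


if __name__ == "__main__":
    # Nominal 640 with half a stride of padding.
    print(rect_eval_side(640))
    # With no padding the nominal size is already stride-aligned.
    print(rect_eval_side(640, pad=0.0))
```

Under these assumptions a nominal size of 640 becomes 672 for evaluation, while an export without padding stays at 640, which would explain why the two tutorials end up with different input dimensions.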
I really appreciate this great work. I want to report a bug in cmd_sensitive_analysis in qat.py.
When calling the quantize.calibrate_model function, the device parameter is missing, which causes the run to fail. https://github.com/NVIDIA-AI-IOT/yolo_deepstream/blob/bd731fd3e2e65d775e0086373a8d94426b3d56cd/yolov7_qat/scripts/qat.py#L243
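To illustrate the kind of failure described, here is a small self-contained sketch. `calibrate_model_stub` is a hypothetical stand-in, not the real quantize.calibrate_model, and its three-argument signature is only an assumption based on this report. The string arguments are placeholders for a real model and dataloader.

```python
def calibrate_model_stub(model, dataloader, device):
    """Hypothetical stand-in for a calibration routine that requires a device."""
    return f"calibrated on {device}"


def call_without_device():
    """Reproduce the failure: the call omits the required device argument."""
    try:
        calibrate_model_stub("model", "dataloader")
    except TypeError as exc:
        return str(exc)
    return None


def call_with_device(device="cuda:0"):
    """The suggested fix: pass the device through to the calibration call."""
    return calibrate_model_stub("model", "dataloader", device)


if __name__ == "__main__":
    print(call_without_device())
    print(call_with_device())
```

If the real function has a similar signature, the fix at the linked line would presumably be to pass the same device that the rest of cmd_sensitive_analysis uses.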