Thank you for your contributions to the field of model quantization and for generously sharing your code.
I ran into a problem: I ran a quantization example with the same W and A settings and got the same top-1 and top-5 results as the author, but when I open the saved q_model.pth in Netron, the weights are still in float32 format. How can I save the quantized model correctly?
The code I use to save the results is:
Save .pth format:
torch.save(q_model, 'my/save/path')
Save ONNX format:
dummy_input = torch.randn((1, 3, 224, 224)).to(device)
torch.onnx.export(q_model, dummy_input, onnx_path, verbose=False, input_names=['input'], output_names=['output'], opset_version=11)
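As a minimal check (a sketch, using a stand-in nn.Linear in place of q_model), the parameter dtypes of the saved state can be printed directly; fake-quantization pipelines usually still store the parameters as float32, which would explain what Netron shows:

```python
import torch
import torch.nn as nn

# Stand-in for q_model; any nn.Module behaves the same way here
model = nn.Linear(4, 2)

# Print the dtype of every parameter in the state dict.
# Simulated/fake quantization keeps float32 storage, so Netron
# can report float32 even when quantized accuracy matches.
for name, tensor in model.state_dict().items():
    print(name, tensor.dtype)
```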
The PyTorch versions I am using are: torch==1.11.0+cu113, torchvision==0.12.0+cu113, timm==0.4.12