Thank you for your contributions to the field of model quantization and for generously sharing your code.
I ran into a problem: I ran a quantization example with the same W and A settings and got the same top-1 and top-5 results as the author, but when I open the saved q_model.pth in Netron, the weights are still in float32 format. How can I save the quantized model correctly?
The code I use to save the results is:
Save .pth format:
torch.save(q_model, 'my/save/path')
Save ONNX format:
dummy_input = torch.randn((1, 3, 224, 224)).to(device)
torch.onnx.export(q_model, dummy_input, onnx_path, verbose=False, input_names=['input'], output_names=['output'], opset_version=11)
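As a minimal check (a sketch, using a stand-in nn.Linear in place of q_model), the parameter dtypes of the saved state can be printed directly; fake-quantization pipelines usually still store the parameters as float32, which would explain what Netron shows:

```python
import torch
import torch.nn as nn

# Stand-in for q_model; any nn.Module behaves the same way here
model = nn.Linear(4, 2)

# Print the dtype of every parameter in the state dict.
# Simulated/fake quantization keeps float32 storage, so Netron
# can report float32 even when quantized accuracy matches.
for name, tensor in model.state_dict().items():
    print(name, tensor.dtype)
```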
The PyTorch versions I am using are: torch==1.11.0+cu113, torchvision==0.12.0+cu113, timm==0.4.12