ganjuzhizihuai opened 4 months ago
Sorry for the delayed response. The code stores the raw floating-point weights rather than the quantized weights. To obtain the quantized weights, scale the saved floating-point weights by the saved scale `s` and then round the result.
Alternatively, you can modify the code to save the quantized weights as well.
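As a minimal sketch of the step above (not the repository's actual API — the function name, bit width, and whether you divide or multiply by `s` are assumptions that depend on how the scale is defined in the code):

```python
import torch

def recover_quantized(weight: torch.Tensor, s: torch.Tensor, n_bits: int = 8) -> torch.Tensor:
    """Scale the stored float weights by the saved scale `s`, then round
    and clamp to the signed integer range to get the quantized weights."""
    q = torch.round(weight / s)
    qmin, qmax = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    return q.clamp(qmin, qmax).to(torch.int8)

# Toy example with made-up values:
w = torch.tensor([-0.375, 0.0, 0.428])
s = torch.tensor(0.004)
q = recover_quantized(w, s)
print(q)
```

If the repo defines `s` the other way around (dequantization scale vs. quantization scale), replace the division with a multiplication.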
Hello, when I run the code and print the parameter information of the quantized model, why is the parameter dtype still float32 after the quantization layers have been swapped in?
```
Quantized Layer: layer3.2.conv1
Weight dtype: torch.float32
Weight range: -0.37524235248565674 to 0.42818304896354675
Quant scale: Parameter containing:
```