megvii-research / FQ-ViT

[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer

Cannot get scale #15

Closed: youdutaidi closed this issue 2 years ago

youdutaidi commented 2 years ago

I just want to test FQ-ViT on segmentation with a quantized Swin Transformer backbone, and I only changed the code of Mlp as follows:

class Mlp(nn.Module):
    """Multilayer perceptron."""

    def __init__(self, in_features, hidden_features=None, out_features=None,
                 act_layer=nn.GELU, drop=0., quant=True, calibrate=False):
        super().__init__()
        out_features = out_features or in_features
        hidden_features = hidden_features or in_features

        # First linear layer: int4 weights, per-channel min-max calibration.
        self.fc1 = QLinear(
            in_features,
            hidden_features,
            quant=quant,
            calibrate=calibrate,
            bit_type=BIT_TYPE_DICT["int4"],
            calibration_mode="channel_wise",
            observer_str="minmax",
            quantizer_str="uniform"
        )

        self.act = act_layer()
        # Activation quantizer after GELU: uint4, per-layer min-max calibration.
        self.qact1 = QAct(
            quant=quant,
            calibrate=calibrate,
            bit_type=BIT_TYPE_DICT["uint4"],
            calibration_mode="layer_wise",
            observer_str="minmax",
            quantizer_str="uniform"
        )

        # Second linear layer: int4 weights, per-channel min-max calibration.
        self.fc2 = QLinear(
            hidden_features,
            out_features,
            quant=quant,
            calibrate=calibrate,
            bit_type=BIT_TYPE_DICT["int4"],
            calibration_mode="channel_wise",
            observer_str="minmax",
            quantizer_str="uniform"
        )

        # Activation quantizer after the second linear layer.
        self.qact2 = QAct(
            quant=quant,
            calibrate=calibrate,
            bit_type=BIT_TYPE_DICT["uint4"],
            calibration_mode="layer_wise",
            observer_str="minmax",
            quantizer_str="uniform"
        )

        self.drop = nn.Dropout(drop)

and I didn't modify the training code from GitHub: https://github.com/SwinTransformer/Swin-Transformer-Semantic-Segmentation

but I got this error:

  File "/home/code/SwinTransformer/FQVIT/ptq/quantizer/base.py", line 45, in forward
    outputs = self.quant(inputs)
  File "/home/code/SwinTransformer/FQVIT/ptq/quantizer/uniform.py", line 30, in quant
    scale = scale.reshape(range_shape)
AttributeError: 'NoneType' object has no attribute 'reshape'

Could you please help me figure out where I went wrong? Thanks a lot for your kindness.

linyang-zhh commented 2 years ago

I think maybe you didn't apply the calibration step (just a few forward passes) for that model, like here. You may want to check it carefully.
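For reference, the calibration flow in this repo's test_quant.py looks roughly like the sketch below. calib_loader is a placeholder for a small loader over your own images, and the model_* helpers are defined on FQ-ViT's model classes, so a custom segmentation backbone needs equivalent switches that toggle calibrate on its QLinear/QAct modules:

# Rough sketch of the calibration flow used in test_quant.py;
# `calib_loader` is a placeholder for a small calibration data loader.
import torch

model.model_open_calibrate()                     # observers start collecting statistics
with torch.no_grad():
    for i, (image, target) in enumerate(calib_loader):
        if i == len(calib_loader) - 1:
            model.model_open_last_calibrate()    # finalize statistics on the last batch
        model(image)                             # a few forward passes are enough
model.model_close_calibrate()
model.model_quant()                              # scales are now set; quantization is applied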

youdutaidi commented 2 years ago

> I think maybe you didn't apply the calibration step (just a few forward passes) for that model, like here. You may want to check it carefully.

Thank you, that fixed the problem.

But another problem has emerged. After calibration and quantization (in test_quant.py, after the "model.quant()" call), I tried to save the quantized model, but I found it is the same as before quantization.

Will the model's weights change after quantization? Or are they only quantized during the validation process but not stored in the weights?

linyang-zhh commented 2 years ago

We implement our method with fake quantization, and thus the weights are "only quantized during the validation process but not stored in the weights".
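To illustrate what that means (a conceptual sketch, not the repo's exact code): each quantized layer quantizes and immediately dequantizes its tensors on the fly in forward, so the stored fp32 weights never change:

# Illustrative fake quantization of a weight tensor (int4 range used as an example).
import torch

def fake_quantize(w, scale, zero_point, qmin=-8, qmax=7):
    # quantize to the integer grid ...
    w_int = torch.clamp(torch.round(w / scale + zero_point), qmin, qmax)
    # ... and immediately dequantize back to fp32 for the actual computation
    return (w_int - zero_point) * scale

# The layer computes with fake_quantize(self.weight, ...) on the fly,
# so the stored self.weight is unchanged after model.quant().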

youdutaidi commented 2 years ago

> We implement our method with fake quantization, and thus the weights are "only quantized during the validation process but not stored in the weights".

Sorry, but how can I store the quantized weights? In my understanding, fake quantization is a method that stores the quantized weights in fp32 format but with quantized values. However, the values didn't change after the "model.quant()" call.

linyang-zhh commented 2 years ago

Yes, all weights are stored as fp32 here.

However, after the calibration and quantization steps, you can store all weights as int8 after this line, using torch.tensor(outputs.clone().detach(), dtype=torch.int8).

Or, you can convert the weights one by one, using:

weight_q = CONV.quantizer.quant(CONV.weight)
saved_weight = torch.tensor(weight_q.clone().detach(), dtype=torch.int8)
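For example, a loop over the quantized layers might look like the sketch below. QConv2d and QLinear are the quantized module types from this repo; you will likely also want to save each quantizer's scale and zero point so the integer weights can be dequantized later:

# Sketch: export the integer weights of every quantized layer (assumes the
# calibration and model.quant() steps above have already been run, so scales exist).
import torch

int_state = {}
for name, m in model.named_modules():
    if isinstance(m, (QConv2d, QLinear)):
        weight_q = m.quantizer.quant(m.weight)   # integer-valued tensor, still fp32 dtype
        int_state[name + ".weight"] = weight_q.detach().clone().to(torch.int8)
torch.save(int_state, "quantized_weights.pth")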