megvii-research / FQ-ViT

[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
Apache License 2.0

Confused about Fake Quantization #30

Closed jhss closed 1 year ago

jhss commented 1 year ago

In the issue, you said that quant and dequant are necessary steps for quantization.

You recommended referencing Figure 6 in that paper.

[Figure from the recommended paper: fake quantization shown as quant/dequant followed by FP32 matrix multiplication]

What I don't understand is that fake quantization is actually an FP32 matrix multiplication, as shown in the figure above from the paper you recommended. However, in your paper you say that your model is a Fully Quantized Vision Transformer (FQ-ViT), which seems to be contradicted by the fact that you actually use FP32 matrix multiplication everywhere except LayerNorm and Softmax.

Also, fake quantization is usually used during quantization-aware training. However, your method is post-training quantization, so I don't understand why dequant is a necessary step for post-training quantization. I looked at the code of other papers, but I found no place where dequant is used during post-training quantization.

I would appreciate it if you could answer the question.
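For reference, this is a minimal toy sketch (my own example, not code from this repository) of what I mean by fake quantization: the tensors are quantized to an integer grid and immediately dequantized, so the subsequent matrix multiplication still runs in FP32.

```python
import torch

def fake_quantize(x, scale, n_bits=8):
    """Quant -> dequant: the output is FP32, but restricted to the integer grid."""
    qmax = 2 ** (n_bits - 1) - 1                               # 127 for 8-bit signed
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)   # quant: integer codes
    return q * scale                                           # dequant: back to FP32

x = torch.randn(4, 8)
w = torch.randn(8, 16)
s_x = x.abs().max() / 127   # toy per-tensor symmetric scales
s_w = w.abs().max() / 127

# both operands are fake-quantized, but the matmul itself is an FP32 matmul
y = fake_quantize(x, s_x) @ fake_quantize(w, s_w)
```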

linyang-zhh commented 1 year ago

We use FakeQuantization to develop our algorithm rapidly, and our FakeQuantization and integer-only inference are numerically equivalent.
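As a minimal illustration of that equivalence (a toy sketch assuming symmetric per-tensor quantization with no bias, not the actual FQ-ViT kernels): dequantizing before an FP32 matmul and rescaling after an integer matmul give the same result, since (q_x * s_x) @ (q_w * s_w) = (q_x @ q_w) * (s_x * s_w).

```python
import torch

def quantize(x, scale, n_bits=8):
    """Map FP32 values to signed integer codes (stored as FP32 here for simplicity)."""
    qmax = 2 ** (n_bits - 1) - 1
    return torch.clamp(torch.round(x / scale), -qmax - 1, qmax)

x = torch.randn(4, 8)
w = torch.randn(8, 16)
s_x = x.abs().max() / 127
s_w = w.abs().max() / 127
q_x, q_w = quantize(x, s_x), quantize(w, s_w)

# fake-quantization path: dequantize first, then multiply in FP32
y_fake = (q_x * s_x) @ (q_w * s_w)

# integer-only path: multiply the integer codes, rescale the accumulator once
y_int = (q_x @ q_w) * (s_x * s_w)

print(torch.allclose(y_fake, y_int, atol=1e-5))  # True (up to FP32 rounding)
```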