megvii-research / FQ-ViT

[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
Apache License 2.0

Why is the inference time of the floating-point model and the quantized model almost the same? #39

Closed liuxy1103 closed 1 year ago

liuxy1103 commented 1 year ago

I commented out `model.model_quant()` in `test_quant.py`, but the running time is the same as before. Why is that?

linyang-zhh commented 1 year ago

We use fake quantization in our implementation to simulate quantized inference in FP32.

So, the inference speed is not accelerated.
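To illustrate why fake quantization gives no speedup, here is a minimal NumPy sketch (not the repository's actual code): values are rounded onto an int8 grid and immediately dequantized, so all downstream arithmetic still runs in floating point. The function name and the simple max-abs scale below are illustrative assumptions.

```python
import numpy as np

def fake_quantize(x, num_bits=8):
    # Quantize onto a signed integer grid, then immediately dequantize.
    # The values are restricted to the quantized grid, but the tensor
    # stays in floating point, so no integer kernels are used and
    # there is no inference speedup.
    qmax = 2 ** (num_bits - 1) - 1
    qmin = -(2 ** (num_bits - 1))
    scale = np.abs(x).max() / qmax            # simple max-abs scaling (illustrative)
    q = np.clip(np.round(x / scale), qmin, qmax)  # simulated integer values
    return q * scale                          # back to FP32

x = np.array([0.12, -0.5, 0.33, 1.0], dtype=np.float32)
y = fake_quantize(x)
print(y)  # values snapped to the int8 grid, still a float array
```

A real integer-only deployment would instead keep `q` and run integer matmul kernels; fake quantization only emulates the rounding error in FP32, which is why commenting out `model.model_quant()` changes accuracy but not speed.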