Ironteen opened this issue 3 years ago (Open)
Thanks for your interest in this work! Your understanding is correct: currently this work only considers the quantization of the weights and activations in the CONV and FC layers, while keeping the BN layers in full precision.
The major contribution of BSQ lies in dynamically determining the bitwidth of each layer through the training process. In practice, the fusing of BN can be done before the BSQ training process, after which the entire model can be trained with the same procedure. We will take a closer look at this issue as we extend and improve this paper into a journal submission in the near future. I agree that adding BN fusing would make this method more practical for achieving a mixed-precision model that can be processed entirely on a fixed-point processor.
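For reference, here is a minimal sketch of the standard Conv–BN folding transform mentioned above. This is not the repository's actual code; the helper name `fuse_conv_bn` and its exact form are assumptions, and it only covers the common Conv2d + BatchNorm2d case. The idea is to absorb the BN scale and shift into the convolution's weight and bias before BSQ training starts, so the fused layer can be quantized like any other CONV layer:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold BN running statistics and affine parameters into the preceding
    convolution (a hypothetical helper, not part of the BSQ repo)."""
    fused = nn.Conv2d(
        conv.in_channels, conv.out_channels,
        kernel_size=conv.kernel_size, stride=conv.stride,
        padding=conv.padding, dilation=conv.dilation,
        groups=conv.groups, bias=True,
    )

    # Per-output-channel scale: gamma / sqrt(running_var + eps)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)

    # W_fused = W * scale  (broadcast over the output-channel dimension)
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))

    # b_fused = (b - running_mean) * scale + beta
    conv_bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.copy_((conv_bias - bn.running_mean) * scale + bn.bias)

    return fused
```

After this fusion, the BN scale and shift live inside the conv's weight and bias, so the fused model contains only CONV/FC layers to quantize and can, in principle, run entirely on a fixed-point processor.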
Hi, hanrui, I am very interested in the ideas of this paper, but I have a question: in general, a complete model quantization includes not only the CONV and FC layers but also the BN layers. It seems that this work only quantizes the weights and activations of the CONV and FC layers while keeping the BN layers in full precision. Is my understanding correct?