Ironteen opened this issue 3 years ago (Open)
Thanks for your interest in this work! Your understanding is correct: currently this work only considers the quantization of the weights and activations in the CONV and FC layers, while keeping the BN layers in full precision.
The major contribution of BSQ lies in dynamically determining the bitwidth of each layer through the training process. In practice, the fusing of BN can be done before the BSQ training process, after which the entire model can be trained with the same procedure. We will take a closer look at this issue as we extend and improve this paper into a journal submission in the near future. I agree that adding BN fusing would make this method more practical for achieving a mixed-precision model that can be processed entirely on a fixed-point processor.
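For reference, here is a minimal sketch of the standard Conv–BN folding transform mentioned above. This is not the repository's actual code; the helper name `fuse_conv_bn` and its exact form are assumptions, and it only covers the common Conv2d + BatchNorm2d case. The idea is to absorb the BN scale and shift into the convolution's weight and bias before BSQ training starts, so the fused layer can be quantized like any other CONV layer:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold BN running statistics and affine parameters into the preceding
    convolution (a hypothetical helper, not part of the BSQ repo)."""
    fused = nn.Conv2d(
        conv.in_channels, conv.out_channels,
        kernel_size=conv.kernel_size, stride=conv.stride,
        padding=conv.padding, dilation=conv.dilation,
        groups=conv.groups, bias=True,
    )

    # Per-output-channel scale: gamma / sqrt(running_var + eps)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)

    # W_fused = W * scale  (broadcast over the output-channel dimension)
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))

    # b_fused = (b - running_mean) * scale + beta
    conv_bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.copy_((conv_bias - bn.running_mean) * scale + bn.bias)

    return fused
```

After this fusion, the BN scale and shift live inside the conv's weight and bias, so the fused model contains only CONV/FC layers to quantize and can, in principle, run entirely on a fixed-point processor.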
Hi, hanrui, I am very interested in the ideas of this paper, but I have a question: in general, a complete model quantization includes not only the CONV and FC layers but also the BN layers. It seems that this work only quantizes the weights and activations of the CONV and FC layers while keeping the BN layers in full precision. Is my understanding correct?