yhhhli / APoT_Quantization

PyTorch implementation for the APoT quantization (ICLR 2020)

The migration of this QAT function? #7

Open xieydd opened 4 years ago

xieydd commented 4 years ago

Thanks for your great work! Compared with the ResNet series on ImageNet, I am more concerned about smaller models such as the MobileNet or ShuffleNet series. Have you tested the QAT function on any small models, and does it work well there?

yhhhli commented 4 years ago

Hi,

Thanks for your interest in our work! We tested it on MobileNet V2 before, and the results were good as well. I will update the results ASAP.

xieydd commented 4 years ago

Thanks for your response. Are any other tricks needed for the MobileNet V2 model? If not, can I test my model with your already released code?

yhhhli commented 4 years ago

No other tricks are needed. Just note that the inverted residual blocks do not use ReLU after the second point-wise conv layer, so the next layer's activations can be negative and should use symmetric (signed) quantization. However, this might not be supported on some hardware, because activations are generally stored as unsigned numbers. You can also use unsigned activation quantization for all layers, but the accuracy will decrease a little.
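To illustrate the difference, here is a minimal sketch of simulated uniform activation quantization (not the APoT quantizer from this repo) showing signed vs. unsigned ranges; the function name and bit-width are illustrative assumptions:

```python
import torch

def quantize_activation(x, bits=4, signed=True):
    """Simulated uniform activation quantization (illustrative sketch only).

    signed=True  -> symmetric range, suits layers whose input can be negative,
                    e.g. after the ReLU-free second point-wise conv in an
                    inverted residual block.
    signed=False -> unsigned range, suits ReLU outputs and matches what most
                    integer hardware expects for activations.
    """
    if signed:
        qmax = 2 ** (bits - 1) - 1            # e.g. 7 for 4-bit signed
        scale = x.abs().max().clamp(min=1e-8) / qmax
        q = torch.clamp(torch.round(x / scale), -qmax, qmax)
    else:
        qmax = 2 ** bits - 1                  # e.g. 15 for 4-bit unsigned
        scale = x.max().clamp(min=1e-8) / qmax
        q = torch.clamp(torch.round(x / scale), 0, qmax)
    return q * scale                          # dequantize for fake-quant training

x = torch.randn(2, 8)                         # may contain negative values
print(quantize_activation(x, signed=True))    # negatives are preserved
print(quantize_activation(x, signed=False))   # negatives are clipped to zero
```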

yhhhli commented 4 years ago

P.S. You might need to use a label-smoothing loss function and cosine learning-rate decay for MobileNet V2 QAT.
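A minimal sketch of how those two ingredients could be wired up with standard PyTorch utilities; the hyperparameters (smoothing factor, learning rate, weight decay, epoch count) are assumptions for illustration, not values taken from this repo, and `label_smoothing` requires PyTorch >= 1.10:

```python
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

model = mobilenet_v2(num_classes=1000)

# Label-smoothing cross entropy (smoothing factor is an assumed value).
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

# SGD hyperparameters below are illustrative, not the paper's settings.
optimizer = torch.optim.SGD(model.parameters(), lr=0.05,
                            momentum=0.9, weight_decay=4e-5)

epochs = 90
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

for epoch in range(epochs):
    # ... run one epoch of quantization-aware training here ...
    scheduler.step()   # cosine decay of the learning rate per epoch
```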

xieydd commented 4 years ago

Very clear, thanks again. I will test my model and post my results in this issue.