yhhhli / APoT_Quantization

PyTorch implementation for the APoT quantization (ICLR 2020)

Accuracy of Implementation #5

Closed creaitr closed 4 years ago

creaitr commented 4 years ago

Hi, I have a question about the reported accuracy.

For example, you got 70.75% and 66.46% with 5 bits and 2 bits for ResNet-18 on ImageNet, respectively.

In the paper, however, 70.9% and 67.3% with 5 bits and 2 bits are reported.

Can you explain what causes these differences?

yhhhli commented 4 years ago

Hi,

The difference comes from the hyper-parameters. The clipping thresholds for weights and activations actually need a different learning rate and weight decay than the network weights. We tuned these hyper-parameters extensively to get the better results reported in the paper. In this repo, we simply use the same hyper-parameters for both alpha and the network weights.
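
For reference, a minimal sketch of how separate hyper-parameters for the clipping thresholds could be set up in PyTorch using optimizer parameter groups. This is not the repo's training script; it assumes the thresholds are registered as parameters whose names contain "alpha", and the specific values shown are placeholders, not the ones used for the paper's results.

```python
import torch

def build_optimizer(model, base_lr=0.1, base_wd=1e-4, alpha_lr=0.01, alpha_wd=0.0):
    # Split parameters into clipping thresholds and regular network weights.
    # Assumption: clipping thresholds have "alpha" in their parameter name.
    alpha_params, weight_params = [], []
    for name, param in model.named_parameters():
        if "alpha" in name:
            alpha_params.append(param)
        else:
            weight_params.append(param)

    # Give each group its own learning rate and weight decay.
    return torch.optim.SGD(
        [
            {"params": weight_params, "lr": base_lr, "weight_decay": base_wd},
            {"params": alpha_params, "lr": alpha_lr, "weight_decay": alpha_wd},
        ],
        momentum=0.9,
    )
```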

creaitr commented 4 years ago

Thank you for your clear answer!