Closed ggeor84 closed 5 years ago
Hey @ggeor84, I also noticed this mistake shortly after adding the extra evaluation step after the last quantization step. I discovered that I pruned the weights in the reverse order. Fixing the order eliminates the final accuracy drop, but with this method I only achieve around 66% accuracy instead of 69% (I have pushed the fix to this repository). However, I have not spent much time tuning the learning rate schedule and the number of epochs for each quantization step. I believe the code now correctly reflects the paper's algorithm, but I have removed the results section from the README as I have not been able to replicate their accuracy. (Note that if you look at the issues on the authors' code, other people also struggle to replicate the exact results without the published hyperparameters.)
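For reference, the magnitude-ordered partitioning can be sketched as follows (a minimal illustration of the INQ idea, not this repository's actual code; the function name and the flat-list representation are mine):

```python
def largest_fraction_indices(weights, fraction):
    """Indices of the `fraction` largest-magnitude weights.

    In INQ these are the weights quantized at the current step; the
    remaining (smaller) weights stay full-precision and are retrained.
    """
    k = int(fraction * len(weights))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return set(order[:k])

w = [0.9, -0.1, 0.5, -0.7, 0.05, 0.3]
idx = largest_fraction_indices(w, 0.5)  # the three largest-magnitude weights
```

Reversing this order (quantizing the smallest-magnitude weights first) is the bug described above.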
Thank you @Mxbonn for your contributions. While I was not able to replicate the paper's results exactly, I was able to get closer: Baseline: Acc@1 69.758, Acc@5 89.076; Quantized: Acc@1 69.660, Acc@5 89.044. Note that the authors of INQ do not quantize the bias and batch-norm parameters; I modified your code to skip those parameters. Other than that, the key to the success is your fix of quantizing the larger weights first. This still seems counter-intuitive to me, though, as I would have expected the opposite order to work better. Thank you so much for your work
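For anyone wanting to reproduce the skip, here is a rough sketch of filtering parameters by name (a hypothetical heuristic, not the actual PR code; it assumes torchvision-style ResNet naming where BatchNorm modules are called `bn1`, `layer1.0.bn2`, etc.):

```python
def should_quantize(param_name):
    """Heuristic: quantize conv/linear weights only.

    Skips all bias parameters and anything belonging to a module whose
    name starts with "bn" (BatchNorm layers in torchvision ResNets).
    """
    if param_name.endswith(".bias"):
        return False
    module_name = param_name.rsplit(".", 1)[0]
    last_module = module_name.rsplit(".", 1)[-1]
    return not last_module.startswith("bn")

names = ["conv1.weight", "bn1.weight", "bn1.bias",
         "layer1.0.bn2.weight", "fc.bias", "fc.weight"]
to_quantize = [n for n in names if should_quantize(n)]
```

In practice you would apply this filter while iterating `model.named_parameters()`; note the heuristic misses BatchNorm layers with purely numeric names (e.g. inside `downsample` blocks), so a real implementation should check module types instead.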
@ggeor84 That comes pretty close to the original paper! Did you modify any of the hyperparameters, or only remove quantizing the bias and bn parameters? Feel free to submit a PR if you want ;) I think quantizing the larger weights first works better because, due to the non-uniform quantization (powers of 2), the largest values lie farthest from their quantized values and should therefore be quantized first.
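To make the distance argument concrete, here is a small sketch of power-of-two quantization (illustrative only; `quantize_pow2`, the level range, and nearest-level rounding are my assumptions, not the paper's exact rounding rule):

```python
import math

def quantize_pow2(w, n_min=-4, n_max=-1):
    """Round w to the nearest value in {0} | {±2**n : n_min <= n <= n_max}."""
    levels = [0.0] + [2.0 ** n for n in range(n_min, n_max + 1)]
    best = min(levels, key=lambda level: abs(abs(w) - level))
    return math.copysign(best, w)

# Because the levels (0.0625, 0.125, 0.25, 0.5) get sparser as magnitude
# grows, the largest weights sit farthest from their nearest level:
for w in (0.9, 0.3, 0.06):
    err = abs(w - quantize_pow2(w))  # 0.4, 0.05, 0.0025
```

Quantizing those large, high-error weights first lets the retraining of the remaining full-precision weights compensate for the largest errors.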
I didn't; I used the parameters as you set them. I'll submit a PR in a couple of days. I made the modifications in my own repository, so I need to test them against yours first before I open the PR.
Just submitted a PR. Once again, thank you for your contributions!
After the last iterative step, I get a good accuracy (say 69.6%); however, there is then a final quantization step where 100% of the weights are quantized, and accuracy drops to 40%. You can replicate this by running the code without any changes.