Hello,

Thanks for an interesting paper. I was looking at the accuracy of the Scratch B models compared to the large unpruned networks, and Scratch B seems to perform better than the unpruned networks most of the time. This seems counter-intuitive, since a bigger network, if it could be trained effectively, should outperform a smaller one. Do you think the difference is statistically significant?

Thanks

---

Hi,

Thanks for your interest in our paper and code. Indeed, Scratch B sometimes performs better than the unpruned networks. This could be due to the following reasons:

For the CIFAR-10 experiments, we report the average of five runs; for ImageNet, each result comes from a single run. It is safe to say that the difference on CIFAR-10 is statistically significant.
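For reference, here is a minimal sketch of how a five-run comparison like this could be checked for statistical significance. The accuracy values below are hypothetical placeholders, not numbers from the paper, and this is not necessarily the authors' actual procedure:

```python
# Minimal sketch: compare two sets of 5-run CIFAR-10 accuracies with Welch's t-test.
# The accuracy values are hypothetical placeholders, not results from the paper.
from scipy import stats

unpruned = [93.5, 93.6, 93.4, 93.7, 93.5]   # hypothetical 5 runs, unpruned model (acc %)
scratch_b = [93.8, 93.9, 93.7, 94.0, 93.8]  # hypothetical 5 runs, Scratch B model (acc %)

# Welch's t-test does not assume equal variances between the two groups.
t_stat, p_value = stats.ttest_ind(unpruned, scratch_b, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

# A small p-value (e.g. < 0.05) suggests the gap between the means is
# unlikely to be explained by run-to-run variance alone.
```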