zjysteven opened this issue 2 years ago (status: Open)
Hi, I used a pre-trained ResNet-20 from this repository and also got worse results than the reported ones. I believe the choice of pre-trained model matters too, so it would be much appreciated if the author could release the pre-trained models for CIFAR-10 and CIFAR-100.
Hi,
I have a question and some clarification would be appreciated.
The CIFAR-10 experiment in the paper uses ResNet-20 as the architecture, which is much smaller than commonly used ones like WRN-28-10 or DenseNet-100. Is there a particular reason for choosing ResNet-20 here? Also, when I run GradNorm on a WRN-28-10, I get worse results than those reported for ResNet-20 (I'm using the gradients of the last FC layer here). Did you observe this as well?
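For reference, here is a minimal sketch of a score computed from last-FC-layer gradients. It is not the repository's code. It assumes the score is the L1 norm of the gradient of KL(uniform || softmax) with respect to the final layer's weight matrix, with temperature 1 and a final layer of the form z = W h + b. The function names (`softmax`, `gradnorm_last_fc`) and the toy inputs are hypothetical. Under these assumptions the gradient has the closed form dL/dW[c][d] = (p_c - 1/C) * h_d, so no autograd library is needed.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gradnorm_last_fc(features, weight, bias):
    """L1 norm of d KL(u || softmax(z)) / dW for a final layer z = W h + b.

    Assumption (not taken from the thread): the loss is the KL divergence
    from the uniform distribution u to the softmax output, at temperature 1.
    features: penultimate-layer activations h, length D
    weight:   C rows of length D; bias: length C
    """
    num_classes = len(weight)
    logits = [
        sum(w * h for w, h in zip(row, features)) + b
        for row, b in zip(weight, bias)
    ]
    probs = softmax(logits)
    # Gradient of the loss w.r.t. each logit: p_c - 1/C.
    dz = [p - 1.0 / num_classes for p in probs]
    # Gradient w.r.t. W is the outer product of dz and the features.
    grad_w = [[g * h for h in features] for g in dz]
    # The score is the sum of absolute values of all gradient entries.
    return sum(abs(v) for row in grad_w for v in row)


# Toy example with hypothetical values: 2 classes, 2 features.
score = gradnorm_last_fc([1.0, 2.0], [[1.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
print(score)
```

Because the gradient is an outer product, the score factorizes into the L1 distance of the softmax output from uniform times the L1 norm of the penultimate features. The score therefore depends on the feature scale of the backbone as well as on the logits. A constant factor in the loss rescales all scores equally and does not change their ranking.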
Thanks