google-research / uda

Unsupervised Data Augmentation (UDA)
https://arxiv.org/abs/1904.12848
Apache License 2.0

This is not WideResNet-28-2 #77

Closed jason718 closed 4 years ago

jason718 commented 4 years ago

The model for cifar10 and svhn does not seem to be the standard WideResNet:

(1) When the filter sizes don't match, this code uses avg_pooling and zero_pad to deal with the mismatch, while WideResNet uses a 1x1 conv layer. Because of this, there are only 25 conv layers, not 28 (see the sketch below). https://github.com/google-research/uda/blob/602483dbca113567b32e7395e5c0eadd3cf7e776/image/randaugment/wrn.py#L74

(2) There is one extra skip connection across multiple blocks. https://github.com/google-research/uda/blob/602483dbca113567b32e7395e5c0eadd3cf7e776/image/randaugment/wrn.py#L155
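For illustration, here is a minimal TF1-style sketch contrasting the two shortcut variants, assuming NHWC tensors; the function names are hypothetical and this is not the repo's actual code:

```python
import tensorflow as tf

def shortcut_avgpool_zeropad(x, in_filters, out_filters, stride):
    # Original wrn.py style: a parameter-free shortcut. Downsample with
    # average pooling, then zero-pad the channel dimension to match
    # out_filters. No conv layer is added here, so the shortcut does not
    # count toward the depth, leaving only 25 conv layers.
    if stride > 1:
        x = tf.nn.avg_pool(x, ksize=[1, stride, stride, 1],
                           strides=[1, stride, stride, 1], padding='VALID')
    if in_filters < out_filters:
        pad = (out_filters - in_filters) // 2
        x = tf.pad(x, [[0, 0], [0, 0], [0, 0], [pad, pad]])
    return x

def shortcut_1x1_conv(x, out_filters, stride):
    # Standard WideResNet shortcut: a strided 1x1 convolution projects
    # to the new channel count and downsamples at the same time. These
    # three shortcut convs do count toward the 28 layers of WRN-28-2.
    return tf.layers.conv2d(x, out_filters, kernel_size=1, strides=stride,
                            padding='same', use_bias=False)
```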

michaelpulsewidth commented 4 years ago

Hi, thanks for pointing this out! We adopted the architecture file from here, and the author of the file confirmed that these changes do not lead to different results. I have updated the file so that it is now the standard WRN-28-2.

I experimented with the original architecture and the updated, standard architecture on CIFAR-10 and SVHN and did not see a difference. Here are the detailed comparisons (average of 5 runs for the original file and average of 10 runs for the updated file):

CIFAR-10 accuracy (%):

| Labeled examples | Original | Updated |
| --- | --- | --- |
| 4000 | 95.50 ± 0.15 | 95.68 ± 0.08 |
| 2000 | 95.21 ± 0.11 | 95.27 ± 0.14 |
| 1000 | 95.11 ± 0.16 | 95.25 ± 0.10 |
| 500 | 94.93 ± 0.10 | 95.20 ± 0.09 |
| 250 | 94.52 ± 0.22 | 94.57 ± 0.96 |

SVHN accuracy (%):

| Labeled examples | Original | Updated |
| --- | --- | --- |
| 4000 | 97.76 ± 0.05 | 97.72 ± 0.10 |
| 2000 | 97.92 ± 0.04 | 97.80 ± 0.06 |
| 1000 | 97.71 ± 0.07 | 97.77 ± 0.07 |
| 500 | 97.73 ± 0.08 | 97.73 ± 0.09 |
| 250 | 97.36 ± 0.13 | 97.28 ± 0.40 |

Note that these results were obtained with EMA (an exponential moving average of the weights) enabled. Please do a git pull to update your file.
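For reference, a minimal self-contained sketch of how an EMA of the weights is typically maintained in TF1; the loss, optimizer, and decay value are illustrative placeholders, not necessarily what this repo uses:

```python
import tensorflow as tf

# Hypothetical toy loss/optimizer, just to make the sketch runnable.
w = tf.get_variable('w', shape=[], initializer=tf.zeros_initializer())
loss = tf.square(w - 1.0)
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# Maintain an exponential moving average of the trainable variables;
# 0.999 is an illustrative decay, not necessarily the value UDA uses.
ema = tf.train.ExponentialMovingAverage(decay=0.999)
with tf.control_dependencies([train_op]):
    train_op_with_ema = ema.apply(tf.trainable_variables())

# At eval time, restore the shadow (averaged) weights instead of the
# raw ones, e.g. with a saver built from the EMA's variable map.
saver = tf.train.Saver(ema.variables_to_restore())
```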

I am closing this issue. Feel free to reopen if you have further questions.