Closed by chrissmiller 4 years ago
Hey, thanks for your question. If you see catastrophic overfitting at a higher epsilon, you can lower the step size until it no longer occurs. I ran our code for your example (epsilon=12/255) and found that with alpha=13/255 rather than 15/255, the model does not catastrophically overfit and reaches 47% PGD accuracy and 74% clean accuracy. Let me know if you have any further questions.
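For anyone who wants to automate the "lower the step size until it no longer overfits" advice, here is a minimal sketch. It is not part of the repository's code: `find_stable_alpha` and the `is_stable` callback are hypothetical names, and `is_stable` is assumed to wrap a full training run that returns `True` when no catastrophic overfitting was observed for the given (epsilon, alpha) pair.

```python
from fractions import Fraction


def find_stable_alpha(epsilon, start_alpha, is_stable, step=Fraction(1, 255)):
    """Lower alpha from start_alpha in fixed decrements until is_stable accepts it.

    is_stable(epsilon, alpha) is a hypothetical callback: it is assumed to
    train a model with these settings and report whether training stayed
    free of catastrophic overfitting. Returns None if no alpha > 0 passes.
    """
    alpha = start_alpha
    while alpha > 0:
        if is_stable(epsilon, alpha):
            return alpha
        alpha -= step
    return None


# Stand-in callback for illustration only. A real check would train the
# model; this stub simply accepts any alpha at or below 13/255.
def fake_is_stable(epsilon, alpha):
    return alpha <= Fraction(13, 255)


epsilon = Fraction(12, 255)
chosen = find_stable_alpha(epsilon, Fraction(5, 4) * epsilon, fake_is_stable)
```

Each call to a real `is_stable` costs a full training run, so a coarse decrement such as 1/255 keeps the search short.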
I'm using the FGSM approach to train a ResNet18 model on CIFAR10.
Using the paper's values of epsilon=8/255 and alpha=10/255 works fine. But when I increase epsilon to 12/255 (with alpha=1.25*epsilon as outlined in the paper, so 15/255) to compare against other robust models, the model catastrophically overfits relatively early, with very low clean accuracy (50 to 60%). Has anyone had success using this approach with an epsilon higher than 8/255? Does alpha=1.25*epsilon not apply for other values of epsilon?
Thanks in advance for any help you can provide.
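For reference, the role of epsilon and alpha in the question can be sketched in plain Python. This is an illustrative sketch of an FGSM step with random initialization, not the repository's code: `fgsm_delta` and `grad_fn` are hypothetical names, `grad_fn` is assumed to return the loss gradient with respect to the input, and inputs are assumed to lie in [0, 1].

```python
import random


def fgsm_delta(x, grad_fn, epsilon, alpha, rng):
    """One FGSM step with random initialization, per coordinate.

    grad_fn is a hypothetical callback returning d(loss)/d(input) as a list.
    """
    # Start from a uniform random point inside the epsilon ball.
    delta = [rng.uniform(-epsilon, epsilon) for _ in x]
    grad = grad_fn([xi + di for xi, di in zip(x, delta)])
    result = []
    for xi, di, gi in zip(x, delta, grad):
        sign = (gi > 0) - (gi < 0)
        # Step by alpha along the gradient sign, then project back to the ball.
        di = max(-epsilon, min(epsilon, di + alpha * sign))
        # Keep the perturbed input inside the assumed valid range [0, 1].
        di = max(0.0, min(1.0, xi + di)) - xi
        result.append(di)
    return result


epsilon = 12 / 255
alpha = 1.25 * epsilon  # 15/255, the setting described in the question
rng = random.Random(0)
x = [0.5] * 8
# Toy gradient that is positive everywhere, for illustration only.
delta = fgsm_delta(x, lambda z: [1.0] * len(z), epsilon, alpha, rng)
```

Because alpha is larger than epsilon here, a coordinate with a positive gradient sign always ends up between alpha - epsilon and epsilon after the projection, wherever the random start was.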