locuslab / fast_adversarial

[ICLR 2020] A repository for extremely fast adversarial training using FGSM

Model overfits with low test accuracy for higher epsilon values #4

Closed: chrissmiller closed this issue 4 years ago

chrissmiller commented 4 years ago

I'm using the FGSM approach to train a ResNet18 model on CIFAR10.

Using the values from the paper, epsilon=8/255 and alpha=10/255, works fine. But when I extend to epsilon=12/255 (with alpha=1.25*epsilon as outlined in the paper, so 15/255) to compare against other robust models, the model catastrophically overfits relatively early and ends up with very low clean accuracy (50 to 60%). Has anyone had success using this approach with an epsilon higher than 8/255? Does alpha=1.25*epsilon not apply for other values of epsilon?

Thanks in advance for any help you can provide.
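For reference, the step-size arithmetic in the question and the per-pixel perturbation update it feeds into can be sketched as below. This is a minimal sketch and not the repository's code: `step_size` and `fgsm_random_init` are hypothetical names, the gradient signs are passed in as plain numbers instead of being computed from a model at the random start point, and the update shown (uniform random start in [-epsilon, epsilon], one step of size alpha along the gradient sign, then a clip back to the epsilon ball) is my reading of FGSM with random initialization.

```python
import random


def step_size(epsilon, ratio=1.25):
    """Step size as a fixed multiple of epsilon (1.25 is the ratio quoted in the question)."""
    return ratio * epsilon


def fgsm_random_init(grad_signs, epsilon, alpha, seed=0):
    """One FGSM step from a random start, applied per pixel.

    grad_signs is a list of -1.0 / 0.0 / +1.0 values. In real training these
    would come from the loss gradient evaluated at the random start; here
    they are supplied directly, which is a simplifying assumption.
    """
    rng = random.Random(seed)
    delta = []
    for sign in grad_signs:
        start = rng.uniform(-epsilon, epsilon)   # random initialization
        stepped = start + alpha * sign           # single signed-gradient step
        delta.append(max(-epsilon, min(epsilon, stepped)))  # clip to the epsilon ball
    return delta


eps = 12 / 255
alpha = step_size(eps)  # 15/255 for epsilon = 12/255
delta = fgsm_random_init([1.0, -1.0, 1.0, 0.0], eps, alpha)
```

Because alpha is larger than epsilon here, any pixel with a nonzero gradient sign moves at least 0.25*epsilon past zero in that direction before clipping, so the final perturbation always lands in the outer part of the ball on that side.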

leslierice1 commented 4 years ago

Hey, thanks for your question. If you see catastrophic overfitting at a higher epsilon, you can lower the step size until it no longer occurs. I ran our code for your example, epsilon=12/255, and found that with alpha=13/255 (rather than 15/255) the model does not catastrophically overfit and reaches 47% PGD accuracy and 74% clean accuracy. Let me know if you have any further questions on this.
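The "lower the step size until you no longer overfit" advice amounts to a simple downward search over alpha. The sketch below is one way to organize it, under stated assumptions: `largest_safe_alpha` is a hypothetical helper, alpha and epsilon are expressed in units of 1/255, and `overfits` is a caller-supplied callback that would train a model at the given alpha and report whether catastrophic overfitting occurred. How to detect that is left to the caller, since the thread does not specify a criterion.

```python
def largest_safe_alpha(epsilon_255, overfits, ratio=1.25, step_255=1.0):
    """Walk alpha down from ratio * epsilon until `overfits` returns False.

    All values are in units of 1/255. `overfits(alpha_255)` is a hypothetical
    callback standing in for a full training run plus a robustness check.
    Returns the first alpha that does not overfit, or None if none is found.
    """
    alpha = ratio * epsilon_255
    while alpha > 0:
        if not overfits(alpha):
            return alpha
        alpha -= step_255
    return None


# Toy stand-in for a training run. The threshold is made up for the demo
# and is not a measurement from the repository.
def toy_overfits(alpha_255):
    return alpha_255 > 13


found = largest_safe_alpha(12, toy_overfits)
```

Each call to a real `overfits` is a full training run, so in practice a coarse step (or a few hand-picked candidates, as in the reply above) is likely more practical than a fine sweep.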