robust-ml / robust-ml.github.io

A community-run reference for state-of-the-art adversarial example defenses.
https://www.robust-ml.org/
Creative Commons Attribution Share Alike 4.0 International

Combatting and detecting FGSM and PGD adversarial noise. #3

Closed: jngannon closed this issue 5 years ago

jngannon commented 5 years ago

Defense: Linear output layer (no softmax), trained with a quadratic cost, with weights pruned based on the smallest mean activated value.

Write-up: https://jngannon.github.io/FGSM_Article

Authors: James Gannon (james.gannon.82@gmail.com)

Code: https://github.com/jngannon/robustml-test-analysis

Does the code implement the robust-ml API and include pre-trained models: Yes

Claims: Fully connected networks: 78.2% robust accuracy against epsilon=0.1
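For readers, a minimal sketch of the defense as described above: a linear output layer with no softmax, a quadratic (MSE) cost on one-hot targets, and pruning of the weights with the smallest mean activated value. The layer sizes, pruning fraction, and the exact pruning statistic are illustrative assumptions, not taken from the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch only: layer sizes, pruning fraction, and the exact pruning
# statistic are assumptions, not taken from the submitted code.
model = nn.Sequential(
    nn.Linear(784, 512), nn.ReLU(),
    nn.Linear(512, 10),              # linear output layer, no softmax
)
mse = nn.MSELoss()                   # quadratic cost on one-hot targets

def quadratic_loss(logits, labels):
    return mse(logits, F.one_hot(labels, 10).float())

def prune_by_mean_activated_value(layer, inputs, fraction=0.1):
    """Zero the weights of `layer` whose mean activated value
    (|w_ij| times the mean |a_j| of the layer's inputs) is smallest."""
    with torch.no_grad():
        mean_act = inputs.abs().mean(dim=0)        # (in_features,)
        score = layer.weight.abs() * mean_act      # (out_features, in_features)
        k = max(1, int(fraction * score.numel()))
        threshold = score.flatten().kthvalue(k).values
        layer.weight.mul_((score > threshold).float())
```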

dtsip commented 5 years ago

Thank you for your submission! Would it be possible to refine your claims? The claims should correspond to the robust accuracy of your strongest model (be it fully-connected or convolutional) against different values of epsilon (preferably one, to avoid cluttering the table). Specifically, we don't create separate entries for each attack (FGSM or PGD) and we don't include the improvement over the original baseline. All we list is the claimed robust accuracy, e.g., "80% robust accuracy against epsilon=0.3". (Robust accuracy corresponds to the accuracy against the worst-case attack you can come up with.)
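For concreteness, "robust accuracy against epsilon" in the sense above can be sketched as the fraction of test points that remain correctly classified under the strongest attack tried within the L-inf ball of radius epsilon. The function and attack names below are placeholders, not part of the robustml API:

```python
import numpy as np

def robust_accuracy(classify, attacks, xs, ys, epsilon):
    """Accuracy under the worst case over all attempted attacks.
    `classify` maps an input to a label; each element of `attacks`
    maps (x, y, epsilon) to an adversarial example. Placeholder names."""
    robust_count = 0
    for x, y in zip(xs, ys):
        robust = classify(x) == y
        for attack in attacks:
            if not robust:
                break
            x_adv = attack(x, y, epsilon)
            # the perturbation must stay inside the threat model
            assert np.max(np.abs(x_adv - x)) <= epsilon + 1e-6
            robust = classify(x_adv) == y
        robust_count += int(robust)
    return robust_count / len(xs)
```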

While not technically an obstacle to including the defense in RobustML, we noticed a few worrisome issues with the defense:

You may want to consider addressing these issues before listing the defense here.

jngannon commented 5 years ago

Thanks for the notes, I appreciate the feedback. I will address these issues as soon as I can and tidy up the presentation and claims.

jngannon commented 5 years ago

I have fixed up the claims. The slight differences in the results come from using all 10,000 data points of the MNIST test set for the accuracy evaluation, compared to 1,000 for generating the plots and tables in the blog, and from the problem with generating PGD noise (I used a bigger learning rate). Thanks for your patience reading this; it is independent research and I don't have anyone to give me feedback.
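For context on the "learning rate" remark above: in L-inf PGD the step size controls how far each signed-gradient step moves before projecting back into the epsilon ball, and a poorly chosen step size can weaken the attack. A minimal sketch, with the step size alpha and step count as assumed defaults rather than the author's settings:

```python
import torch

def pgd_linf(model, loss_fn, x, y, epsilon, alpha=0.01, steps=40):
    """Illustrative L-inf PGD; alpha is the per-step 'learning rate'.
    The default values are common choices, not the author's settings."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                # ascend the loss
            x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)   # project into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)                      # keep a valid pixel range
        x_adv = x_adv.detach()
    return x_adv
```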

anishathalye commented 5 years ago

Your changes look pretty good!

Could you rephrase your claim just in terms of the perturbation bound rather than saying "accuracy against PGD noise"? While using PGD is a good way to evaluate a defense, we shouldn't require that the attacker use a specific attack method, only that the perturbation stays within the given threat model.

Also, could you make the code (self._threat_model = robustml.threat_model.Linf(epsilon=0.2)) match your stated claim?
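For reference, the epsilon declared in the model's robustml threat model should equal the epsilon in the stated claim. A minimal sketch assuming the robustml package's Model interface (dataset, threat_model, classify); the class name and the epsilon value shown are illustrative:

```python
import robustml

class PrunedMLP(robustml.model.Model):
    """Hypothetical wrapper; the epsilon declared here must match the claim."""

    def __init__(self):
        self._dataset = robustml.dataset.MNIST()
        # set this to the same epsilon as the stated claim
        self._threat_model = robustml.threat_model.Linf(epsilon=0.1)

    @property
    def dataset(self):
        return self._dataset

    @property
    def threat_model(self):
        return self._threat_model

    def classify(self, x):
        # return the predicted label for a single input x
        raise NotImplementedError
```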

jngannon commented 5 years ago

I have made a couple of corrections. Please excuse the late reply; I have been travelling. Thanks

dtsip commented 5 years ago

Thank you for your edits. Added the defense to the preprints table.