Modified Square Attack to Randomized Defense

fra31 / auto-attack

Code relative to "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks"

MIT License


Hi,

sorry for the late reply.

Square Attack accepts a candidate update only if it improves the loss. For randomized defenses, the idea is to compute the average loss (e.g. here) over multiple forward passes, following the EoT (Expectation over Transformation) principle, instead of the usual single pass, and to accept the update only if the average loss is better than the current best one. This can mitigate the randomness of the forward pass. I'd also remove the early stopping (by default, once a point is misclassified the attack is no longer run on it), since for randomized defenses it might help to keep maximizing the confidence or margin of the misclassification.
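To make the acceptance rule concrete, here is a minimal sketch in plain Python. It is not the auto-attack implementation: `eot_loss`, `random_search`, `noisy_margin` and `propose` are hypothetical names, the 1-D toy loss stands in for a randomized model, and "lower loss is better" is an assumption of the sketch.

```python
import random
from statistics import fmean


def eot_loss(loss_fn, x, n_passes=10):
    """Average a stochastic loss over several forward passes (EoT-style)."""
    return fmean(loss_fn(x) for _ in range(n_passes))


def random_search(loss_fn, x0, propose, n_iters=200, n_passes=10):
    """Random search that accepts a candidate only if its averaged loss improves.

    Assumption of this sketch: lower loss is better.
    There is no early stopping: the loop always runs for n_iters iterations.
    """
    x_best = x0
    loss_best = eot_loss(loss_fn, x_best, n_passes)
    for _ in range(n_iters):
        x_new = propose(x_best)
        loss_new = eot_loss(loss_fn, x_new, n_passes)
        if loss_new < loss_best:  # compare averaged losses, not single passes
            x_best, loss_best = x_new, loss_new
    return x_best, loss_best


# Toy stand-in for a randomized defense: a 1-D loss with additive noise.
rng = random.Random(0)


def noisy_margin(x):
    return (x - 3.0) ** 2 + rng.gauss(0.0, 0.5)


def propose(x):
    # Random perturbation, clipped to [-5, 5] like a bounded threat model.
    return max(-5.0, min(5.0, x + rng.uniform(-0.5, 0.5)))


x_adv, loss_adv = random_search(noisy_margin, 0.0, propose)
```

Note that the stored best loss is itself a noisy estimate, so with few passes a lucky low value can block later updates; increasing `n_passes` reduces this effect at the cost of more queries.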

These modifications are not implemented in the current version, since overall they didn't lead to a strong improvement over APGD.

Let me know if you have further questions!

Modified Square Attack to Randomized Defense #56