MadryLab / robustness

A library for experimenting with, training and evaluating neural networks, with a focus on adversarial robustness.

The loss-aggregation for the attacker should be `sum` not `mean` #115

Open rsokl opened 2 years ago

rsokl commented 2 years ago

https://github.com/MadryLab/robustness/blob/a9541241defd9972e9334bfcdb804f6aefe24dc7/robustness/attacker.py#L195

Assuming that you are solving for per-datum perturbations, and not a broadcast (or uniform) perturbation, the loss aggregation performed prior to backprop should be `sum`, not `mean`. With `mean`, the gradient of each perturbation in the batch is scaled by the inverse batch size, whereas each perturbation's gradient should be independent of batch size. Obviously, this does not affect methods where the gradient is normalized.
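
For concreteness, here is a minimal sketch (not using the library's attacker; the model and tensors are made up) showing that the `mean`-reduced gradient on a per-datum perturbation is just the `sum`-reduced gradient scaled by `1/batch_size`:

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 3)
x = torch.randn(8, 4)                              # batch of 8 inputs
y = torch.randint(0, 3, (8,))
delta = torch.zeros_like(x, requires_grad=True)    # per-datum perturbation

loss_mean = torch.nn.functional.cross_entropy(model(x + delta), y, reduction="mean")
g_mean, = torch.autograd.grad(loss_mean, delta)

loss_sum = torch.nn.functional.cross_entropy(model(x + delta), y, reduction="sum")
g_sum, = torch.autograd.grad(loss_sum, delta)

# The mean-reduced gradient is the sum-reduced gradient scaled by 1/batch_size,
# so any step that uses the raw gradient magnitude depends on the batch size.
print(torch.allclose(g_mean * x.shape[0], g_sum))  # True
```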

cdluminate commented 2 years ago

mean = sum / N, and thus ∂(mean)/∂(input) = (1/N) · ∂(sum)/∂(input). Since PGD uses the sign of the gradient, sign(∂(mean)/∂(input)) = sign((1/N) · ∂(sum)/∂(input)) = sign(∂(sum)/∂(input)), so `mean` leads to the same result as `sum`.
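
A tiny self-contained illustration of that point (the gradient here is a stand-in tensor, not one computed by the library):

```python
import torch

torch.manual_seed(0)
g_sum = torch.randn(8, 4)    # stand-in for a sum-reduced gradient
g_mean = g_sum / 8           # mean reduction only rescales it by 1/N

# sign() discards the positive 1/N factor, so a signed-gradient step
# (FGSM / L-inf PGD style) is identical under either reduction.
print(torch.equal(g_mean.sign(), g_sum.sign()))  # True
```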

rsokl commented 2 years ago

Right, as I stated: "obviously, this does not affect methods where the gradient is normalized." The point is that this happens not to affect methods like FGSM because of the signed gradient, but other methods would yield incorrect behavior. A sketch of such a case is below.
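
For example, a hedged sketch (hypothetical names, not the library's attacker) of an update that uses the raw, unnormalized gradient: under `mean` the effective step shrinks by a factor of `1/batch_size`:

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 3)
x, y = torch.randn(8, 4), torch.randint(0, 3, (8,))
step_size = 0.1

def raw_gradient_step(reduction):
    delta = torch.zeros_like(x, requires_grad=True)
    loss = torch.nn.functional.cross_entropy(model(x + delta), y, reduction=reduction)
    grad, = torch.autograd.grad(loss, delta)
    return step_size * grad                      # no sign(), no normalization

print(raw_gradient_step("mean").abs().max())     # ~8x smaller than the line below
print(raw_gradient_step("sum").abs().max())
```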

cdluminate commented 2 years ago

Indeed. Whenever sign(grad) is not in the update equation, this will trigger subtle bugs for people who want to implement custom algorithms on top of the attacker.