ashafahi / free_adv_train

Official TensorFlow Implementation of Adversarial Training for Free! which trains robust models at no extra cost compared to natural training.
https://arxiv.org/abs/1904.12843

Already Implemented In cleverhans repo #1

Closed: 2prime closed this issue 5 years ago

2prime commented 5 years ago

https://github.com/tensorflow/cleverhans has a BACKPROP_THROUGH_ATTACK attribute, which is exactly the idea in your paper.

ashafahi commented 5 years ago

Thanks for giving us the opportunity to clarify that this is not the case.

There are many reasons why what our work proposes is novel: we propose simultaneously updating the perturbation and the network's weights at no extra cost using the idea of mini-batch replay. This allows us to adversarially train models at no additional cost compared to natural training. What we do is very different from simply back-propagating through the attack, which, as mentioned in the official cleverhans repo, increases the training cost!
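For readers unfamiliar with the mini-batch replay idea: each mini-batch is replayed m times, and a single backward pass per replay yields both the weight gradient and the input gradient, so the perturbation update comes for free. The following is a minimal numpy sketch on a toy least-squares model, not the repo's actual TensorFlow implementation; the function name, the toy loss, and the hyperparameter defaults are illustrative (only m and eps follow the paper's notation).

```python
import numpy as np

def free_adv_train(X, y, epochs=10, m=4, lr=0.05, eps=0.1):
    """Sketch of 'free' adversarial training via mini-batch replay.

    Toy least-squares model: loss = 0.5 * ((x + delta) @ w - y)**2.
    One backward pass gives gradients w.r.t. both w and the input,
    so updating the perturbation adds no extra cost.
    """
    n, d = X.shape
    w = np.zeros(d)
    delta = np.zeros(d)            # perturbation persists across replays
    for _ in range(epochs):
        for i in range(n):         # one example stands in for a mini-batch
            for _ in range(m):     # replay the same batch m times
                x_adv = X[i] + delta
                err = x_adv @ w - y[i]
                grad_w = err * x_adv   # gradient w.r.t. weights
                grad_x = err * w       # gradient w.r.t. the input
                # simultaneous updates from the same backward pass:
                w -= lr * grad_w
                delta = np.clip(delta + eps * np.sign(grad_x), -eps, eps)
    return w
```

Note how this differs from PGD-based adversarial training, which runs several attack iterations (each with its own backward pass) before every single weight update.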

BACKPROP_THROUGH_ATTACK is a parameter that is set to False by default. When False, tf.stop_gradient is called on the adversarial example generation operation. When set to True, it expands the computation graph and therefore increases the cost of training, as also mentioned in the official cleverhans repo.
This is because setting that parameter to True has nothing to do with simultaneous updates of the adversarial example and the weights. Also, as mentioned in the official repo, this parameter does nothing for FGSM-based attacks such as L-infinity BIM or PGD, which iteratively call FGSM (see here). As mentioned in the multi-GPU implementation in the official repo, it is hard-coded to False because, with it, they can't get the speedup.
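Both points can be checked numerically on a toy scalar model. This sketch is illustrative and not cleverhans code: with a smooth (non-sign) attack step, backpropagating through the attack changes the gradient (the expanded graph contributes an extra term), while with an FGSM-style step the sign() function is piecewise constant, so that extra path contributes zero and the parameter is a no-op.

```python
import numpy as np

# Toy scalar setup: loss(w, x) = (w*x - y)**2, one attack step on x.
x, y, eps, w, h = 1.0, 0.5, 0.1, 0.8, 1e-6

def loss(w_, x_in):
    return (w_ * x_in - y) ** 2

def grad_x(w_, x_in):                  # d loss / d x
    return 2 * (w_ * x_in - y) * w_

def smooth_attack(w_):                 # gradient-ascent step, differentiable in w
    return x + eps * grad_x(w_, x)

def fgsm_attack(w_):                   # FGSM step: sign() blocks the gradient path
    return x + eps * np.sign(grad_x(w_, x))

def full_grad(attack):                 # d/dw loss(w, attack(w)), finite differences
    return (loss(w + h, attack(w + h)) - loss(w - h, attack(w - h))) / (2 * h)

def stopped_grad(attack):              # stop_gradient semantics: x_adv is a constant
    x_adv = attack(w)
    return 2 * (w * x_adv - y) * x_adv

# nonzero: backprop through a smooth attack expands the graph
print(full_grad(smooth_attack) - stopped_grad(smooth_attack))
# ~0: for a sign()-based attack the extra path contributes nothing
print(full_grad(fgsm_attack) - stopped_grad(fgsm_attack))
```

In other words, even when the extra graph is built, the derivative of sign() is zero almost everywhere, which is exactly why cleverhans hard-codes the flag to False for its FGSM-based attacks.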