rwightman / pytorch-nips2017-attack-example

A PyTorch baseline attack example for the NIPS 2017 adversarial competition
https://www.kaggle.com/c/nips-2017-targeted-adversarial-attack
Apache License 2.0

L2 distance between adversarial example and the original input data #4

Open kkew3 opened 6 years ago

kkew3 commented 6 years ago

In attacks.AttackCarliniWagnerL2._optimize there's:

if input_orig is None:
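    # note: at this point input_var is the tanh-space tensor, not the original image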
    dist = l2_dist(input_adv, input_var, keepdim=False)

The problem is that input_var has already been mapped into tanh-space, so it is not actually the original input, whereas the adversarial example input_adv has already been rescaled back into image space. Without mapping input_var back to image space as well, the computed dist is not the true L2 distance between the adversarial example and the original image. Carlini's reference implementation performs exactly this mapping back before computing the distance. Thanks for checking!
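
For illustration, here is a minimal, self-contained sketch of the mismatch and the suggested fix. The torch_arctanh, tanh_rescale, and l2_dist definitions below are rough stand-ins for this repo's helpers (the exact signatures there may differ), and the input shape, range, and zero modifier are made up for the demo:

import torch

def torch_arctanh(x, eps=1e-6):
    # image space -> unbounded tanh-space
    x = x * (1.0 - eps)
    return 0.5 * torch.log((1 + x) / (1 - x))

def tanh_rescale(x, x_min=-1.0, x_max=1.0):
    # tanh-space -> image space in [x_min, x_max]
    return (torch.tanh(x) + 1) * 0.5 * (x_max - x_min) + x_min

def l2_dist(x, y, keepdim=True):
    # sum of squared differences per batch element
    return ((x - y) ** 2).view(x.size(0), -1).sum(dim=1, keepdim=keepdim)

image = torch.rand(1, 3, 8, 8) * 2 - 1              # hypothetical original image in [-1, 1]
input_var = torch_arctanh(image)                    # tanh-space representation
modifier_var = torch.zeros_like(input_var)          # zero perturbation for the demo
input_adv = tanh_rescale(modifier_var + input_var)  # adversarial example, back in image space

# Buggy: mixes image space and tanh-space; nonzero even with zero perturbation.
print(l2_dist(input_adv, input_var, keepdim=False))
# Suggested fix: map input_var back to image space first; prints ~0 here.
print(l2_dist(input_adv, tanh_rescale(input_var), keepdim=False))

Equivalently, the image-space original could be passed in as input_orig so that the tanh-space comparison in the None branch is never reached.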