Closed feather0011 closed 4 years ago
Sorry, I notice that `attack_pgd` is for evaluation only (not for PGD training), and in evaluation, if a test image is already fooled by a weak adversarial perturbation, a stronger perturbation does not need to be computed.
Hi, thank you for the great work and for open-sourcing the code.
However, I have a question about the PGD evaluation.
In the code, when `attack_pgd` is called, it seems that for some images in a batch, the adversarial perturbation is obtained with fewer steps than `attack_iters`. During the iteration, updates on the perturbation `delta` are applied only to the images that are still classified correctly (`index` is the variable that indicates which images are classified correctly, and within `delta`, only `delta[index[0]]` is updated in the loop `for _ in range(attack_iters):`). I understand that the images that are classified correctly are not yet adversarial examples, so more searching in the l-inf ball should be performed to seek an adversarial perturbation.
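If I read the code correctly, the update logic is roughly the following. This is a simplified, self-contained sketch of the behavior I am describing, not the repo's actual implementation; `epsilon` and `alpha` stand in for the repo's attack parameters, and input-range clamping is omitted for brevity:

```python
import torch
import torch.nn.functional as F

def attack_pgd_sketch(model, X, y, epsilon, alpha, attack_iters):
    """Sketch of the masked PGD update: only images the model still
    classifies correctly receive delta updates; images that are
    already misclassified keep their current perturbation."""
    delta = torch.zeros_like(X).uniform_(-epsilon, epsilon)
    delta.requires_grad = True
    for _ in range(attack_iters):
        output = model(X + delta)
        # indices of images that are still classified correctly
        index = torch.where(output.max(1)[1] == y)
        if len(index[0]) == 0:
            break  # every image in the batch is already misclassified
        loss = F.cross_entropy(output, y)
        loss.backward()
        grad = delta.grad.detach()
        # signed-gradient ascent step, applied ONLY to the correctly
        # classified images, projected back into the l-inf ball
        d = delta[index[0]].detach()
        g = grad[index[0]]
        delta.data[index[0]] = torch.clamp(d + alpha * torch.sign(g),
                                           -epsilon, epsilon)
        delta.grad.zero_()
    return delta.detach()
```

So once an image is misclassified, its row of `delta` is frozen for the rest of the loop, which is exactly the behavior my question is about.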
However, I don't understand why the search should stop for the images that become misclassified in an early step of the PGD iteration.
I would expect that a stronger adversarial perturbation could be found by performing more gradient ascent iterations, even for images that are already adversarial. In other words, I suspect that the PGD evaluation is performed with relatively weak adversarial examples.
These may be adversarial examples at a smaller distance from the original image (not exactly, but approximately), but not strong adversarial examples. And I think the strength of the adversarial examples is crucial, because the main claim of the paper is that training with FGSM can build a model that is robust to strong attacks such as PGD.
I think that something like
`max_delta[all_loss >= max_loss] = delta.detach()[all_loss >= max_loss]`,
which currently appears in the loop `for zz in range(restarts):`,
should also be performed inside the loop `for _ in range(attack_iters):`
to find the strongest adversarial example that can be achieved within `attack_iters` steps. But of course, I may be missing something. So can you tell me the underlying idea behind stopping the iteration for an image once it is classified wrongly while building the PGD perturbation?