Harry24k / adversarial-attacks-pytorch

PyTorch implementation of adversarial attacks [torchattacks]
https://adversarial-attacks-pytorch.readthedocs.io/en/latest/index.html
MIT License

Generated labels #71

Closed: dimasquest closed 2 years ago

dimasquest commented 2 years ago

Is there a simple way to get the newly generated labels (e.g. when using the least_likely_attack) directly from the atk method itself, rather than from the model prediction (for adversarial training)?

Currently, the labels are extracted using model(adv_images):

adv_images = atk(images, labels)
outputs = model(adv_images)

desired:

adv_images, adv_labels = atk(images, labels)

Harry24k commented 2 years ago

Thank you for your suggestion. Returning the predicted labels for the adversarial images is clearly useful, but it costs one extra forward pass to obtain those predictions. So instead of adding it to atk.forward(), I added it to atk.save() as the argument save_pred. https://github.com/Harry24k/adversarial-attacks-pytorch/blob/6dbe9155b0ba6ff966f2d484366c13fcbf80e38d/torchattacks/attack.py#L149
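
For anyone who wants the two-value return shape from the question inside a training loop, a small wrapper around the attack call is one possible approach. The sketch below is not part of torchattacks: attack_with_preds and toy_attack are hypothetical names, and toy_attack is only a stand-in so the example runs without the library. It assumes the attack is a callable taking (images, labels) and returning adversarial images, as in the snippets above, and it performs the extra forward pass mentioned in the reply.

```python
import torch
import torch.nn as nn


def attack_with_preds(atk, model, images, labels):
    """Hypothetical helper: return (adv_images, adv_labels).

    adv_labels are the model's predictions on the adversarial images,
    obtained with one extra forward pass.
    """
    adv_images = atk(images, labels)
    was_training = model.training
    model.eval()  # predict in eval mode so dropout/batch norm stay fixed
    with torch.no_grad():  # no gradients needed for the label lookup
        adv_labels = model(adv_images).argmax(dim=1)
    model.train(was_training)  # restore the caller's train/eval mode
    return adv_images, adv_labels


def toy_attack(images, labels, eps=8 / 255):
    """Hypothetical stand-in for an attack object: random sign noise.

    It ignores the labels; a real attack would use them.
    """
    noise = eps * torch.sign(torch.randn_like(images))
    return (images + noise).clamp(0, 1).detach()


# Toy model and data, for illustration only.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
images = torch.rand(4, 3, 8, 8)
labels = torch.randint(0, 10, (4,))

adv_images, adv_labels = attack_with_preds(toy_attack, model, images, labels)
```

With a real attack object, toy_attack would be replaced by atk. Note that these labels are model predictions, not the target labels an attack such as least_likely_attack may compute internally.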