ain-soph / trojanzoo

TrojanZoo provides a universal PyTorch platform for conducting security research (especially backdoor attacks/defenses) on image classification in deep learning.
https://ain-soph.github.io/trojanzoo
GNU General Public License v3.0

Clean label attack accuracy is wrong #164

Closed ArshadIram closed 1 year ago

ArshadIram commented 1 year ago

I am trying to reproduce the clean label backdoor attack and am getting this accuracy:

image

ain-soph commented 1 year ago

Clean label is a baseline that is not expected to work.

I guess you are referring to the label-consistent attack, which is work from Madry's group.

ArshadIram commented 1 year ago

Yes, the label-consistent attack. What do you mean by Madry's group?

ain-soph commented 1 year ago

https://arxiv.org/abs/1912.02771

https://github.com/ain-soph/trojanzoo/blob/main/trojanvision/attacks/backdoor/clean_label/label_consistent.py

ArshadIram commented 1 year ago

Even the Refool attack success rate is very low:

image

ain-soph commented 1 year ago

That was recently changed in https://github.com/ain-soph/trojanzoo/commit/c2fd73e3e9380514f09020a93b8e216703a6a3ee, which affects the mark alpha value.
We believe it is not expected to work. If you find a way to make it perform well, we can discuss it.

Here are my notes on Refool: https://ain-soph.github.io/trojanzoo/trojanvision/attacks/backdoor/clean_label.html#trojanvision.attacks.Refool
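
For readers unfamiliar with the mark alpha value mentioned above, here is a minimal sketch of how an alpha value can control the visibility of a trigger mark blended into an image. It is not TrojanZoo's implementation: the function name `blend_mark` is hypothetical, pixels are plain floats in [0, 1], and it assumes alpha means the opacity of the mark. The library may use a different convention, so check the linked docs before relying on it.

```python
def blend_mark(image, mark, alpha):
    """Blend a trigger mark into an image, pixel by pixel.

    Assumption (not taken from TrojanZoo): alpha is the mark's opacity,
    so 0.0 leaves the image unchanged and 1.0 replaces it with the mark.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if len(image) != len(mark):
        raise ValueError("image and mark must have the same length")
    # Standard alpha compositing: weighted average of image and mark.
    return [(1.0 - alpha) * x + alpha * m for x, m in zip(image, mark)]


# Toy data: three grey pixels and an all-white mark.
image = [0.2, 0.4, 0.6]
mark = [1.0, 1.0, 1.0]

faint = blend_mark(image, mark, 0.1)   # mark barely visible
strong = blend_mark(image, mark, 0.9)  # mark dominates the image
```

Under this convention, a small alpha gives a stealthier but weaker trigger, which is one reason a change to the default alpha could shift the measured attack success rate.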