aks2203 / poisoning-benchmark

A unified benchmark problem for data poisoning attacks
https://arxiv.org/abs/2006.12557
MIT License
146 stars 21 forks

Attack success rate evaluation #19

Closed safiia closed 1 year ago

safiia commented 1 year ago

Hello, thanks so much for your open-source benchmark; I appreciate your work. I want to clarify something: how do you calculate the success rate? How did you get those success rate percentages?

A related question: I understand that the training set contains 25 poisoned samples. What about the test set? How many poisoned examples are tested during the evaluation of the attack?

Thanks a lot

aks2203 commented 1 year ago

How do you calculate the success rate?

For each test there is a single target image. If a model trained on poisoned data classifies that target image into the desired class (the label the attacker wishes to assign to the target image, not its ground-truth class), then the attack is a success. This measurement is binary, but over a batch of tests (each including a round of training and evaluation on a new target image) the fraction of successes is called the "success rate." Is this explanation clear?
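For illustration, here is a minimal sketch of how that aggregation could look in PyTorch. The names `trials`, `target_image`, and `intended_label` are hypothetical placeholders for this sketch, not the benchmark's actual API; each trial is assumed to provide a model already trained on that trial's poisoned data.

```python
import torch

def success_rate(trials):
    """Compute the attack success rate over a batch of independent trials.

    Each trial is a (model, target_image, intended_label) tuple, where the
    model was trained on data containing that trial's poisons.
    """
    successes = 0
    for model, target_image, intended_label in trials:
        model.eval()
        with torch.no_grad():
            pred = model(target_image.unsqueeze(0)).argmax(dim=1).item()
        # Binary outcome: the attack succeeds if the poisoned model assigns
        # the attacker's intended (non-ground-truth) label to the target.
        successes += int(pred == intended_label)
    # Success rate = fraction of trials in which the target image was
    # misclassified as the intended class.
    return successes / len(trials)
```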

What about the test set? How many poisoned examples are tested during the evaluation of the attack?

If I'm understanding this question correctly to mean "How many images in the test set are targets of the attack?", then the answer is one. For each test, a single target image is chosen. By 2023 standards in the poisoning and backdoor literature this is a relatively simple setting for the attacker, but at the time of publication most poisoning methods had been developed or tested in this setting.

Let me know if these answers are satisfactory.

safiia commented 1 year ago

Yes thanks a lot!

aks2203 commented 1 year ago

You're welcome!