aks2203 / poisoning-benchmark

A unified benchmark problem for data poisoning attacks
https://arxiv.org/abs/2006.12557
MIT License

Black-box setting #17

Closed ArshadIram closed 1 year ago

ArshadIram commented 1 year ago

Hi, I would like to thank you for providing a benchmark for fair analysis. I would also like to know more about the black-box setting. The paper says that poisons are crafted using a known model and tested on two unknown models, with the results averaged.

I do not fully understand this setting. Could you clarify it?

Is the dataset known to the attacker?

Many thanks

aks2203 commented 1 year ago

Hi there, yes, the dataset is known to the attacker. The victim's model architecture, however, is not known. In the CIFAR-10 case, the attacker uses a ResNet-18 to craft poisons, and for evaluation we assume the victim is using a MobileNet or a VGG, so we test both and average the results. With TinyImagenet, the attacker uses a VGG to craft poisons, and evaluation is done on a ResNet-34 and a MobileNet. Does that help?
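For readers who want the averaging step spelled out, here is a minimal sketch of the black-box evaluation described above. It is not code from the benchmark repository: the dictionary `BLACK_BOX_SETUP`, its key names, and the function `black_box_success_rate` are hypothetical, and the success rates in the example are made-up numbers, not benchmark results.

```python
from statistics import mean

# Crafting and victim architectures as described in the reply above.
# The dictionary layout and key names are illustrative assumptions.
BLACK_BOX_SETUP = {
    "cifar10": {"craft": "ResNet-18", "victims": ["MobileNet", "VGG"]},
    "tinyimagenet": {"craft": "VGG", "victims": ["ResNet-34", "MobileNet"]},
}


def black_box_success_rate(dataset, success_by_victim):
    """Average the attack success rate over the unseen victim architectures."""
    victims = BLACK_BOX_SETUP[dataset]["victims"]
    missing = [v for v in victims if v not in success_by_victim]
    if missing:
        raise KeyError(f"no result for victim architecture(s): {missing}")
    return mean(success_by_victim[v] for v in victims)


# Made-up per-victim success rates, purely for illustration.
example_results = {"MobileNet": 0.25, "VGG": 0.75}
print(black_box_success_rate("cifar10", example_results))
```

The crafting architecture never appears among the victims here, which is what makes the setting black-box with respect to the model while the dataset stays known.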

ArshadIram commented 1 year ago

So, in the black-box setting, the dataset is known to the attacker, who crafts the poison instances that are then provided to the victim. In the white-box setting, the dataset is known and the victim model is also known.

Are these settings the same for transfer learning and training from scratch?

aks2203 commented 1 year ago

No. For white-box tests in the transfer learning benchmarks, we evaluate with the same frozen feature extractor that is given to the attacker.

When training from scratch, however, the victim trains from a random initialization, so there is no white-box setting and only one scenario. The attacker uses a ResNet-18, and the poisoned dataset is evaluated by averaging the attack success rate across models of three architectures (including ResNet-18). This example is for CIFAR-10.

Does that clarify things?
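As a rough summary of which models the victim is evaluated on in each case, the sketch below encodes the three situations discussed in this thread. The helper `evaluation_models` and its argument names are hypothetical and not part of the benchmark code, and `arch_B` and `arch_C` are placeholders because the thread does not name the other two from-scratch architectures.

```python
def evaluation_models(mode, setting, craft_model, other_models):
    """Return the architectures used for evaluation (hypothetical helper).

    mode: "transfer" or "from-scratch"
    setting: "white-box" or "black-box" (ignored when training from scratch)
    craft_model: architecture the attacker uses to craft poisons
    other_models: architectures the attacker did not use
    """
    if mode == "transfer":
        if setting == "white-box":
            # The victim reuses the frozen feature extractor given to the attacker.
            return [craft_model]
        if setting == "black-box":
            # The victim uses architectures unknown to the attacker.
            return list(other_models)
        raise ValueError(f"unknown setting: {setting!r}")
    if mode == "from-scratch":
        # Only one situation: results are averaged over all architectures,
        # including the one used for crafting.
        return [craft_model] + list(other_models)
    raise ValueError(f"unknown mode: {mode!r}")


# Placeholder names stand in for the two architectures not named in the thread.
print(evaluation_models("from-scratch", None, "ResNet-18", ["arch_B", "arch_C"]))
```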

ArshadIram commented 1 year ago

Many thanks for your answers.