Gwinhen / BackdoorVault

A toolbox for backdoor attacks.
MIT License

Some problem in dataset.py #1

Open RorschachChen opened 1 year ago

RorschachChen commented 1 year ago

In lines 205-207, I think there is a problem for attacks like refool. As I read the current implementation, for the first n_poison samples it randomly picks a non-target sample and turns it into a poisoned sample by adding the trigger. Because the clean samples from other classes are selected at random, by the end of training every clean sample may have been transformed into a poisoned sample, so the actual poisoning rate could be 100%. With the current version, the attack success rate (ASR) of refool_smooth and refool_ghost can reach 100%; after I modified these lines to disable this behavior, the ASR is around 90%, which I think is reasonable. If you have any questions, we can discuss.

Gwinhen commented 1 year ago

Thanks for the comment!

I think refool transforms a clean target-class sample into a poisoned sample. Line 209 explicitly selects target-class samples.

But I do see your point. I think for refool, the poisoning rate argument poison_rate should not be set to 0.1. Otherwise, for datasets like CIFAR-10 that have 10 classes, the actual poisoning rate on the target class will be 100% (as the target class holds 1/10 of the samples in the whole dataset). I would suggest changing poison_rate to a smaller value, e.g., 0.01, which corresponds to 10% of the target-class samples (for CIFAR-10).
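To make the arithmetic concrete, here is a minimal sketch. It assumes a class-balanced dataset and that poison_rate is measured against the whole dataset; the helper name target_class_poison_rate is hypothetical and not part of the repository.

```python
def target_class_poison_rate(poison_rate: float, num_classes: int) -> float:
    """Fraction of target-class samples that end up poisoned.

    Assumes balanced classes, so the target class holds 1/num_classes
    of the dataset, and that poison_rate is relative to the whole dataset.
    """
    # n_poison = poison_rate * N and the target class has N / num_classes
    # samples, so the ratio is poison_rate * num_classes, capped at 100%.
    return min(1.0, poison_rate * num_classes)


# CIFAR-10 has 10 classes.
for rate in (0.1, 0.01):
    share = target_class_poison_rate(rate, num_classes=10)
    print(f"poison_rate={rate}: {share:.0%} of the target class is poisoned")
```

For datasets with a different number of classes or with imbalanced classes, the safe upper bound on poison_rate changes accordingly.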

Let me know if this addresses your concern.

RorschachChen commented 1 year ago

Thanks for the quick response. Okay, I understand the clean-label style now. But I still think the poisoned indices could be determined before iterating over the dataloader, since I am currently testing my defense work, which isolates poisoned samples. Let me know if you want me to close the issue; that is fine with me.

Gwinhen commented 1 year ago

Yes, determining the indices beforehand would probably make it more transparent which samples are being poisoned.
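One way this could look is sketched below. This is not the code in dataset.py; the function select_poison_indices, its parameters, and the toy labels are assumptions made for illustration. It fixes the poisoned indices once, restricted to the target class as in the clean-label setting, so a defense can later be checked against a known ground truth.

```python
import random


def select_poison_indices(labels, target, poison_rate, seed=0):
    """Choose the poisoned sample indices once, before training starts."""
    # Clean-label attacks such as refool only poison target-class samples.
    candidates = [i for i, y in enumerate(labels) if y == target]
    # poison_rate is taken relative to the whole dataset; never request
    # more samples than the target class actually contains.
    n_poison = min(round(poison_rate * len(labels)), len(candidates))
    # A seeded generator keeps the selection reproducible across runs.
    rng = random.Random(seed)
    return sorted(rng.sample(candidates, n_poison))


# Toy balanced labels: 10 classes with 100 samples each.
labels = [c for c in range(10) for _ in range(100)]
poison_idx = select_poison_indices(labels, target=0, poison_rate=0.01)

# The dataset can consult this set by sample index instead of
# picking samples on the fly inside the training loop.
poisoned = set(poison_idx)
print(len(poisoned), "samples marked for poisoning")
```

With the indices fixed up front, the set of poisoned samples stays the same across epochs, and the effective poisoning rate is exactly the number of selected indices divided by the dataset size.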

Thanks for the suggestion! I will take a look at how to incorporate this.