reds-lab / Narcissus

The official implementation of the CCS'23 paper, Narcissus clean-label backdoor attack -- only takes THREE images to poison a face recognition dataset in a clean-label way and achieves a 99.89% attack success rate.
https://arxiv.org/pdf/2204.05255.pdf
MIT License

Every time I run the code, the values of ASR are different and vary widely #3

Closed vivien319 closed 11 months ago

vivien319 commented 1 year ago

Hi, I'm very interested in the methodology of your paper. But every time I run the code, the ASR results are quite different (78.14%, 83%, 95.22%, 100%, ...), whether I use your trigger (resnet18_trigger.npy) or a trigger I optimized with your code (best_noise_xxxx.npy). I would like to know the reason for this difference: is it due to the random seed or something else? Were the experimental results in your paper obtained with your trigger (resnet18_trigger.npy)? Can I get similar results with a different trigger optimized using your code? Or should I run the experiment a few more times and take the best result as the final number?

I'm looking forward to your answer. Thanks!

pmzzs commented 1 year ago

We have indeed noticed that different random seeds can lead to different poisoned indices, which might result in varying ASRs. If more "important samples" get poisoned due to the seed variation, the ASR can increase, and the inverse can occur as well. Moreover, we observed some degree of randomness brought about by various versions of PyTorch and hardware differences, which may also influence the ASR.
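
For what it's worth, here is a minimal sketch of the kind of seeding that usually narrows (though rarely removes) this run-to-run variance, assuming the randomness comes from Python's `random`, NumPy, and PyTorch; the exact places that need seeding depend on the Narcissus code itself:

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 0) -> None:
    """Pin the common RNG sources so poisoned-index selection and
    training draws are repeatable across runs. cuDNN/hardware
    nondeterminism can still leak in."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade some speed for determinism in cuDNN-backed ops.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)
```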

To answer your question about the triggers: Yes, some experimental results in our paper were obtained using our trigger (resnet18_trigger.npy), but there are also some experiments using other triggers. It's also possible to get similar results using a different trigger optimized with our code. However, achieving consistency might require several runs due to the aforementioned variability.
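
If several runs are needed anyway, one option is to report the ASR as a mean and standard deviation over seeds rather than as a single number. A rough sketch, where `run_attack(seed)` is a hypothetical wrapper around one full poison/train/evaluate cycle:

```python
import numpy as np

def evaluate_over_seeds(run_attack, seeds=(0, 1, 2, 3, 4)):
    """run_attack(seed) -> ASR in [0, 1] for one complete
    poison / train / test cycle (hypothetical wrapper)."""
    asrs = np.array([run_attack(seed) for seed in seeds])
    return asrs.mean(), asrs.std()

# mean_asr, std_asr = evaluate_over_seeds(run_attack)
# print(f"ASR = {mean_asr:.2%} +/- {std_asr:.2%}")
```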

We are currently conducting further research to minimize this inconsistency and we appreciate your patience. If you come across any discoveries or workarounds, we would be thrilled to hear from you.

vivien319 commented 1 year ago

Thanks for your detailed explanation! However, I also found that if I only run the "testing attack effect" block in your code, the ASR is less than 90% (using resnet18_trigger.npy). But if I also run the surrogate-model and trigger-generation blocks beforehand (I don't actually need those blocks; I just wanted to see the attack results from the testing block), the ASR can increase to nearly 100%. Have you noticed this problem? Or did I do something wrong? I'm looking forward to your answer, thank you!
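
One guess on my side: the surrogate-model and trigger-generation blocks consume draws from the global RNG before the testing block runs, so the poisoned indices (and any other random choices in that block) come out different than when the testing block runs alone. Re-seeding right before the index selection should decouple the two; a sketch, with the seed, `n_train`, and `poison_num` as illustrative values:

```python
import numpy as np

# Re-seed immediately before sampling so the chosen poisoned indices
# do not depend on how many random draws earlier notebook cells made.
rng = np.random.default_rng(42)           # seed is illustrative
n_train, poison_num = 50000, 25           # illustrative dataset/poison sizes
poison_idx = rng.choice(n_train, size=poison_num, replace=False)
```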

pmzzs commented 1 year ago

I think this may be due to the randomness of the optimization process; I hope future work can solve this problem.

vivien319 commented 1 year ago

Thank you for your reply!