bboylyg / NAD

This is an implementation demo of the ICLR 2021 paper [Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks](https://openreview.net/pdf?id=9l0K4OM-oXE) in PyTorch.

A few questions #9

Closed zhaitongqing233 closed 2 years ago

zhaitongqing233 commented 2 years ago

Hello. I am looking for a possible solution to backdoor attacks. I have read your interesting and promising research, but I am still confused about a few points.

  1. Why can distillation with a pruned model as the teacher purify the poisoned model? Do you have more detailed insights?
  2. Have you tried bigger models and datasets?
  3. There is an attack against pruning-based defenses (it prunes during the training period, which is unrealistic in the real world, however). What do you think of such attacks, which are specifically designed to defeat pruning?

Looking forward to your reply.

bboylyg commented 2 years ago

Hi, thanks for your interest in our work. Our responses to your questions are as follows:

  1. Firstly, we would like to point out that the teacher model used in NAD is not a pruned model (as described in your question) but a backdoored model after fine-tuning (see Figure 1 in our paper). The effectiveness of NAD is mainly due to the regularization and integration effects of attention maps. We provide both an intuitive analysis (see Section 4.3) and experimental results comparing the defense effect of feature maps and attention maps (see Table 8), as well as a feature visualization comparing the different attention functions (see Figure 11). We also believe that reading the whole paper in depth would help your understanding of NAD.
  2. Table 2 covers a variety of architecture combinations based on the WRN model. For other datasets, please check the results in our newly published paper on ABL (which also includes more specific results for NAD).
  3. In my opinion, pruning-based defenses achieve promising results, as shown in the ANP paper. As such, designing an attack that is effective against pruning-based defenses is still an open topic.
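
To make the attention-map idea in point 1 concrete, here is a minimal PyTorch sketch of an attention distillation loss. It is an illustration under stated assumptions, not the code from this repository. It assumes the attention map is the channel-wise sum of `|activation|^p`, L2-normalized per sample, and that the student is trained with cross-entropy plus a weighted sum of per-layer attention distances to the fine-tuned teacher. The function names, the `betas` weights, and the default `p=2.0` are hypothetical choices made for this example.

```python
import torch
import torch.nn.functional as F


def attention_map(feature_map: torch.Tensor, p: float = 2.0, eps: float = 1e-6) -> torch.Tensor:
    """Collapse a (N, C, H, W) feature map into an L2-normalized (N, H*W) attention map."""
    # Sum |activation|^p over the channel dimension -> (N, H, W).
    am = feature_map.abs().pow(p).sum(dim=1)
    # Flatten the spatial dimensions and normalize each sample to unit L2 norm.
    am = am.flatten(start_dim=1)
    return am / (am.norm(p=2, dim=1, keepdim=True) + eps)


def attention_distillation_loss(student_fm: torch.Tensor,
                                teacher_fm: torch.Tensor,
                                p: float = 2.0) -> torch.Tensor:
    """Mean L2 distance between normalized student and teacher attention maps."""
    a_s = attention_map(student_fm, p)
    # The teacher (assumed: the fine-tuned backdoored model) is held fixed.
    a_t = attention_map(teacher_fm.detach(), p)
    return (a_s - a_t).norm(p=2, dim=1).mean()


def total_loss(logits, targets, student_fms, teacher_fms, betas, p: float = 2.0):
    """Cross-entropy on clean data plus weighted per-layer attention distillation.

    `betas` holds one hypothetical weight per distilled layer.
    """
    ce = F.cross_entropy(logits, targets)
    distill = sum(
        b * attention_distillation_loss(s, t, p)
        for b, s, t in zip(betas, student_fms, teacher_fms)
    )
    return ce + distill
```

In use, `student_fms` and `teacher_fms` would be lists of intermediate activations collected from the same layers of both networks on a clean batch. Because each attention map is normalized, the loss compares where the two networks focus rather than the raw activation magnitudes.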

Hope this response is helpful to your research.