
Reconstructive Neuron Pruning for Backdoor Defense

Code for ICML 2023 Paper "Reconstructive Neuron Pruning for Backdoor Defense"


Quick Start: RNP against BadNets Attack

By default, RNP uses only 500 defense samples randomly drawn from the training set (ratio=0.01 on CIFAR-10) to perform the unlearn-recover process and optimize the pruning mask. To check the performance of RNP on a BadNets-attacked ResNet-18 (i.e., 10% poisoning rate with ResNet-18 on CIFAR-10), simply run:

python main.py
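
The defaults match the configuration echoed at the top of the log below, and the flag names can be read off the printed Namespace. For example, to point RNP at a different backdoored checkpoint or defense-data ratio (the path below is the shipped default; substitute your own):

python main.py --arch resnet18 --dataset CIFAR10 --ratio 0.01 --backdoor_model_path weights/ResNet18-ResNet-BadNets-target0-portion0.1-epoch80.tar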

Experimental Results on BadNets Attack

[2023/07/09 22:21:00] - Namespace(alpha=0.2, arch='resnet18', backdoor_model_path='weights/ResNet18-ResNet-BadNets-target0-portion0.1-epoch80.tar', batch_size=128, clean_threshold=0.2, cuda=1, dataset='CIFAR10', log_root='logs/', mask_file=None, momentum=0.9, num_class=10, output_weight='weights/', pruning_by='threshold', pruning_max=0.9, pruning_step=0.05, ratio=0.01, recovering_epochs=20, recovering_lr=0.2, save_every=5, schedule=[10, 20], target_label=0, target_type='all2one', trig_h=3, trig_w=3, trigger_type='gridTrigger', unlearned_model_path=None, unlearning_epochs=20, unlearning_lr=0.01, weight_decay=0.0005)
[2023/07/09 22:21:00] - ----------- Data Initialization --------------
[2023/07/09 22:21:03] - ----------- Backdoor Model Initialization --------------
[2023/07/09 22:21:04] - Epoch    lr      Time    TrainLoss   TrainACC    PoisonLoss      PoisonACC   CleanLoss   CleanACC
[2023/07/09 22:21:04] - ----------- Model Unlearning --------------
[2023/07/09 22:21:15] - 0    0.010   11.2    0.1004      0.9780      0.0000      1.0000      0.2185      0.9342
[2023/07/09 22:21:26] - 1    0.010   10.8    0.1213      0.9760      0.0000      1.0000      0.2253      0.9317
[2023/07/09 22:21:37] - 2    0.010   10.8    0.1400      0.9740      0.0000      1.0000      0.2349      0.9304
[2023/07/09 22:21:48] - 3    0.010   10.8    0.1535      0.9720      0.0000      1.0000      0.2513      0.9266
[2023/07/09 22:21:59] - 4    0.010   10.9    0.2078      0.9640      0.0000      1.0000      0.2770      0.9214
[2023/07/09 22:22:10] - 5    0.010   11.0    0.2614      0.9500      0.0000      1.0000      0.3144      0.9141
[2023/07/09 22:22:21] - 6    0.010   10.8    0.3711      0.9220      0.0000      1.0000      0.3847      0.8991
[2023/07/09 22:22:31] - 7    0.010   10.8    0.4538      0.8700      0.0000      1.0000      0.5276      0.8669
[2023/07/09 22:22:42] - 8    0.010   10.8    0.7916      0.6700      0.0000      1.0000      0.9439      0.7586
[2023/07/09 22:22:53] - 9    0.010   10.8    1.4771      0.4540      0.0000      1.0000      2.5574      0.4016
[2023/07/09 22:23:04] - 10   0.001   10.8    3.1028      0.2920      0.0000      1.0000      2.5620      0.3949
[2023/07/09 22:23:15] - 11   0.001   10.8    4.0416      0.2360      0.0000      1.0000      2.8507      0.3729
[2023/07/09 22:23:25] - 12   0.001   10.8    4.8811      0.1980      0.0000      1.0000      3.2337      0.3368
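
Reading the unlearning log: clean (training) accuracy is driven down on purpose by maximizing the loss on the 500 defense samples, which exposes the neurons tied to clean features; the stage stops once accuracy falls below clean_threshold=0.2 (epoch 12 above). A minimal PyTorch sketch of such a gradient-ascent loop, assuming a hypothetical `defense_loader` over the defense samples (schematic, not the repository's exact implementation):

```python
import torch
import torch.nn.functional as F

def unlearn(model, defense_loader, epochs=20, lr=0.01,
            clean_threshold=0.2, device='cuda'):
    """Gradient-ascent unlearning on the small clean defense set (sketch)."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=0.9, weight_decay=5e-4)
    model.to(device).train()
    for epoch in range(epochs):
        correct = total = 0
        for x, y in defense_loader:
            x, y = x.to(device), y.to(device)
            logits = model(x)
            # Negate the loss so SGD *maximizes* the clean classification loss.
            loss = -F.cross_entropy(logits, y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            correct += (logits.argmax(1) == y).sum().item()
            total += y.size(0)
        # Stop once clean accuracy collapses below the threshold
        # (clean_threshold=0.2 in the Namespace; epoch 12 in the log above).
        if correct / total < clean_threshold:
            break
    return model
```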
[2023/07/09 22:23:25] - ----------- Model Recovering --------------
[2023/07/09 22:23:26] - Epoch    lr      Time    TrainLoss   TrainACC    PoisonLoss      PoisonACC   CleanLoss   CleanACC
[2023/07/09 22:23:37] - 1    0.200   11.0    1.0719      0.1980      0.0000      1.0000      2.0311      0.4370
[2023/07/09 22:23:48] - 2    0.200   11.0    0.9892      0.1920      0.0000      1.0000      1.3383      0.5432
[2023/07/09 22:23:59] - 3    0.200   11.0    0.7809      0.2320      0.0018      1.0000      1.1726      0.6138
[2023/07/09 22:24:10] - 4    0.200   11.0    0.4892      0.2500      0.3161      0.8770      1.0972      0.6552
[2023/07/09 22:24:21] - 5    0.200   11.1    0.4130      0.2860      0.6548      0.6051      1.0409      0.6662
[2023/07/09 22:24:32] - 6    0.200   11.0    0.3691      0.3060      0.7871      0.5084      0.9843      0.6822
[2023/07/09 22:24:43] - 7    0.200   11.0    0.3262      0.3460      0.8479      0.4791      0.9089      0.7053
[2023/07/09 22:24:54] - 8    0.200   11.0    0.2963      0.3760      0.8369      0.4904      0.8691      0.7157
[2023/07/09 22:25:05] - 9    0.200   11.0    0.2777      0.3900      0.8192      0.5090      0.8226      0.7324
[2023/07/09 22:25:16] - 10   0.200   11.0    0.2485      0.4320      0.7842      0.5340      0.7765      0.7497
[2023/07/09 22:25:27] - 11   0.200   11.0    0.2337      0.4500      0.7554      0.5562      0.7283      0.7666
[2023/07/09 22:25:38] - 12   0.200   11.0    0.2140      0.4900      0.6922      0.6044      0.7022      0.7752
[2023/07/09 22:25:49] - 13   0.200   11.0    0.2043      0.5140      0.6542      0.6317      0.6718      0.7870
[2023/07/09 22:26:00] - 14   0.200   11.0    0.1807      0.5340      0.6128      0.6598      0.6517      0.7951
[2023/07/09 22:26:11] - 15   0.200   11.0    0.1724      0.5440      0.5820      0.6873      0.6342      0.8018
[2023/07/09 22:26:22] - 16   0.200   11.0    0.1729      0.5780      0.5754      0.6968      0.6084      0.8121
[2023/07/09 22:26:33] - 17   0.200   11.1    0.1532      0.6180      0.5683      0.7027      0.5930      0.8176
[2023/07/09 22:26:44] - 18   0.200   11.0    0.1476      0.6120      0.5614      0.7083      0.5766      0.8244
[2023/07/09 22:26:55] - 19   0.200   11.0    0.1510      0.6380      0.5674      0.7069      0.5601      0.8312
[2023/07/09 22:27:06] - 20   0.200   11.1    0.1439      0.6520      0.5788      0.6988      0.5417      0.8402
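
The recovering stage then freezes the unlearned weights and learns a per-neuron mask on the BN layers (the layers named in the pruning table below) that minimizes the loss on the same defense data; neurons that the clean data cannot recover keep low mask values and are treated as backdoor-related. A schematic under those assumptions; the `MaskedBatchNorm2d` wrapper is hypothetical, not the repository's class:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedBatchNorm2d(nn.BatchNorm2d):
    """BatchNorm2d whose per-channel output is scaled by a learnable mask in [0, 1]."""
    def __init__(self, num_features):
        super().__init__(num_features)
        self.neuron_mask = nn.Parameter(torch.ones(num_features))

    def forward(self, x):
        out = super().forward(x)
        return out * self.neuron_mask.clamp(0, 1).view(1, -1, 1, 1)

def recover_masks(model, defense_loader, epochs=20, lr=0.2, device='cuda'):
    """Optimize only the neuron masks; all original weights stay frozen (sketch)."""
    mask_params = []
    for name, p in model.named_parameters():
        if name.endswith('neuron_mask'):
            mask_params.append(p)
        else:
            p.requires_grad_(False)
    optimizer = torch.optim.SGD(mask_params, lr=lr, momentum=0.9)
    model.to(device).train()
    for _ in range(epochs):
        for x, y in defense_loader:
            x, y = x.to(device), y.to(device)
            # Descend this time: recover clean behaviour through the masks.
            loss = F.cross_entropy(model(x), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```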
[2023/07/09 22:27:07] - ----------- Backdoored Model Pruning --------------
[2023/07/09 22:27:07] - Pruned Number    Layer Name      Neuron Idx      Mask    PoisonLoss      PoisonACC   CleanLoss   CleanACC
[2023/07/09 22:27:17] - 0    None        None        0.0001      1.0000      0.2157      0.9340
[2023/07/09 22:27:27] - 12.00    layer4.1.bn2    188     0.0     0.0122      0.9986      0.2104      0.9347
[2023/07/09 22:27:37] - 14.00    layer4.1.bn2    12      0.05    0.0139      0.9982      0.2092      0.9340
[2023/07/09 22:27:46] - 18.00    layer3.0.bn2    230     0.1     0.0601      0.9876      0.2065      0.9338
[2023/07/09 22:27:56] - 21.00    bn1     5   0.15000000000000002     0.9935      0.4796      0.2074      0.9340
[2023/07/09 22:28:06] - 24.00    layer4.0.bn2    82      0.2     2.8242      0.0661      0.2297      0.9280
[2023/07/09 22:28:16] - 28.00    layer3.0.bn1    106     0.25    3.2791      0.0424      0.2292      0.9270
[2023/07/09 22:28:26] - 32.00    layer4.1.bn2    152     0.30000000000000004     4.2908      0.0172      0.2295      0.9277
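
With pruning_by='threshold', the final stage sweeps a threshold from 0 up to pruning_max=0.9 in steps of pruning_step=0.05 (the Mask column above) and zeroes out every neuron whose recovered mask falls at or below it. In the run above, PoisonACC collapses from 0.99 to 0.07 at threshold 0.2 while CleanACC stays near 0.93. A minimal sketch of such a sweep, reusing the hypothetical `neuron_mask` naming from the previous snippet:

```python
import numpy as np
import torch

@torch.no_grad()
def prune_by_threshold(model, pruning_max=0.9, pruning_step=0.05):
    """Sweep a rising threshold and zero out every neuron whose recovered
    mask value falls at or below it (sketch of pruning_by='threshold')."""
    for threshold in np.arange(0.0, pruning_max + 1e-9, pruning_step):
        pruned = 0
        for name, p in model.named_parameters():
            if name.endswith('neuron_mask'):
                below = p <= threshold
                p[below] = 0.0          # pruned neurons contribute nothing
                pruned += int(below.sum())
        # Evaluate PoisonACC / CleanACC here after each step and stop once
        # the attack success rate collapses while clean accuracy holds.
        print(f'threshold={threshold:.2f}  cumulative pruned={pruned}')
```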

Links to ImageNet-12 Subset

Please download the ImageNet-12 subset via [Baidu Drive](https://pan.baidu.com/share/init?surl=LjE6g1cxQ98RZHMWi0tQlA) (pwd: qetk) or [Google Drive](https://drive.google.com/file/d/1yG9ENDUbOIUKY1i5ADu4X_7Lhbvqca2w/view?usp=sharing).

Backdoor Model Weights

You can directly download the pre-trained backdoored model weights with the links below:

| Attack | Paper | Baidu Drive (pwd: 1212) | Google Drive |
| ------ | ----- | ----------------------- | ------------ |
| BadNets | BadNets: Evaluating Backdooring Attacks on Deep Neural Networks | Baidu Drive | Google Drive |
| Trojan | Trojaning Attack on Neural Networks | Baidu Drive | [Google Drive]() |
| Blend | Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning | Baidu Drive | Google Drive |
| CL | Label-Consistent Backdoor Attacks | Baidu Drive | Google Drive |
| SIG | A New Backdoor Attack in CNNs by Training Set Corruption without Label Poisoning | Baidu Drive | [Google Drive]() |
| Dynamic | Input-Aware Dynamic Backdoor Attack | Baidu Drive | Google Drive |
| WaNet | WaNet - Imperceptible Warping-based Backdoor Attack | Baidu Drive | [Google Drive]() |
| FC | Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks | Baidu Drive | Google Drive |
| DFST | Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification | Baidu Drive | [Google Drive]() |
| AWP | Can Adversarial Weight Perturbations Inject Neural Backdoors? | Baidu Drive | [Google Drive]() |
| LIRA | LIRA: Learnable, Imperceptible and Robust Backdoor Attacks | Baidu Drive | [Google Drive]() |
| A-Blend | Circumventing Backdoor Defenses That Are Based on Latent Separability | Baidu Drive | [Google Drive]() |
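
The .tar files are ordinary PyTorch checkpoints. A quick way to inspect one, assuming it stores either a bare state_dict or a dict wrapping one (the internal layout is an assumption; check the keys before wiring it into a model):

```python
import torch

# Hypothetical inspection sketch: the checkpoint layout is assumed,
# so peek at the keys before loading it into a network.
ckpt = torch.load(
    'weights/ResNet18-ResNet-BadNets-target0-portion0.1-epoch80.tar',
    map_location='cpu',
)
state_dict = ckpt.get('state_dict', ckpt)  # some checkpoints wrap the weights
print(list(state_dict.keys())[:5])         # first few parameter names
```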

Citation

If you use this code in your work, please cite the accompanying paper:

@inproceedings{li2023reconstructive,
  title={Reconstructive Neuron Pruning for Backdoor Defense},
  author={Yige Li and Xixiang Lyu and Xingjun Ma and Nodens Koren and Lingjuan Lyu and Bo Li and Yu-Gang Jiang},
  booktitle={ICML},
  year={2023}
}

Acknowledgements

This code builds on the open-source implementations of Adversarial Neuron Pruning Purifies Backdoored Deep Models (ANP) and Distilling Cognitive Backdoor Patterns within an Image; we thank the authors for their contributions and help.

Backdoor-related repositories: