VinAIResearch / Warping-based_Backdoor_Attack-release

WaNet - Imperceptible Warping-based Backdoor Attack (ICLR 2021)
GNU Affero General Public License v3.0
Table of contents
  1. Introduction
  2. Requirements
  3. Training
  4. Evaluation

WaNet - Imperceptible Warping-based Backdoor Attack

WaNet is a new backdoor attack method that crafts backdoor samples by distorting the global structure of images, instead of patching or watermarking them as previous backdoor attack approaches do.
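To make "distorting the global structure of images" concrete, the following is a minimal pure-Python sketch of warping a tiny grayscale image with a displacement field and bilinear sampling. It is not the code in this repository; the helper names `bilinear_sample` and `warp_image`, the toy image, and the uniform displacement field are all illustrative assumptions. A real warping trigger would presumably use a smooth, spatially varying field with small magnitude.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a 2D list-of-lists image at real-valued (x, y), clamping to the border."""
    height, width = len(image), len(image[0])
    # Clamp coordinates so samples outside the image reuse border pixels.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp_image(image, flow):
    """Backward warp: output pixel (x, y) reads the input at (x + dx, y + dy)."""
    height, width = len(image), len(image[0])
    return [
        [
            bilinear_sample(image, x + flow[y][x][0], y + flow[y][x][1])
            for x in range(width)
        ]
        for y in range(height)
    ]


# Toy 4x4 horizontal gradient (hypothetical data, not from any dataset).
image = [[float(x) for x in range(4)] for _ in range(4)]

# Two hypothetical displacement fields: no displacement, and half a pixel in x.
identity_flow = [[(0.0, 0.0)] * 4 for _ in range(4)]
shift_flow = [[(0.5, 0.0)] * 4 for _ in range(4)]

unchanged = warp_image(image, identity_flow)
warped = warp_image(image, shift_flow)
```

With a zero field the image is returned unchanged, while a sub-pixel field changes every pixel slightly without adding any visible patch, which is the intuition behind a warping-based trigger.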

This is the official PyTorch implementation of the ICLR 2021 paper WaNet - Imperceptible Warping-based Backdoor Attack. This repository includes:

If you find this repo useful for your research, please consider citing our paper:

@inproceedings{
nguyen2021wanet,
title={WaNet - Imperceptible Warping-based Backdoor Attack},
author={Tuan Anh Nguyen and Anh Tuan Tran},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=eEn8KTtJOx}
}

Requirements

Training

Run the command

$ python train.py --dataset <datasetName> --attack_mode <attackMode>

where the parameters are the following:

The trained checkpoints are saved at the path checkpoints\<datasetName>\<datasetName>_<attackMode>_morph.pth.tar.
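As a small illustration of the naming pattern above, the sketch below builds the expected checkpoint path from a dataset name and an attack mode. The helper `checkpoint_path` is hypothetical and not part of this repository, and the values `cifar10` and `all2one` are example placeholders for `<datasetName>` and `<attackMode>`. `pathlib` uses the separator of the current platform, so the result is printed with `/` on Linux and macOS.

```python
from pathlib import Path


def checkpoint_path(dataset_name, attack_mode, root="checkpoints"):
    """Return <root>/<datasetName>/<datasetName>_<attackMode>_morph.pth.tar.

    Hypothetical helper: it only mirrors the naming pattern stated in the README.
    """
    filename = f"{dataset_name}_{attack_mode}_morph.pth.tar"
    return Path(root) / dataset_name / filename


# Example placeholder values (assumptions, substitute your own).
path = checkpoint_path("cifar10", "all2one")
print(path)
print("exists" if path.exists() else "not found; train or download the checkpoint first")
```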

Pretrained models

We also provide the pretrained checkpoints used in the original paper. The checkpoints can be found here. Download and decompress them in this project's repo for evaluation.

Evaluation

To evaluate trained models, run the command

$ python eval.py --dataset <datasetName> --attack_mode <attackMode>

This command prints the model accuracies on three tests: clean, attack, and noise. The clean and attack accuracies should match those reported in our paper, while the noise accuracy may differ slightly because the noise is generated randomly.
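The three reported numbers can be read as plain classification accuracies over different inputs and targets. The sketch below is an assumption about how such a report could be computed, not the logic of `eval.py`: it assumes the clean and noise tests are scored against the true labels and the attack test against an attacker-chosen target label. The helper `accuracy_percent`, the toy predictions, and `target_label` are all hypothetical.

```python
def accuracy_percent(predictions, targets):
    """Percentage of positions where the prediction equals the target."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(int(p == t) for p, t in zip(predictions, targets))
    return 100.0 * correct / len(targets)


# Toy data (hypothetical): four test images with true labels 0..3.
true_labels = [0, 1, 2, 3]
target_label = 0  # assumed attacker-chosen label for backdoored inputs

clean_preds = [0, 1, 2, 3]   # predictions on unmodified images
attack_preds = [0, 0, 0, 1]  # predictions on backdoored images
noise_preds = [0, 1, 2, 0]   # predictions on images with a random perturbation

report = {
    "clean": accuracy_percent(clean_preds, true_labels),
    "attack": accuracy_percent(attack_preds, [target_label] * len(attack_preds)),
    "noise": accuracy_percent(noise_preds, true_labels),
}
print(report)
```

Under these assumptions, a high attack accuracy means the trigger reliably flips predictions to the target label, while high clean and noise accuracies mean the model still predicts the true label otherwise.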

Results

| Dataset | Clean test (%) | Attack test (%) | Noise test (%) |
|----------|-------|-------|-------|
| MNIST | 99.52 | 99.86 | 98.20 |
| CIFAR-10 | 94.15 | 99.55 | 93.55 |
| GTSRB | 98.87 | 99.33 | 98.01 |
| CelebA | 78.99 | 99.33 | 76.74 |

Defense experiments

Along with the training and evaluation code, we also provide code for the defense methods evaluated in the paper, inside the `defenses` folder.

$ cd defenses/fine_pruning
$ python fine-pruning-mnist.py --dataset mnist --attack_mode <attackMode> 
$ python fine-pruning-cifar10-gtsrb.py --dataset cifar10 --attack_mode <attackMode> 
$ python fine-pruning-cifar10-gtsrb.py --dataset gtsrb --attack_mode <attackMode> 
$ python fine-pruning-celeba.py --dataset celeba --attack_mode <attackMode> 

The results are printed on screen, and all entropy values are logged in the `results` folder.

## Contacts

If you have any questions, drop an email to _v.anhtt152@vinai.io_ or _v.anhnt479@vinai.io_, or leave a message below with GitHub (log-in is needed).