kabkabm / defensegan

Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models (published in ICLR2018)
Apache License 2.0
229 stars 62 forks source link
adversarial-attacks adversarial-examples deep-learning generative-adversarial-network iclr2018

Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models

Pouya Samangouei, Maya Kabkab, Rama Chellappa

[*: authors contributed equally]

This repository contains the implementation of our ICLR-18 paper: Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models

If you find this code or the paper useful, please consider citing:

@inproceedings{defensegan,
  title={Defense-GAN: Protecting classifiers against adversarial attacks using generative models},
  author={Samangouei, Pouya and Kabkab, Maya and Chellappa, Rama},
  booktitle={International Conference on Learning Representations},
  year={2018}
}

alt text alt text

Contents

  1. Installation
  2. Usage

Installation

  1. Clone this repository:

    git clone --recursive https://github.com/kabkabm/defensegan
    cd defensegan
    git submodule update --init --recursive
  2. Install requirements:

    pip install -r requirements.txt

    Note: if you don't have a GPU install the cpu version of TensorFlow 1.7.

  3. Download the dataset and prepare data directory:

    python download_dataset.py [mnist|f-mnist|celeba]
  4. Create or link output and debug directories:

    mkdir output
    mkdir debug

    or

    ln -s <path-to-output> output
    ln -s <path-to-debug> debug

Usage

Train a GAN model

python train.py --cfg <path> --is_train <extra-args>

The training will create a directory in the output directory per experiment with the same name as to save the model checkpoints. If <extra-args> are different from the ones that are defined in <config>, the output directory name will reflect the difference.

A config file is saved into each experiment directory so that they can be loaded if <path> is the address to that directory.

Example

After running

python train.py --cfg experiments/cfgs/gans/mnist.yml --is_train

output/gans/mnist will be created.

[optional] Save reconstructions and datasets into cache:

python train.py --cfg experiments/cfgs/<config> --save_recs
python train.py --cfg experiments/cfgs/<config> --save_ds

Example

After running the training code for mnist, the reconstructions and the dataset can be saved with:

python train.py --cfg output/gans/mnist --save_recs
python train.py --cfg output/gans/mnist --save_ds

As training goes on, sample outputs of the generator are written to debug/gans/<model_config>.

Black-box attacks

To perform black-box experiments run blackbox.py [Table 1 and 2 of the paper]:

python blackbox.py --cfg <path> \
    --results_dir <results_path> \
    --bb_model {A, B, C, D, E} \
    --sub_model {A, B, C, D, E} \
    --fgsm_eps <epsilon> \
    --defense_type {none|defense_gan|adv_tr}
    [--train_on_recs or --online_training]
    <optional-arguments>

Example

White-box attacks

To test Defense-GAN for white-box attacks run whitebox.py [Tables 4, 5, 12 of the paper]:

python whitebox.py --cfg <path> \
       --results_dir <results-dir> \
       --attack_type {fgsm, rand_fgsm, cw} \
       --defense_type {none|defense_gan|adv_tr} \
       --model {A, B, C, D} \
       [--train_on_recs or --online_training]
       <optional-arguments>

Example

First row of Table 4:

python whitebox.py --cfg <path> \
       --results_dir whitebox \
       --attack_type fgsm \
       --defense_type defense_gan \
       --model A