IBM / BadDiffusion

Official repo to reproduce the paper "How to Backdoor Diffusion Models?" published at CVPR 2023
Apache License 2.0

BadDiffusion

Paper link: https://arxiv.org/abs/2212.05400

Environment

Usage

Install Required Packages and Prepare Essential Data

Please run

bash install.sh

Wandb Logging Support

If you want to upload the experimental results to Weights & Biases, please log in with your API key.

wandb login --relogin --cloud <API Key>

Prepare Dataset

Prepare Training Dataset

Prepare FID-Measuring Dataset

Pre-Trained Models

I've uploaded all pre-trained backdoored diffusion models for BadDiffusion and VillanDiffusion to HuggingFace. Please feel free to download them from there.

Run BadDiffusion

Arguments

For example, to backdoor a DM pre-trained on CIFAR10 with the Grey Box trigger and the Hat target, we can use the following command:

python baddiffusion.py --project default --mode train+measure --dataset CIFAR10 --batch 128 --epoch 50 --poison_rate 0.1 --trigger BOX_14 --target HAT --ckpt DDPM-CIFAR10-32 --fclip o -o --gpu 0
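As a rough illustration of what `--poison_rate 0.1` and `--trigger BOX_14` mean conceptually, the sketch below selects 10% of a dataset to poison and stamps a square patch onto an image. It is not the code in `baddiffusion.py`: the function names, the bottom-right placement, the patch value, and reading the `14` in `BOX_14` as a 14-pixel side length are all assumptions made for illustration.

```python
import random


def choose_poisoned(num_samples, poison_rate, seed=0):
    """Return the sorted indices of the samples that would carry the trigger."""
    rng = random.Random(seed)
    num_poisoned = round(num_samples * poison_rate)
    return sorted(rng.sample(range(num_samples), num_poisoned))


def stamp_box(image, box_size, value):
    """Return a copy of a 2-D image (list of rows) with a square patch of
    `value` written into the bottom-right corner (placement is an assumption)."""
    height, width = len(image), len(image[0])
    if box_size > height or box_size > width:
        raise ValueError("box_size must fit inside the image")
    stamped = [row[:] for row in image]  # leave the input image untouched
    for r in range(height - box_size, height):
        for c in range(width - box_size, width):
            stamped[r][c] = value
    return stamped


if __name__ == "__main__":
    # Toy stand-in for a 32x32 single-channel image; 0.5 is an assumed "grey".
    image = [[0.0] * 32 for _ in range(32)]
    poisoned_indices = choose_poisoned(num_samples=100, poison_rate=0.1)
    triggered = stamp_box(image, box_size=14, value=0.5)
    print(len(poisoned_indices), "of 100 samples would receive the trigger")
```

The actual trigger and target images, and how they enter the training objective, are defined by the repository and the paper.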

Training Backdoor Models & Measure the FID and MSE Scores

If we want to backdoor a DM pre-trained on CelebA-HQ with the GLASSES trigger and the CAT target, we can use the following command:

python baddiffusion.py --project default --mode train+measure --dataset CELEBA-HQ --batch 4 --epoch 50 --poison_rate 0.1 --trigger GLASSES --target CAT --ckpt DDPM-CELEBA-HQ-256 --fclip o -o --gpu 0
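When running several such experiments, it can help to assemble the command programmatically. The following is a minimal sketch that uses only the flags shown in the commands above; the helper name `build_train_command` and the swept poison rates are hypothetical and not part of the repository.

```python
import shlex


def build_train_command(dataset, ckpt, trigger, target, poison_rate,
                        batch, epoch=50, gpu=0, project="default"):
    """Assemble the argv list for one train+measure run.

    Only flags that appear in the README commands are used.
    """
    return [
        "python", "baddiffusion.py",
        "--project", project,
        "--mode", "train+measure",
        "--dataset", dataset,
        "--batch", str(batch),
        "--epoch", str(epoch),
        "--poison_rate", str(poison_rate),
        "--trigger", trigger,
        "--target", target,
        "--ckpt", ckpt,
        "--fclip", "o", "-o",
        "--gpu", str(gpu),
    ]


if __name__ == "__main__":
    # Hypothetical sweep: these poison rates are illustrative values only.
    for rate in (0.05, 0.1, 0.2):
        argv = build_train_command("CELEBA-HQ", "DDPM-CELEBA-HQ-256",
                                   "GLASSES", "CAT", rate, batch=4)
        # Print the command; to launch it, pass argv to
        # subprocess.run(argv, check=True) from the repository folder.
        print(shlex.join(argv))
```

Building an argument list rather than a single string avoids shell-quoting problems if a value ever contains spaces.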

Measure the FID and MSE Scores

If we want to measure the FID and MSE scores of a backdoored DM, we first need a folder of reference images for the FID computation. For CIFAR10, for example, create a new folder measure/CIFAR10 under this repository folder and copy the training images (in .jpg format) of the CIFAR10 dataset into it. Then, we can use the following command:

python baddiffusion.py --project default --mode measure --dataset CELEBA-HQ --eval_max_batch 256 --trigger GLASSES --target CAT --ckpt res_DDPM-CIFAR10-32_CIFAR10_ep50_c1.0_p0.1_BOX_14-HAT --fclip o -o --gpu 0
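The folder preparation described above can be scripted. The sketch below copies every `.jpg` file from a source directory into `measure/CIFAR10`; the function name `prepare_fid_folder` and the source path are hypothetical, and it assumes the training images have already been exported as `.jpg` files in a single flat directory.

```python
import shutil
from pathlib import Path


def prepare_fid_folder(source_dir, repo_root=".", dataset="CIFAR10"):
    """Copy every .jpg in source_dir into <repo_root>/measure/<dataset>.

    Returns the number of images copied.
    """
    target = Path(repo_root) / "measure" / dataset
    target.mkdir(parents=True, exist_ok=True)
    copied = 0
    for image_path in sorted(Path(source_dir).glob("*.jpg")):
        shutil.copy2(image_path, target / image_path.name)
        copied += 1
    return copied


# Example call, run from the repository folder. The source path is a
# placeholder for wherever the exported training images live:
#   prepare_fid_folder("path/to/cifar10_jpgs")
```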

Generate Samples

If we want to generate clean samples and backdoor targets from a backdoored DM, we can use the following command:

python baddiffusion.py --project default --mode sampling --ckpt res_DDPM-CIFAR10-32_CIFAR10_ep50_c1.0_p0.1_BOX_14-HAT --fclip o --gpu 0
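The checkpoint name passed to `--ckpt` appears to encode the training settings. The sketch below splits such a name into its parts; the pattern is inferred solely from the example name in this README, not from a documented format, and the meaning of the `c1.0` field is not stated here, so it is kept as a raw number under the key `c`.

```python
import re

# Pattern inferred from the single example name in this README (an assumption).
CKPT_PATTERN = re.compile(
    r"^res_(?P<base>.+?)_(?P<dataset>[^_]+)_ep(?P<epoch>\d+)"
    r"_c(?P<c>[\d.]+)_p(?P<poison_rate>[\d.]+)"
    r"_(?P<trigger>.+)-(?P<target>[^-]+)$"
)


def parse_ckpt_name(name):
    """Split a result-checkpoint name into a dict of its apparent fields."""
    match = CKPT_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name!r}")
    fields = match.groupdict()
    fields["epoch"] = int(fields["epoch"])
    fields["c"] = float(fields["c"])
    fields["poison_rate"] = float(fields["poison_rate"])
    return fields


if __name__ == "__main__":
    name = "res_DDPM-CIFAR10-32_CIFAR10_ep50_c1.0_p0.1_BOX_14-HAT"
    for key, value in parse_ckpt_name(name).items():
        print(f"{key}: {value}")
```

Reading the name this way makes it easier to confirm that a checkpoint matches the dataset, trigger, and target given on the command line.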

Run Adversarial Neuron Pruning (ANP)

Arguments

If we want to detect the Trojan of the backdoored model trained in the previous section, we can use the following command:

python anp_defense.py --project default --epoch 5 --learning_rate 1e-4 --perturb_budget 4.0 --ckpt res_DDPM-CIFAR10-32_CIFAR10_ep50_c1.0_p0.1_BOX_14-HAT --gpu 0