
Impress

This is the official repository for "IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI". The paper has been accepted at NeurIPS 2023.

Ethical Statement

We oppose the unauthorized imitation of others' works. Our concern stems from the possibility that image protection technologies may encourage individuals to post their artworks or personal photos online more freely. However, if the protective noise on these works can be easily removed, malicious users could acquire such data more readily, thereby exacerbating abuse. We must confront this potential risk by studying possible removal methods, so as to encourage the development of stronger protection technologies.

Environment Setup

First, install the environment listed in requirements.txt:

pip install -r requirements.txt

Generate Experiment Data

We have conducted experiments with the Glaze and Photoguard methods on subsets of the wikiart dataset and the Helenface dataset, respectively. Below is how to generate the target images used in the experiments.

For the Glaze method, the wikiart dataset and the wikiart csv files need to be downloaded first. After the dataset is downloaded, generate the experiment data using the following command:

python wikiart_preprocessing.py --wikiart_dir=your_wikiart_dir --exp_data_dir=your_exp_data_dir

The generated experimental data will be stored in the your_exp_data_dir/${artist}/clean/train/ directory, where ${artist} is the name of the artist whose works were selected.

For the Photoguard method, the experimental data used can be downloaded through this link.

Glaze

Quick Start

For the Glaze method, to quickly start our experiment, please execute the following command:

bash scripts/new/test_all.sh

Next, we will introduce each step in detail.

Adding Protective Noise to Images

For the Glaze method, it is first necessary to generate the style-transferred images:

python style_transfer.py --exp_data_dir=your_exp_data_dir --artist=artist

Then, execute the following code to add protective noise to the images:

python glaze_origin.py --clean_data_dir=[exp_data_dir]/${artist}/clean/train/ \
                        --trans_data_dir=[exp_data_dir]/preprocessed_data/${artist}/trans/train/transNum24_seed0 \
                        --p=${glaze_p} \
                        --alpha=${glaze_alpha} \
                        --glaze_iters=${glaze_iters} \
                        --lr=${glaze_lr} \
                        --device=${device}

Below is the explanation of input hyperparameters:

Execute Impress

For data protected by Glaze, to use Impress to remove protective noise, execute the following code:

python glaze_pur.py --clean_data_dir=[exp_data_dir]/${artist}/clean/train/ \
                    --trans_data_dir=[exp_data_dir]/${artist}/trans/train/transNum24_seed0 \
                    --pur_eps=${pur_eps} \
                    --pur_lr=${pur_lr} \
                    --pur_iters=${pur_iters} \
                    --pur_alpha=${pur_alpha} \
                    --pur_noise=${pur_noise} \
                    --device=${device} \
                    --adv_para=${adv_para} \
                    --pur_para=${pur_para}

Below is the explanation of input hyperparameters:

Finetune Stable Diffusion Model

After generating the clean target images, the images protected by Glaze, and the images purified by Impress, we need to finetune the Stable Diffusion model separately on each of these image sets:

accelerate launch train_text_to_image.py \
  --pretrained_model_name_or_path='stabilityai/stable-diffusion-2-1-base' \
  --train_data_dir=${TRAIN_DIR} \
  --use_ema \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=${batch_size} \
  --gradient_accumulation_steps=${grad_accum} \
  --gradient_checkpointing \
  --mixed_precision="fp16" \
  --max_train_steps=${step} \
  --learning_rate=5e-6 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --output_dir=${OUTPUT_DIR} \
  --enable_xformers_memory_efficient_attention

Here, TRAIN_DIR is the path to the images used as the finetuning dataset. For explanations of the other parameters, please see https://huggingface.co/docs/diffusers/v0.13.0/en/training/text2image.
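
Since the model is finetuned separately on each image set, the command above is typically run three times with different TRAIN_DIR and OUTPUT_DIR values. Below is a minimal sketch; the directory layout and the names clean, adv, and pur are placeholders that mirror the data types used by the evaluation scripts, so substitute the actual directories produced by the previous steps.

for variant in clean adv pur; do
  TRAIN_DIR="${exp_data_dir}/${artist}/${variant}/train"          # hypothetical layout; use your actual data paths
  OUTPUT_DIR="finetuned_models/${artist}/${variant}_step${step}"  # hypothetical output location
  # ... then run the accelerate launch train_text_to_image.py command above with these variables
done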

Using Stable Diffusion to Generate Images

For Glaze, we need to use the model finetuned in the previous step to generate images. Please execute:

python glaze_test.py \
    --test_data_dir="../wikiart/preprocessed_data/${artist}/clean/test/" \
    --save_dir=your_generate_images_save_dir \
    --checkpoint=your_finetuned_SDmodel_savedir \
    --diff_steps=100 \
    --device=${test_device} 

Evaluation

For Glaze, to calculate the evaluation metrics, please execute:

# CLIP classifier
python clip_classifier.py \
       --all_artists="${all_artists}" \
       --adv_para=${adv_para} \
       --pur_para=${pur_para} \
       --ft_step=${step} \
       --trans_num=24 \
       --manual_seed=0

# Diffusion classifier
python diffusion-classifier/eval_prob_adaptive.py \
       --artist="${artist}" \
       --test_data="${test_data}" \
       --adv_para=${adv_dir} \
       --pur_para=${pur_dir} \
       --ft_step=${step} \
       --trans_num=24 \
       --device="${device}" \
       --manual_seed=0

Here, adv_para and pur_para are the same values passed when executing Impress, and step is the number of steps used when finetuning the model. For the CLIP classifier, all_artists lists all artists to be tested; it must be passed as a single string with names separated by spaces. For the Diffusion classifier, test_data specifies the type of data to be tested and can be clean, adv, or pur.
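
For example, a hedged invocation sketch (the artist identifiers below are placeholders, not the exact names from our experiments):

# all_artists is passed as a single space-separated string; names are placeholders
all_artists="claude-monet paul-cezanne"
python clip_classifier.py --all_artists="${all_artists}" \
       --adv_para=${adv_para} --pur_para=${pur_para} \
       --ft_step=${step} --trans_num=24 --manual_seed=0

# the diffusion classifier evaluates one data type at a time
for test_data in clean adv pur; do
  python diffusion-classifier/eval_prob_adaptive.py --artist="claude-monet" \
         --test_data="${test_data}" --adv_para=${adv_dir} --pur_para=${pur_dir} \
         --ft_step=${step} --trans_num=24 --device="${device}" --manual_seed=0
done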

Photoguard

Quick Start

For the Photoguard method, to quickly start our experiment, please execute the following command:

bash scripts/new/pg_mask_diff_test.sh

Next, we will introduce each step in detail.

Adding Protective Noise to Images

Protective noise can be added to images using the Photoguard method by executing the following code:

python pg_mask_diff_helen.py \
        --attack_type=$attack_type \
        --pg_eps=$pg_eps \
        --pg_step_size=$pg_step_size \
        --pg_iters=$pg_iters 

Below is an explanation of the input hyperparameters:

Execute Impress

For data protected by Photoguard, to use Impress to remove the protective noise, please execute the following code:

python pg_mask_pur_helen.py \
        --attack_type=$attack_type \
        --pg_eps=$pg_eps \
        --pg_step_size=$pg_step_size \
        --pg_iters=$pg_iters \
        --device="cuda:0" \
        --pur_eps=$pur_eps \
        --pur_iters=$pur_iters \
        --pur_lr=$pur_lr \
        --pur_alpha=$pur_alpha 

Below is an explanation of the input hyperparameters:

attack_type, pg_eps, pg_step_size, and pg_iters have the same meanings as when adding protective noise.

Editing Images using Stable Diffusion

To attempt editing the original images, images protected by Photoguard, and images purified by Impress, please execute:

python pg_generate.py \
        --attack_type=$attack_type \
        --pg_eps=$pg_eps \
        --pg_step_size=$pg_step_size \
        --pg_iters=$pg_iters \
        --pur_eps=$pur_eps \
        --pur_iters=$pur_iters \
        --pur_lr=$pur_lr \
        --pur_alpha=$pur_alpha \
        --prompt="${prompt}"

Here, prompt specifies the content of the images generated after editing (the default is "a person in an airplane"); the meanings of the other hyperparameters remain the same as previously described.

Evaluation

For Photoguard, to calculate the evaluation metrics, please execute:

python pg_metric.py \
        --attack_type=$attack_type \
        --pg_eps=$pg_eps \
        --pg_step_size=$pg_step_size \
        --pg_iters=$pg_iters \
        --pur_eps=$pur_eps \
        --pur_iters=$pur_iters \
        --pur_lr=$pur_lr \
        --pur_alpha=$pur_alpha \
        --prompt="${prompt}"

The meanings of all hyperparameters remain the same as previously described.

Common Questions

Discussion on the Protection Effectiveness of Glaze

In our experiments, we observed that the Glaze method we reproduced often requires substantial modifications to the clean image to achieve effective protection. Here are some examples:

[Glaze example 1] [Glaze example 2]

We found that reducing the perturbation budget in the LPIPS loss term does not significantly lessen the alterations Glaze makes to the original image. This may be because the LPIPS budget acts as a soft penalty rather than a hard constraint like an $L_\infty$ ball. As a result, over many rounds of optimization, Glaze can overfit the LPIPS metric, driving the LPIPS loss term below the set perturbation budget while the image content still changes significantly.

It is important to note that the substantial modifications Glaze makes to clean images actually present a more challenging setting for us. This means that images protected by Glaze lose more original information and contain stronger protective noise. An extreme case would be if an image protected by Glaze becomes entirely different from the clean image (e.g., changing from a Monet painting to a Picasso painting), making it impossible to recover the clean image without prior knowledge.

Fine-Tuning Settings

In our experiments related to Glaze, we fully fine-tuned the Stable Diffusion model using clean images, protected images, and purified images. We observed that excessive fine-tuning can lead to model overfitting, resulting in poor image generation quality. Here is an example (fine-tuned with clean images):

[Fine-tuning example]

If you notice significantly poor image quality, consider reducing the number of fine-tuning steps.

Additionally, it is crucial to maintain an effective batch size of 32 when fine-tuning the SD model. The effective batch size is the product of the per-GPU batch size, the gradient accumulation steps, and the number of GPUs used. For instance, in our experiment we used 4 GPUs with a per-GPU batch size of 8 and a gradient accumulation of 1, giving an effective batch size of 4 × 8 × 1 = 32. If you use 2 GPUs with a per-GPU batch size of 8, you should set the gradient accumulation to 2 to keep the effective batch size at 32 (2 × 8 × 2 = 32).
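
A minimal sketch of this bookkeeping, assuming a target effective batch size of 32:

# effective batch size = number of GPUs x per-GPU batch size x gradient accumulation steps
num_gpus=2
per_gpu_batch=8
target_batch=32
grad_accum=$(( target_batch / (num_gpus * per_gpu_batch) ))   # = 2 in this example
echo "--train_batch_size=${per_gpu_batch} --gradient_accumulation_steps=${grad_accum}"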

Acknowledgements

We are grateful to the Glaze team for their communication and for reviewing our code.

Bibtex

If this code has been helpful to you, please consider citing our paper! The BibTeX entry is:

@inproceedings{cao2023impress,
  title={IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI},
  author={Bochuan Cao and Changjiang Li and Ting Wang and Jinyuan Jia and Bo Li and Jinghui Chen},
  booktitle={The 37th Conference on Neural Information Processing Systems (NeurIPS), New Orleans, Louisiana, USA},
  year={2023},
  url={https://arxiv.org/abs/2310.19248}
}