
VGMShield: Mitigating Misuse of Video Generative Models

This repository contains code for fake video detection, fake video source tracing, and misuse prevention. We propose the first pipeline that addresses misuse and safety concerns for video diffusion models.


This repository contains:

  1. Code for training detection and source tracing models on the given data.
  2. Instructions for evaluating detection and source tracing models on given/custom data.
  3. Instructions for applying our misuse prevention strategy.

📄 Table of Contents

- 🛠️ Download Dependencies
- 🚀 Model Training
- 👀 Model Evaluation
- 💪 Misuse Prevention
- 🖊️ Citation
- 🥰 Acknowledgement

🛠️ Download Dependencies

Video generation model dependencies

Our experiments cover nine different generative tasks, and each generation model needs its own environment. Please refer to the respective repositories: Hotshot-xl, I2Vgen-xl, Show-1, Videocrafter, SEINE, LaVie, and Stable Video Diffusion.

Detection and Source tracing model dependencies

This part sets up the environments for the detection and source tracing models.

I3D dependencies

To create the I3D environment from environment_i3d.yml, run:

conda env create -f environment_i3d.yml

X-CLIP and VideoMAE dependencies

To create the X-CLIP and VideoMAE environment from environment_mae.yml, run:

conda env create -f environment_mae.yml

🚀 Model Training

This part provides instructions for training the detection and source tracing models with different backbones.

First, enter the detection and source tracing directory:

cd direction_and_source_tracing

Note: The default source tracing setting uses the nine generation tasks mentioned in our paper. Please modify the code for your own tasks (see the label-mapping sketch after the commands below).

python i3d.py --train True --epoch 20 --learning_rate 1e-5 --save_checkpoint_dir ./save.pt \
    --task "detection" \
    --pre_trained_I3D_model ../models/rgb_imagenet.pt --fake_videos_path "fake videos' path" \
    --real_videos_path "real videos' path" --label_number 2
python i3d.py --train True --epoch 20 --learning_rate 1e-5 --save_checkpoint_dir ./save.pt \
    --task "source_tracing" \
    --pre_trained_I3D_model ../models/rgb_imagenet.pt --fake_videos_path \
    "fake videos generated from model 1" \
    ...
    "fake videos generated from model 9" --label_number 9

Train the detection and source tracing models with VideoMAE as the backbone:

python mae.py --train True --epoch 20 --learning_rate 1e-5 --save_checkpoint_dir ./save.pt \
    --task "detection" \
    --fake_videos_path "fake videos' path" \
    --real_videos_path "real videos' path" --label_number 2
python mae.py --train True --epoch 20 --learning_rate 1e-5 --save_checkpoint_dir ./save.pt \
    --task "source_tracing" \
    --fake_videos_path \
    "fake videos generated from model 1" \
    ...
    "fake videos generated from model 9" --label_number 9

Train the detection and source tracing models with X-CLIP as the backbone:

python xclip.py --train True --epoch 20 --learning_rate 1e-5 --save_checkpoint_dir ./save.pt \
    --task "detection" \
    --fake_videos_path "fake videos' path" \
    --real_videos_path "real videos' path" --label_number 2
python xclip.py --train True --epoch 20 --learning_rate 1e-5 --save_checkpoint_dir ./save.pt \
    --task "source_tracing" \
    --fake_videos_path \
    "fake videos generated from model 1" \
    ...
    "fake videos generated from model 9" --label_number 9

👀 Model Evaluation

After training the detection and source tracing models, you can evaluate their performance here.

We provide pre-trained detection and source tracing checkpoints in our 🤗 Hugging Face repository. Please feel free to use them.

python i3d.py --train False --task "detection" \
    --load_pre_trained_model_state "Your pre-trained model's path" --fake_videos_path \
    "fake video path" \
    --real_videos_path "real video path" --label_number 2
python i3d.py --train False --task "source_tracing" \
    --load_pre_trained_model_state "Your pre-trained model's path" --fake_videos_path \
    "fake videos generated from model 1" \
    ...
    "fake videos generated from model 9" --label_number 9
python mae.py --train False --task "detection" \
    --load_pre_trained_model_state "Your pre-trained model's path" --fake_videos_path \
    "fake video path" \
    --real_videos_path "real video path" --label_number 2
python mae.py --train False --task "source_tracing" \
    --load_pre_trained_model_state "Your pre-trained model's path" --fake_videos_path \
    "fake videos generated from model 1" \
    ...
    "fake videos generated from model 9" --label_number 9
python xclip.py --train False --task "detection" \
    --load_pre_trained_model_state "Your pre-trained model's path" --fake_videos_path \
    "fake video path" \
    --real_videos_path "real video path" --label_number 2
python xclip.py --train False --task "source_tracing" \
    --load_pre_trained_model_state "Your pre-trained model's path" --fake_videos_path \
    "fake videos generated from model 1" \
    ...
    "fake videos generated from model 9" --label_number 9

💪 Misuse Prevention

Note: In our paper, we use the encoders from Stable Video Diffusion to add the perturbation. The environment for our defense methods is the same as that of Stable Video Diffusion.

We provide two defense strategies: directed defense and undirected defense. To use the directed defense approach, run:

python misuse_prevention.py --input_path original_image --tar_img_path target_image --steps iteration_steps --eps 4/255

For undirected defense, run:

python misuse_prevention.py --input_path original_image --directed False --steps iteration_steps --eps 4/255
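Both defenses search for a small perturbation, bounded by --eps, with respect to the Stable Video Diffusion image encoder: the directed defense pulls the protected image's encoding toward a target image's encoding, while the undirected defense pushes it away from the original encoding. Below is a minimal PGD-style sketch of that idea in PyTorch; it is our reconstruction under stated assumptions (the encoder argument stands in for the SVD image encoder, and perturb is a hypothetical helper), not the repository's misuse_prevention.py:

import torch

def perturb(encoder, x, eps=4 / 255, alpha=1 / 255, steps=50, target=None):
    """Return x + delta with ||delta||_inf <= eps (pixels in [0, 1]).

    target given   -> directed: minimize distance to the target's encoding.
    target is None -> undirected: maximize distance from x's own encoding.
    """
    with torch.no_grad():
        ref = encoder(target) if target is not None else encoder(x)
    # Random start inside the eps-ball so the undirected loss has a nonzero gradient.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        dist = (encoder(x + delta) - ref).pow(2).mean()
        loss = dist if target is not None else -dist
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # signed-gradient descent step
            delta.clamp_(-eps, eps)             # project back onto the eps-ball
            delta.grad = None
    return (x + delta).clamp(0, 1).detach()

# Toy usage with a stand-in encoder; a real run would load SVD's image encoder.
enc = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 16))
img = torch.rand(1, 3, 64, 64)
protected = perturb(enc, img)  # undirected defense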

🖊️ Citation

@misc{pang2024vgmshield,
      title={VGMShield: Mitigating Misuse of Video Generative Models}, 
      author={Yan Pang and Yang Zhang and Tianhao Wang},
      year={2024},
      eprint={2402.13126},
      archivePrefix={arXiv},
      primaryClass={cs.CR}
}

🥰 Acknowledgement

We are grateful for the prior open-source work that helped us construct VGMShield, including but not limited to Video Features, VideoX, Hotshot-xl, I2Vgen-xl, Show-1, Videocrafter, SEINE, LaVie, and Stable Video Diffusion. We respect their effort and original contributions.