jeremyxianx / RAWatermark

Official Implementation for: "RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images (Videos) with Provable Guarantees"
MIT License

RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images (Videos) with Provable Guarantees

RAW aims to offer a robust and agile watermarking framework that adapts to the rapidly evolving landscape of digital media creation. As deepfakes and other AI-generated content become increasingly sophisticated and prevalent, RAW's ability to embed imperceptible yet detectable watermarks directly into image and video content provides a crucial tool for content authentication and intellectual property protection.

By offering provable guarantees on false-positive rates and resilience against adversarial attacks, we hope RAW paves the way for a future where the authenticity of digital content can be verified.

This repository contains the source code for the RAWatermark project, based on the paper cited below. To cite this work:

@inproceedings{xian2024rawrobustagileplugandplay,
  title={RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees},
  author={Xian, Xun and Wang, Ganghua and Bi, Xuan and Srinivasa, Jayanth and Kundu, Ashish and Hong, Mingyi and Ding, Jie},
  booktitle={Advances in Neural Information Processing Systems},
  year={2024},
  url={https://arxiv.org/abs/2403.18774}
}

Overview

This is the official implementation of our paper titled "RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images (Videos) with Provable Guarantees" (pdf). The paper introduces an innovative watermarking scheme that is model-agnostic, imperceptible, and operates with zero-bit capacity. It is designed for watermarking videos and images, and is suitable for deployment in real-time scenarios.

Compared to existing encoder-decoder-based watermarking schemes, such as RivaGan, our proposed method offers:

  1. Substantially faster watermark encoding (e.g., approximately $40\times$ faster when watermarking a 25-frame $512 \times 512$ video generated by Stable Video Diffusion);
  2. Support for (1) videos of arbitrary length and (2) tunable watermark strength, both without any extra training;
  3. A provable guarantee on the false-positive rate of watermark detection under distribution-free assumptions (currently only for image watermarks).
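
To make the idea of a tunable watermark strength concrete, here is a toy sketch. It assumes the watermark is a fixed additive pattern scaled by a scalar, which is an illustrative simplification and not a description of how RAW's encoder is implemented. The names `apply_watermark`, `pattern`, and `strength` are hypothetical.

```python
def apply_watermark(frame, pattern, strength):
    """Add a scaled pattern to a frame and clip the result to [0, 1].

    `frame` and `pattern` are 2D lists of floats with the same shape
    (a toy stand-in for a grayscale image; hypothetical representation).
    """
    same_shape = len(frame) == len(pattern) and all(
        len(row) == len(p_row) for row, p_row in zip(frame, pattern)
    )
    if not same_shape:
        raise ValueError("frame and pattern must have the same shape")
    return [
        [min(1.0, max(0.0, pixel + strength * delta))
         for pixel, delta in zip(row, p_row)]
        for row, p_row in zip(frame, pattern)
    ]


frame = [[0.5, 0.5], [0.5, 0.5]]
pattern = [[1.0, -1.0], [-1.0, 1.0]]

# The same fixed pattern is reused; only the scalar changes.
weak = apply_watermark(frame, pattern, strength=0.01)
strong = apply_watermark(frame, pattern, strength=0.05)
```

In this toy model, changing `strength` only rescales an existing pattern, so nothing has to be retrained to make the watermark fainter or stronger.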

Installation

This repository was developed with PyTorch 2.0.1 and should be compatible with newer versions of PyTorch. To set up the environment, first install PyTorch with CUDA support manually, then run the following commands to create a separate Conda environment and install the required packages (both Conda and Git are required).

conda create --name Raw python=3.10
conda activate Raw

git clone https://github.com/jeremyxianx/RAWatermark.git
cd RAWatermark
pip install -r requirement.txt
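
After installation, it can help to confirm that PyTorch is importable and whether it sees a CUDA device. The check below is a minimal sketch that uses only `importlib` and standard PyTorch attributes; `describe_torch_install` is a hypothetical helper name, not part of this repository.

```python
import importlib.util


def describe_torch_install():
    """Return a one-line summary of the local PyTorch installation."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch  # imported lazily so the check itself never fails

    return (
        f"PyTorch {torch.__version__}, "
        f"CUDA available: {torch.cuda.is_available()}"
    )


print(describe_torch_install())
```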

A demo example for video watermarking and detection

In the following, we provide a walk-through example to demonstrate how to use the library to embed and detect watermarks in videos.

  1. First, initialize the RAWatermark instance, which contains a jointly pre-trained watermark and classifier pair.

import torch
from scripts import raw, tools

# Setup device, can be cpu or cuda
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

RAW = raw.RAWatermark(device=device, wm_index=0)

  2. Next, load the video and encode the watermark into each frame of the video. Currently, we only support videos with resolution $512 \times 512$.

demo_videodataset = tools.VideoDataset(
    root_dir='assets/video_examples/',  # replace with your own video folder
    crop_size=False,  # make sure your video is 512x512
    no_of_frames=25,  # replace with the number of frames you want to watermark
)

demo_video1 = demo_videodataset[0].to(device)

wm_demo_video1 = RAW.encode(demo_video1, injection_every_k_frames=1)

  3. Then, check for the presence of the watermark given the decision_thres.

RAW.detect(wm_demo_video1, decision_thres=0.5)
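
The walkthrough above relies on two parameters, `injection_every_k_frames` and `decision_thres`, whose exact semantics are not spelled out here. The sketch below shows one plausible reading in plain Python. Both conventions are assumptions: that every k-th frame starting at index 0 is watermarked, and that per-frame classifier scores are averaged before thresholding. The helper names are hypothetical and are not part of the library.

```python
def frames_to_watermark(num_frames, every_k):
    """Indices of frames selected when every k-th frame is watermarked.

    Assumed convention: selection starts at frame 0.
    """
    if every_k < 1:
        raise ValueError("every_k must be a positive integer")
    return list(range(0, num_frames, every_k))


def is_watermarked(frame_scores, decision_thres=0.5):
    """Average per-frame scores and compare the mean to the threshold.

    Assumed aggregation rule; the real detector may combine frames differently.
    """
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")
    mean_score = sum(frame_scores) / len(frame_scores)
    return mean_score > decision_thres


all_frames = frames_to_watermark(25, every_k=1)    # every frame
sparse_frames = frames_to_watermark(25, every_k=5)  # frames 0, 5, 10, 15, 20

likely_marked = is_watermarked([0.9, 0.8, 0.7], decision_thres=0.5)
likely_clean = is_watermarked([0.1, 0.2, 0.3], decision_thres=0.5)
```

Under these assumptions, a larger `injection_every_k_frames` touches fewer frames, and a larger `decision_thres` makes detection more conservative.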

APIs

We provide several more use cases in the API pages, including:

  1. Adjusting the strength of the watermark;
  2. Watermarking images;
  3. Obtaining a provable guarantee on the false-positive rate of image watermark detection.
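
One standard route to a distribution-free false-positive guarantee is split conformal calibration: collect detector scores on unwatermarked calibration images, then pick the threshold from their order statistics. The sketch below illustrates that general technique. Whether the repository's API follows exactly this recipe is an assumption, and `calibrate_threshold`, `clean_scores`, and `alpha` are hypothetical names.

```python
import math


def calibrate_threshold(clean_scores, alpha):
    """Pick a threshold t from scores of unwatermarked calibration images.

    If a fresh unwatermarked image is exchangeable with the calibration
    images, its score exceeds t with probability at most alpha.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")
    n = len(clean_scores)
    # Equivalent to ceil((n + 1) * (1 - alpha)), written with floor.
    rank = (n + 1) - math.floor((n + 1) * alpha)
    if rank > n:
        # Too few calibration samples for this alpha: never flag anything.
        return math.inf
    return sorted(clean_scores)[rank - 1]


# 19 calibration scores: 0.05, 0.10, ..., 0.95
clean_scores = [i / 20 for i in range(1, 20)]
threshold = calibrate_threshold(clean_scores, alpha=0.1)

# An image is flagged as watermarked only if its score is strictly above t.
flagged = 0.97 > threshold
```

The guarantee is marginal over the calibration set and the test image, and it needs no assumption about the shape of the score distribution, only exchangeability.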

Test benchmarks for video watermarking and detection

We test the trained watermark and its associated classifier (trained on the MS-COCO dataset) on short videos generated by stable-video-diffusion-img2vid-xt, an image-to-video model. The source images for these videos come from the DiffusionDB dataset, which ensures that neither the watermark nor the classifier saw the test videos during training.

Visual Example

[Figure: original frame, watermarked frame, and pixel-wise difference ($\times 6$)]

Encoding Speed (CPU Only)

| Video Resolution | Number of Frames | Method | Time Elapsed |
|---|---|---|---|
| $512 \times 512$ | 24 | RAW (Ours) | 0.2-0.5 s |
| $512 \times 512$ | 24 | RivaGan | 8-12 s |
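
The reported timing ranges imply a range of speedups rather than a single number. The short calculation below works out the bounds from the figures in the table; it uses no data beyond those four values, and the variable names are illustrative.

```python
# Reported CPU-only encoding times in seconds, as (fastest, slowest).
raw_seconds = (0.2, 0.5)
rivagan_seconds = (8.0, 12.0)

# Most conservative case: fastest RivaGan run vs. slowest RAW run.
min_speedup = rivagan_seconds[0] / raw_seconds[1]
# Most favorable case: slowest RivaGan run vs. fastest RAW run.
max_speedup = rivagan_seconds[1] / raw_seconds[0]

print(f"speedup between {min_speedup:.0f}x and {max_speedup:.0f}x")
```

The approximately $40\times$ figure quoted in the overview falls inside this range.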

AUROC (over 500 fresh test samples) for Video Watermark Detection

| Method | AUROC |
|---|---|
| RAW (Ours) | 0.96 |
| RivaGan | 0.97 |
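
For readers who want to reproduce this kind of metric on their own detector scores, AUROC can be computed directly from its definition: the probability that a randomly chosen watermarked sample scores higher than a randomly chosen unwatermarked one, with ties counted as one half. The sketch below is a plain-Python illustration with made-up scores. It is not the evaluation script used for the table.

```python
def auroc(positive_scores, negative_scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Illustrative scores only (not measured data).
watermarked = [0.9, 0.8, 0.4]
unwatermarked = [0.1, 0.3, 0.5]
score = auroc(watermarked, unwatermarked)  # 8 of 9 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of samples, which is fine for a few hundred test videos; a rank-based formulation would scale better for larger sets.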

To-Do list

How to Contribute

We welcome contributions from everyone. Please read our CONTRIBUTING.md file for guidelines on how to contribute to this project.

Code of Conduct

To ensure a welcoming and productive environment, all participants are expected to uphold our Code of Conduct.

Contact

If you have any questions, please feel free to contact us at xian0044@umn.edu or submit an issue.