Detecting Deepfakes with Self-Blended Images

Overview
The official PyTorch implementation for the following paper:

Detecting Deepfakes with Self-Blended Images,
Kaede Shiohara and Toshihiko Yamasaki,
CVPR 2022 Oral

License

Our code and pretrained model are freely available for research purposes.
For commercial use, a license agreement is required.
See the license for more details and contact the author, Kaede Shiohara.

Changelog

3.2.2023: Fixed a bug in the preprocessing code. If you have trouble reproducing the experimental results, we recommend rerunning from the preprocessing step.

13.9.2022: Added inference code for FF++.

10.9.2022: Added weights trained on c23 of FF++.

19.5.2022: Released training/inference code and pretrained weights.

19.4.2022: Pre-released this repository

Recommended Development Environment

GPU: NVIDIA A100
CUDA: 11.1
Docker: 20.10.8

Setup

1. Dataset

Download the datasets and place them in the ./data/ folder.
For example, download Celeb-DF-v2 and place it as follows:

.
└── data
    └── Celeb-DF-v2
        ├── Celeb-real
        │   └── videos
        │       └── *.mp4
        ├── Celeb-synthesis
        │   └── videos
        │       └── *.mp4
        ├── Youtube-real
        │   └── videos
        │       └── *.mp4
        └── List_of_testing_videos.txt

For other datasets, please refer to ./data/datasets.md.
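
Before running inference, it can help to confirm that the folder matches the tree above. The following is a minimal sketch, not part of the repository: the function name `check_celeb_df_layout` and the constants are hypothetical, and the expected entries are simply copied from the Celeb-DF-v2 tree shown above.

```python
from pathlib import Path

# Expected Celeb-DF-v2 entries, copied from the directory tree in this README.
EXPECTED_VIDEO_DIRS = (
    "Celeb-real/videos",
    "Celeb-synthesis/videos",
    "Youtube-real/videos",
)
EXPECTED_FILES = ("List_of_testing_videos.txt",)


def check_celeb_df_layout(root="data/Celeb-DF-v2"):
    """Return a list of problems; an empty list means the layout looks right.

    Hypothetical helper for illustration only, not part of the repository.
    """
    root = Path(root)
    problems = []
    for rel in EXPECTED_VIDEO_DIRS:
        folder = root / rel
        if not folder.is_dir():
            problems.append(f"missing directory: {folder}")
        elif not any(folder.glob("*.mp4")):
            problems.append(f"no .mp4 files in: {folder}")
    for rel in EXPECTED_FILES:
        if not (root / rel).is_file():
            problems.append(f"missing file: {root / rel}")
    return problems


if __name__ == "__main__":
    issues = check_celeb_df_layout()
    print("layout OK" if not issues else "\n".join(issues))
```

Run it from the repository root; any line it prints names a folder or file that still needs to be put in place.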

2. Pretrained model

We provide EfficientNet-B4 weights trained on SBIs from FF-raw and FF-c23.
Download the weights ([raw] or [c23]) and place them in the ./weights/ folder.

3. Docker

In ./exec.sh, replace the absolute path with the path to your copy of this repository.
Then run the scripts:
```
bash build.sh
bash exec.sh
```

Test

For example, to run inference on Celeb-DF-v2 (here and below, replace * with the index of the GPU to use):

CUDA_VISIBLE_DEVICES=* python3 src/inference/inference_dataset.py \
-w weights/FFraw.tar \
-d CDF

The result will be displayed.

Using the provided pretrained model, our cross-dataset results are reproduced as follows:

Training Data	CDF	DFD	DFDC	DFDCP	FFIW
FF-raw	93.82%	97.87%	73.01%	85.70%	84.52%
FF-c23	92.87%	98.16%	71.96%	85.51%	83.22%
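
As a quick way to compare the two rows, the sketch below averages each row and computes the per-dataset gap. The numbers are copied from the table above; the mean over test sets is only a convenience summary and is not a figure reported in this README, and the names `RESULTS` and `mean_score` are hypothetical.

```python
# Scores (in percent) copied from the table in this README.
RESULTS = {
    "FF-raw": {"CDF": 93.82, "DFD": 97.87, "DFDC": 73.01, "DFDCP": 85.70, "FFIW": 84.52},
    "FF-c23": {"CDF": 92.87, "DFD": 98.16, "DFDC": 71.96, "DFDCP": 85.51, "FFIW": 83.22},
}


def mean_score(row):
    """Unweighted mean over the test sets in one table row."""
    return sum(row.values()) / len(row)


# Per-dataset difference: positive means the FF-raw weights score higher.
GAPS = {
    name: round(RESULTS["FF-raw"][name] - RESULTS["FF-c23"][name], 2)
    for name in RESULTS["FF-raw"]
}

if __name__ == "__main__":
    for training_data, row in RESULTS.items():
        print(f"{training_data}: mean {mean_score(row):.2f}%")
    print(GAPS)
```

With these values, the FF-raw weights score higher on every test set except DFD.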

We also provide inference code for a single video:

CUDA_VISIBLE_DEVICES=* python3 src/inference/inference_video.py \
-w weights/FFraw.tar \
-i /path/to/video.mp4

and for a single image:

CUDA_VISIBLE_DEVICES=* python3 src/inference/inference_image.py \
-w weights/FFraw.tar \
-i /path/to/image.png
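
The three commands above differ only in the script name and in the flag that carries the target (`-d` for a dataset, `-i` for a file). A minimal sketch of a wrapper that builds and runs them from Python follows. The helper names and the default GPU index `"0"` are assumptions made for illustration; only the script paths and flags come from the commands above.

```python
import os
import shlex
import subprocess

# Script paths and target flags as shown in the commands in this README.
SCRIPTS = {
    "dataset": ("src/inference/inference_dataset.py", "-d"),
    "video": ("src/inference/inference_video.py", "-i"),
    "image": ("src/inference/inference_image.py", "-i"),
}


def build_inference_command(mode, target, weights="weights/FFraw.tar"):
    """Build the argument list for one of the three inference scripts."""
    if mode not in SCRIPTS:
        raise ValueError(f"mode must be one of {sorted(SCRIPTS)}, got {mode!r}")
    script, target_flag = SCRIPTS[mode]
    return ["python3", script, "-w", weights, target_flag, target]


def run_inference(mode, target, weights="weights/FFraw.tar", gpu="0"):
    """Run the script from the repository root.

    `gpu` is an assumed default; set it to the GPU index you want to use.
    """
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    command = build_inference_command(mode, target, weights)
    return subprocess.run(command, env=env, check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try anywhere.
    print(shlex.join(build_inference_command("video", "/path/to/video.mp4")))
```

Passing the arguments as a list avoids shell quoting problems when a video or image path contains spaces.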

Training

Download the FF++ real videos and place them in the ./data/ folder:

.
└── data
    └── FaceForensics++
        ├── original_sequences
        │   └── youtube
        │       └── raw
        │           └── videos
        │               └── *.mp4
        ├── train.json
        ├── val.json
        └── test.json

Download the landmark detector (shape_predictor_81_face_landmarks.dat) from here and place it in the ./src/preprocess/ folder.

Run the following two scripts to extract video frames, landmarks, and bounding boxes:

python3 src/preprocess/crop_dlib_ff.py -d Original
CUDA_VISIBLE_DEVICES=* python3 src/preprocess/crop_retina_ff.py -d Original

(Optional) You can download code for landmark augmentation:
```
mkdir src/utils/library
git clone https://github.com/AlgoHunt/Face-Xray.git src/utils/library
```
The training code runs without errors even if you skip this step, but the trained model is expected to perform worse than with landmark augmentation.

Run the training:
```
CUDA_VISIBLE_DEVICES=* python3 src/train_sbi.py \
src/configs/sbi/base.json \
-n sbi
```
The top five checkpoints will be saved in the ./output/ folder. As described in our paper, we use the latest one for evaluation.
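
To pick that checkpoint programmatically, one option is to take the most recently modified file under ./output/. This is a sketch under two assumptions that this README does not confirm: that checkpoints are saved with a `.tar` extension (as the released weights are), and that file modification time is a usable proxy for "latest". The function name `latest_checkpoint` is hypothetical.

```python
from pathlib import Path


def latest_checkpoint(output_dir="output", pattern="**/*.tar"):
    """Return the most recently modified checkpoint under output_dir, or None.

    Assumes checkpoints match `pattern`; adjust it to the actual file names.
    """
    candidates = [p for p in Path(output_dir).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    checkpoint = latest_checkpoint()
    print(checkpoint if checkpoint is not None else "no checkpoint found")
```

The printed path can then be passed to the inference scripts with `-w`. If checkpoints are copied or moved after training, modification times may no longer reflect training order, so check the file names as well.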

Citation

If you find our work useful for your research, please consider citing our paper:

@inproceedings{shiohara2022detecting,
  title={Detecting Deepfakes with Self-Blended Images},
  author={Shiohara, Kaede and Yamasaki, Toshihiko},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={18720--18729},
  year={2022}
}
