This is the official implementation of Detecting and Recovering Sequential DeepFake Manipulation. We introduce a novel research problem: Detecting Sequential DeepFake Manipulation (Seq-DeepFake), which focuses on detecting the sequences of multi-step facial manipulations. To facilitate the study of Seq-DeepFake, we provide a large-scale Sequential DeepFake dataset and propose a concise yet effective Seq-DeepFake Transformer (SeqFakeFormer).
The framework of the proposed method:
git clone https://github.com/rshao/SeqDeepFake.git
cd SeqDeepFake
We recommend using Anaconda to manage the Python environment:
conda create -n seqdeepfake python=3.6
conda activate seqdeepfake
conda install -c pytorch pytorch=1.6.0 torchvision=0.7.0 cudatoolkit==10.1.243
conda install pandas
conda install tqdm
conda install pillow
pip install tensorboard==2.4.1
We contribute the first large-scale sequential DeepFake dataset, Seq-DeepFake, comprising ~85k sequentially manipulated face images, each annotated with its ground-truth manipulation sequence.
The images are generated with two different facial manipulation methods, yielding 28 and 26 types of manipulation sequences (including original), respectively. The lengths of the manipulation sequences range from 1 to 5.
Here are some sample images and statistics:
Each image in the dataset is annotated with a list of length 5, indicating the ground-truth manipulation sequence. The labels in the sequence are defined as follows:
For Sequential facial components manipulation:
0: 'NA', 1: 'nose', 2: 'eye', 3: 'eyebrow', 4: 'lip', 5: 'hair'
Note: 'NA' means no manipulation is applied at this step.
For Sequential facial attributes manipulation:
0: 'NA', 1: 'Bangs', 2: 'Eyeglasses', 3: 'Beard', 4: 'Smiling', 5: 'Young'
Note: 'NA' means no manipulation is applied at this step.
Note that label 0 serves as the placeholder for manipulation sequences shorter than 5 steps. For example, the annotation for the manipulation sequence nose-eye-lip is [1, 2, 4, 0, 0], and original images are annotated with [0, 0, 0, 0, 0].
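As a concrete illustration, the label scheme for the facial-components track can be encoded and decoded with a few lines of Python. This is a minimal sketch using only the label map listed above; the `encode`/`decode` helpers are hypothetical names, not part of the released code:

```python
# Label map for sequential facial components manipulation (from the list above).
COMPONENT_LABELS = {'NA': 0, 'nose': 1, 'eye': 2, 'eyebrow': 3, 'lip': 4, 'hair': 5}
ID_TO_NAME = {v: k for k, v in COMPONENT_LABELS.items()}
SEQ_LEN = 5  # every annotation is padded to 5 steps

def encode(sequence):
    """Map a manipulation sequence such as ['nose', 'eye', 'lip']
    to its fixed-length annotation [1, 2, 4, 0, 0]."""
    labels = [COMPONENT_LABELS[name] for name in sequence]
    return labels + [0] * (SEQ_LEN - len(labels))  # pad with 'NA' (0)

def decode(labels):
    """Recover the manipulation names, dropping the 'NA' placeholders
    (which only occur as trailing padding in this dataset)."""
    return [ID_TO_NAME[l] for l in labels if l != 0]

print(encode(['nose', 'eye', 'lip']))  # [1, 2, 4, 0, 0]
print(decode([1, 2, 4, 0, 0]))         # ['nose', 'eye', 'lip']
print(encode([]))                      # original image: [0, 0, 0, 0, 0]
```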
You can download the Seq-DeepFake dataset through this link: [Dataset]
After unzipping all sub-archives, the structure of the dataset should be as follows:
./
├── facial_attributes
│   ├── annotations
│   │   ├── train.csv
│   │   ├── test.csv
│   │   └── val.csv
│   └── images
│       ├── train
│       │   ├── Bangs-Eyeglasses-Smiling-Young
│       │   │   ├── xxxxxx.jpg
│       │   │   ...
│       │   │   └── xxxxxx.jpg
│       │   ...
│       │   ├── Young-Smiling-Eyeglasses
│       │   │   ├── xxxxxx.jpg
│       │   │   ...
│       │   │   └── xxxxxx.jpg
│       │   └── original
│       │       ├── xxxxxx.jpg
│       │       ...
│       │       └── xxxxxx.jpg
│       ├── test
│       │   % the same structure as in train
│       └── val
│           % the same structure as in train
└── facial_components
    ├── annotations
    │   ├── train.csv
    │   ├── test.csv
    │   └── val.csv
    └── images
        ├── train
        │   ├── eyebrow-eye-hair-nose-lip
        │   │   ├── xxxxxx.jpg
        │   │   ...
        │   │   └── xxxxxx.jpg
        │   ...
        │   ├── nose-eyebrow-lip-eye-hair
        │   │   ├── xxxxxx.jpg
        │   │   ...
        │   │   └── xxxxxx.jpg
        │   └── original
        │       ├── xxxxxx.jpg
        │       ...
        │       └── xxxxxx.jpg
        ├── test
        │   % the same structure as in train
        └── val
            % the same structure as in train
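Once downloaded, the annotation files can be loaded with pandas (installed by the environment setup above). The sketch below parses a small inline sample; the column names and layout here are an assumption for illustration, so inspect the actual train.csv/val.csv/test.csv before adapting it:

```python
import io
import pandas as pd

# Hypothetical annotation layout: one image path plus five label columns.
# The real CSV schema may differ -- check the downloaded files first.
sample_csv = io.StringIO(
    "image,label_0,label_1,label_2,label_3,label_4\n"
    "train/nose-eye-lip/000001.jpg,1,2,4,0,0\n"
    "train/original/000002.jpg,0,0,0,0,0\n"
)

df = pd.read_csv(sample_csv)
label_cols = [f"label_{i}" for i in range(5)]
sequences = df[label_cols].values.tolist()  # one 5-step sequence per image

print(sequences[0])  # [1, 2, 4, 0, 0]
```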
Modify train.sh and run:
sh train.sh
Please refer to the following instructions about some arguments:
Args | Description |
---|---|
CONFIG | Path of the network and optimization configuration file. |
DATA_DIR | Directory to the downloaded dataset. |
DATASET_NAME | Name of the selected manipulation type. Choose from 'facial_components' and 'facial_attributes'. |
RESULTS_DIR | Directory to save logs and checkpoints. |
You can change the network and optimization configurations by adding new configuration files under the directory ./configs/.
We also provide a Slurm script that supports multi-GPU training:
sh train_slurm.sh
where PARTITION and NODE should be modified according to your own environment. The number of GPUs to use can be set through the NUM_GPU argument.
Modify test.sh and run:
sh test.sh
For the arguments in test.sh, please refer to the training instructions above, plus the following ones:
Args | Description |
---|---|
TEST_TYPE | The evaluation metric to use. Choose from 'fixed' and 'adaptive'. |
LOG_NAME | Should be set according to the log_name of the trained checkpoint to be tested. |
We also provide a Slurm script for testing:
sh test_slurm.sh
Here we list the performance of three state-of-the-art deepfake detection methods and our method. Please refer to our paper for more details.

Sequential facial components manipulation:
Method | Reference | Fixed-Acc ${\uparrow}$ | Adaptive-Acc ${\uparrow}$ |
---|---|---|---|
DRN | Wang et al. | 66.06 | 45.79 |
MA | Zhao et al. | 71.31 | 52.94 |
Two-Stream | Luo et al. | 71.92 | 53.89 |
SeqFakeFormer | Shao et al. | 72.65 | 55.30 |
Sequential facial attributes manipulation:
Method | Reference | Fixed-Acc ${\uparrow}$ | Adaptive-Acc ${\uparrow}$ |
---|---|---|---|
DRN | Wang et al. | 64.42 | 43.20 |
MA | Zhao et al. | 67.58 | 47.48 |
Two-Stream | Luo et al. | 66.77 | 46.38 |
SeqFakeFormer | Shao et al. | 68.86 | 49.63 |
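The two accuracy columns can be illustrated with a rough sketch. This is our hedged reading, not the paper's reference implementation: we assume Fixed-Acc is position-wise accuracy over the full padded length of 5, and Adaptive-Acc compares only over the longer of the two unpadded sequence lengths; consult the paper and the evaluation code for the exact definitions.

```python
def unpadded_length(seq):
    """Number of leading non-'NA' steps (0 is the padding label)."""
    n = 0
    for label in seq:
        if label == 0:
            break
        n += 1
    return n

def fixed_acc(pred, gt):
    """Position-wise accuracy over the full padded length (assumption)."""
    return sum(p == g for p, g in zip(pred, gt)) / len(gt)

def adaptive_acc(pred, gt):
    """Accuracy over the longer of the two unpadded lengths (assumption),
    so trailing padding does not inflate the score."""
    n = max(unpadded_length(pred), unpadded_length(gt), 1)
    return sum(p == g for p, g in zip(pred[:n], gt[:n])) / n

# A prediction that misses the third manipulation step:
print(fixed_acc([1, 2, 0, 0, 0], [1, 2, 4, 0, 0]))     # 0.8 (4 of 5 slots match)
print(adaptive_acc([1, 2, 0, 0, 0], [1, 2, 4, 0, 0]))  # ~0.667 (2 of 3 real steps)
```

Under this reading, Adaptive-Acc is the stricter metric, which matches its consistently lower values in the tables above.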
We also provide the pretrained models that generated our results in the benchmark tables:
Model | Description |
---|---|
pretrained-r50-c | Trained on facial_components with resnet50 backbone. |
pretrained-r50-a | Trained on facial_attributes with resnet50 backbone. |
To try the pre-trained checkpoints, download them from the links in the table, unzip the files, and put them under the ./results folder with the following structure:
results
└── resnet50
    ├── facial_attributes
    │   └── pretrained-r50-a
    │       └── snapshots
    │           ├── best_model_adaptive.pt
    │           └── best_model_fixed.pt
    └── facial_components
        └── pretrained-r50-c
            └── snapshots
                ├── best_model_adaptive.pt
                └── best_model_fixed.pt
In test.sh, modify DATA_DIR to point to the root of your Seq-DeepFake dataset, and set LOG_NAME and DATASET_NAME to 'pretrained-r50-c' and 'facial_components', or to 'pretrained-r50-a' and 'facial_attributes', respectively. Then run test.sh.
If you find this work useful for your research, please kindly cite our paper:
@inproceedings{shao2022seqdeepfake,
title={Detecting and Recovering Sequential DeepFake Manipulation},
author={Shao, Rui and Wu, Tianxing and Liu, Ziwei},
booktitle={European Conference on Computer Vision (ECCV)},
year={2022}
}