This is the official repository of Video Face Manipulation Detection Through Ensemble of CNNs, presented at ICPR2020 and currently available on IEEE Xplore and arXiv. If you use this repository for your research, please consider citing our paper. Refer to the How to cite section for the correct entry for your bibliography.
We participated as the ISPL team in the Kaggle Deepfake Detection Challenge. With this implementation, we reached 41st place out of 2116 teams (top 2%) on the private leaderboard.
This repository is currently under maintenance. If you experience any problems, please open an issue.
Create the icpr2020 conda environment with environment.yml:
$ conda env create -f environment.yml
$ conda activate icpr2020
If you just want to test the pre-trained models against your own videos or images:
You need to preprocess the datasets in order to index all the samples and extract faces. To do so, run the script make_dataset.sh:
$ ./scripts/make_dataset.sh
Please note that we use only 32 frames per video. You can easily tweak this parameter in extract_faces.py.
Also, please note that for DFDC we used the training split exclusively.
In scripts/make_dataset.sh, the value of DFDC_SRC should point to the directory containing the DFDC train split.
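To illustrate what a fixed budget of 32 frames per video can look like in practice, here is a minimal sketch that picks evenly spaced frame indices. The function name `sample_frame_indices` is hypothetical, and even spacing is an assumption made for illustration; check extract_faces.py for the strategy actually used in this repository.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_frames: int = 32) -> List[int]:
    """Pick up to num_frames evenly spaced frame indices from a video.

    Hypothetical helper: even spacing is an assumption, not necessarily
    the policy implemented in extract_faces.py.
    """
    if total_frames <= 0 or num_frames <= 0:
        return []
    # Short videos: keep every frame.
    if total_frames <= num_frames:
        return list(range(total_frames))
    if num_frames == 1:
        return [0]
    # Spread the indices from the first frame to the last one.
    step = (total_frames - 1) / (num_frames - 1)
    return [round(i * step) for i in range(num_frames)]
```

With a 300-frame video this yields 32 indices that start at frame 0 and end at frame 299; a video shorter than 32 frames is returned whole.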
Although we did not use the Celeb-DF dataset in the paper, we provide a script index_celebdf.py to index the videos similarly to DFDC and FF++. Once you have the index, you can proceed with the pipeline starting from extract_faces.py. You can also use the split celebdf during training/testing.
In train_all.sh you can find a comprehensive list of all the commands to train the models presented in the paper. Please refer to the comments in the script for hints on their usage.
If you want to train some models without launching the script:
python train_triplet.py --net EfficientNetB4 --otherparams
python train_binclass.py --net EfficientNetB4ST --init path/to/EfficientNetB4/weights/trained/with/train_triplet/weights.pth --otherparams
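The two commands above form a two-stage procedure: the weights produced by train_triplet.py are passed to train_binclass.py through `--init`. As a rough sketch, the pair could be assembled from Python as follows. The helper name `build_two_stage_commands` is hypothetical, only the `--net` and `--init` flags shown above are used, and every other training parameter is deliberately left out.

```python
import sys
from typing import List


def build_two_stage_commands(init_weights: str) -> List[List[str]]:
    """Return the triplet and binary-classification commands, in order.

    Hypothetical helper: init_weights is the path to the weights saved by
    the triplet stage. Other training parameters are omitted on purpose.
    """
    triplet_cmd = [sys.executable, "train_triplet.py", "--net", "EfficientNetB4"]
    binclass_cmd = [
        sys.executable, "train_binclass.py",
        "--net", "EfficientNetB4ST",
        "--init", init_weights,
    ]
    return [triplet_cmd, binclass_cmd]


if __name__ == "__main__":
    # Placeholder path copied from the command above; replace it with your own.
    weights = "path/to/EfficientNetB4/weights/trained/with/train_triplet/weights.pth"
    for cmd in build_two_stage_commands(weights):
        # Print only; pass cmd to subprocess.run(cmd, check=True) to execute it.
        print(" ".join(cmd))
```

Keeping each command as a list of arguments avoids shell quoting problems if the weights path contains spaces.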
In test_all.sh you can find a comprehensive list of all the commands for testing the models presented in the paper.
We also provide pretrained weights for all the architectures presented in the paper.
Please refer to this Dropbox link.
Each directory is named $NETWORK_$DATASET, where $NETWORK is the architecture name and $DATASET is the training dataset.
In each directory, you can find bestval.pth, which contains the best network weights according to the validation set.
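The naming convention above can be handled with a couple of small helpers. This is only a sketch: `bestval_path` and `split_model_dir` are hypothetical names, and splitting on the last underscore assumes that dataset names contain no underscore themselves.

```python
from pathlib import Path
from typing import Tuple


def bestval_path(weights_root: str, network: str, dataset: str) -> Path:
    """Build <weights_root>/<NETWORK>_<DATASET>/bestval.pth."""
    return Path(weights_root) / f"{network}_{dataset}" / "bestval.pth"


def split_model_dir(dir_name: str) -> Tuple[str, str]:
    """Split a directory name into (network, dataset).

    Assumption: the dataset name has no underscore, so the last
    underscore separates the two parts.
    """
    network, sep, dataset = dir_name.rpartition("_")
    if not sep or not network or not dataset:
        raise ValueError(f"unexpected directory name: {dir_name!r}")
    return network, dataset
```

For example, `bestval_path("weights", "EfficientNetB4", "DFDC")` points to `weights/EfficientNetB4_DFDC/bestval.pth`, where `weights` stands for wherever you unpacked the downloaded directories.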
Additionally, you can find Jupyter notebooks for computing the results in the notebook folder.
Plain text:
N. Bonettini, E. D. Cannas, S. Mandelli, L. Bondi, P. Bestagini and S. Tubaro, "Video Face Manipulation Detection Through Ensemble of CNNs," 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 5012-5019, doi: 10.1109/ICPR48806.2021.9412711.
Bibtex:
@INPROCEEDINGS{9412711,
author={Bonettini, Nicolò and Cannas, Edoardo Daniele and Mandelli, Sara and Bondi, Luca and Bestagini, Paolo and Tubaro, Stefano},
booktitle={2020 25th International Conference on Pattern Recognition (ICPR)},
title={Video Face Manipulation Detection Through Ensemble of CNNs},
year={2021},
volume={},
number={},
pages={5012-5019},
doi={10.1109/ICPR48806.2021.9412711}}
Image and Sound Processing Lab - Politecnico di Milano