davide-coccomini / Combining-EfficientNet-and-Vision-Transformers-for-Video-Deepfake-Detection

Code for Video Deepfake Detection model from "Combining EfficientNet and Vision Transformers for Video Deepfake Detection" presented at ICIAP 2021.
https://dl.acm.org/doi/abs/10.1007/978-3-031-06433-3_19
MIT License

test AUC #48

Open zxq99799 opened 1 year ago

zxq99799 commented 1 year ago

Hello, I trained on a total of 110,000 real images and 100,000 fake images from DFDC and FF++, but the final test only achieved an AUC of 0.885. Can you give me some suggestions? Thank you.

davide-coccomini commented 1 year ago

Hi, which version of the model are you using?

zxq99799 commented 1 year ago

I'm using the efficient-vit model.

davide-coccomini commented 10 months ago

The EfficientViT obtains an AUC of 0.919, which is not too far from yours. If I understand correctly, your training set is not complete, so it is normal to obtain different results. Anyway, if you want to improve further, I suggest using Cross Efficient ViT, which is our main method.
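
For anyone comparing their own number against the reported one, it can help to check that the AUC is computed the same way on both sides. Below is a minimal sketch of ROC AUC as a pairwise ranking statistic, in plain Python. It is not the repository's evaluation code: the function name `roc_auc`, the label convention (1 = fake, 0 = real), and the toy scores are all assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (fake, real) pairs ranked correctly.

    Assumed convention (hypothetical, not taken from the repo):
    label 1 = fake, label 0 = real, higher score = more likely fake.
    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(roc_auc(labels, scores))
```

The double loop is quadratic in the number of samples, so it is only meant for sanity checks on small subsets. Note that the AUC also depends on the test set used, so two AUC values are only comparable when they are computed on the same test data.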