davide-coccomini / Combining-EfficientNet-and-Vision-Transformers-for-Video-Deepfake-Detection

Code for Video Deepfake Detection model from "Combining EfficientNet and Vision Transformers for Video Deepfake Detection" presented at ICIAP 2021.
https://dl.acm.org/doi/abs/10.1007/978-3-031-06433-3_19
MIT License

About data preprocessing #45

Open LonelyPlanetIoT opened 1 year ago

LonelyPlanetIoT commented 1 year ago

I have a question about the data preprocessing. I noticed that you resize the frame in face_detector.py at line 68 when creating the VideoDataset. Will this decrease the final result? And is the resize operation necessary if I want to use the same preprocessing for another model?

davide-coccomini commented 1 year ago

This part has been hugely inspired by Selim Seferbekov's work: https://github.com/selimsef/dfdc_deepfake_challenge/blob/master/preprocessing/face_detector.py We kept the frame resize to remain fully comparable; in any case, we think it would not affect the result. Maybe you can try removing it and see what happens.
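For context, here is a rough sketch of the kind of frame downscaling being discussed, assuming OpenCV/PIL-based frame reading in the style of Selim Seferbekov's preprocessing; this is illustrative, not the repository's exact code. Downscaling mainly cuts memory use and speeds up face detection, and the detected boxes can later be rescaled to the original resolution.

```python
import cv2
from PIL import Image

def read_frames_downscaled(video_path, scale=0.5):
    """Yield (index, PIL image) pairs with every frame downscaled by `scale`.

    `scale` is an illustrative parameter, not a value taken from the repository.
    """
    capture = cv2.VideoCapture(video_path)
    n_frames = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
    for i in range(n_frames):
        capture.grab()
        success, frame = capture.retrieve()
        if not success:
            continue
        # OpenCV reads BGR; convert to RGB before handing the frame to a detector.
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        image = Image.fromarray(frame)
        # Downscale before detection to reduce memory and computation.
        image = image.resize([int(s * scale) for s in image.size])
        yield i, image
    capture.release()
```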

LonelyPlanetIoT commented 1 year ago

Thanks a lot

LonelyPlanetIoT commented 1 year ago

I tried to remove the resize operation and got an error like the one in the attached screenshot. Maybe without the resize the machine doesn't have enough resources to run the code?

davide-coccomini commented 1 year ago

Can you provide the full stack trace?

LonelyPlanetIoT commented 1 year ago

Sorry for replying so late. In the end, I didn't change it. But I have another question. I use MTCNN to extract faces, with the thresholds set the same as yours. With these settings, some videos, such as NT/808_829.mp4, do not yield as many faces as the paper mentions. I tried decreasing the thresholds, but then some of the extracted frames are not faces. What should I do?
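For reference, a minimal sketch of one possible workaround, assuming facenet-pytorch's MTCNN is the detector in use: lower the stage thresholds to recover more candidate faces, then filter the detections by their returned probabilities with a separate confidence cut-off so obvious non-face crops are discarded. The threshold values and the `MIN_CONFIDENCE` cut-off below are illustrative, not taken from the repository.

```python
from facenet_pytorch import MTCNN
from PIL import Image

# Lowered per-stage thresholds (example values) to catch more candidate faces.
detector = MTCNN(thresholds=[0.5, 0.6, 0.6], keep_all=True, device='cuda:0')

# Hypothetical cut-off used to reject detections that are probably not faces.
MIN_CONFIDENCE = 0.95

def detect_faces(frame: Image.Image):
    """Return bounding boxes whose detection probability exceeds the cut-off."""
    boxes, probs = detector.detect(frame)
    if boxes is None:
        return []
    return [box for box, prob in zip(boxes, probs) if prob >= MIN_CONFIDENCE]
```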