davide-coccomini / Combining-EfficientNet-and-Vision-Transformers-for-Video-Deepfake-Detection

Code for Video Deepfake Detection model from "Combining EfficientNet and Vision Transformers for Video Deepfake Detection" presented at ICIAP 2021.
https://dl.acm.org/doi/abs/10.1007/978-3-031-06433-3_19
MIT License

DFDC Pre-processing timeline #42

Closed Aaditya-Kharel closed 1 year ago

Aaditya-Kharel commented 1 year ago

Hi Davide, when you pre-processed the 50 subfolders of DFDC, did you do it sequentially or in parallel? How much time did it take you to pre-process the entire DFDC dataset (just to obtain the .json, not the images from the .json)? If you did it in parallel, how did you achieve parallelism in this case? Currently, it's taking me approximately 3 hours to pre-process 1,300 videos (that is, 8-9 seconds per video). Is this the expected behavior when running on an Nvidia A30? At this rate, it will take me about 13 days to pre-process all 120,000 videos in DFDC just to obtain the .json (that is, to run detect_faces.py on the entire DFDC). I downloaded the c23 DFDC videos. Please let me know whether this is the expected behavior. Also, I would really appreciate it if you could suggest ways to speed this up that worked for you, so that pre-processing does not take days. Thanks!
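For reference, the estimate above can be reproduced with a few lines of arithmetic. This is only a sketch using the figures quoted in the question (1,300 videos in 3 hours, 120,000 videos in total); the function name `estimate_days` is made up for illustration and is not part of the repository.

```python
def estimate_days(videos_done, hours_elapsed, total_videos):
    """Extrapolate total processing time from an observed sample run."""
    seconds_per_video = hours_elapsed * 3600 / videos_done
    total_days = seconds_per_video * total_videos / 86400  # 86400 s per day
    return seconds_per_video, total_days


# Figures taken from the question: 1,300 videos in about 3 hours,
# roughly 120,000 videos in the full dataset.
sec_per_video, days = estimate_days(1300, 3, 120_000)
print(f"{sec_per_video:.1f} s/video, about {days:.1f} days in total")
# prints "8.3 s/video, about 11.5 days in total"
```

At the upper end of the quoted range (9 seconds per video) the same calculation gives 12.5 days, so the figure of about 13 days is in the right range.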

davide-coccomini commented 1 year ago

Hi, sorry for the late reply. I used an Nvidia Tesla T4, and the code for detecting faces is already parallelized. It took me a few days (maybe 3 or 4) to extract all the boxes from the videos. It is a pretty expensive process.
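One generic way to shorten the wall-clock time, independent of how detect_faces.py parallelizes internally, is to split the subfolders into disjoint shards and run one job per shard (for example, on separate GPUs or machines). The sketch below only shows the splitting step. The helper `shard_paths` and the folder names are hypothetical and are not taken from the repository, and how each shard is then passed to the detection script is left open.

```python
from pathlib import Path


def shard_paths(paths, num_shards, shard_index):
    """Return every num_shards-th path, starting at shard_index.

    Sorting first makes the split deterministic, so independent jobs
    that call this with different shard_index values never overlap.
    """
    if not 0 <= shard_index < num_shards:
        raise ValueError("shard_index must be in [0, num_shards)")
    return sorted(paths)[shard_index::num_shards]


# Hypothetical folder names, used only for illustration.
subfolders = [Path(f"dfdc_part_{i}") for i in range(50)]

num_workers = 4  # assumption: four independent jobs
shards = [shard_paths(subfolders, num_workers, w) for w in range(num_workers)]
for worker, shard in enumerate(shards):
    print(f"worker {worker}: {len(shard)} subfolders")
```

With 50 subfolders and four jobs, two shards get 13 subfolders and two get 12. Whether this actually helps depends on whether the GPU, the disk, or video decoding is the bottleneck.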