ControlNet / LAV-DF

[CVIU] Glitch in the Matrix: A Large Scale Benchmark for Content Driven Audio-Visual Forgery Detection and Localization
https://www.sciencedirect.com/science/article/pii/S1077314223001984

Facing problem with preprocessing for the single video inference #4

Closed muhammadarslanshahzad closed 1 year ago

muhammadarslanshahzad commented 1 year ago

I need help preprocessing a video and generating the metadata file so that I can use the batfd model for single-video inference and classification.

ControlNet commented 1 year ago

Please crop and align the face video following the VoxCeleb2 preprocessing method. As for the metadata, please follow the field descriptions to build the JSON file yourself.
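For anyone landing here later, a minimal sketch of building such a JSON file for one video might look like the following. The field names are my assumption of the dataset's metadata schema and are not confirmed in this thread, so check them against the repository's field descriptions. The helper names (`build_metadata_entry`, `write_metadata`), the file path, and the numeric values are hypothetical. Because the video is unlabeled at inference time, the forgery-related fields are filled with neutral placeholders.

```python
import json
from pathlib import Path


def build_metadata_entry(file, duration, video_frames, audio_frames,
                         audio_channels=1, split="test"):
    """Return one metadata record for an unlabeled video.

    All keys below are ASSUMED field names; verify them against the
    repository's field descriptions before use.
    """
    return {
        "file": file,                  # path to the cropped/aligned video
        "n_fakes": 0,                  # unknown at inference time, placeholder
        "fake_periods": [],            # no labeled fake segments
        "duration": duration,          # seconds
        "original": None,              # no known source video
        "modify_video": False,         # placeholder label
        "modify_audio": False,         # placeholder label
        "split": split,
        "video_frames": video_frames,
        "audio_channels": audio_channels,
        "audio_frames": audio_frames,  # number of audio samples
    }


def write_metadata(entries, path):
    """Write a list of metadata records to a JSON file."""
    Path(path).write_text(json.dumps(entries, indent=2))


# Hypothetical values for a 4-second clip; measure the real ones from your video.
entry = build_metadata_entry(
    "test/my_video.mp4", duration=4.0, video_frames=100, audio_frames=64000
)
write_metadata([entry], "metadata.json")
```

The real duration, frame count, and audio sample count have to be read from the preprocessed video itself; they are hard-coded here only to keep the sketch self-contained.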