Closed muhammadarslanshahzad closed 1 year ago
Please crop/align the facial video following the VoxCeleb2 preprocessing method. For the metadata, please use the following field descriptions to build the JSON file yourself.
- `file`: str. The file path.
- `n_fakes`: int. The number of fake segments. It should be 1 for classification.
- `fake_periods`: list[list[float]]. A list of fake segments' start and end timestamp pairs. For example, [[2.5, 2.78], [3.1, 3.23]] means there are 2 fake segments: one from 2.5 seconds to 2.78 seconds, the other from 3.1 seconds to 3.23 seconds. The length of this list should match `n_fakes`. It should be [[0, duration]] for classification, as we consider the whole video to be a fake segment from the beginning to the end.
- `duration`: float. The length of the video in seconds.
- `original`: str. The original real video of a fake video. Set to null for a real video.
- `modify_video`: bool. Whether the visual modality is modified.
- `modify_audio`: bool. Whether the audio modality is modified.
- `split`: The train/dev/test split.
- `video_frames`: The number of frames for this sample.
- `audio_channels`: The number of audio channels: 1 for mono and 2 for stereo.
- `audio_frames`: The length of the audio waveform array.
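Putting the fields together, here is a minimal sketch of building one metadata entry for single-video classification. The helper name, file paths, and the choice to mark both modalities as modified for a fake video are my own assumptions for illustration, not part of the repo:

```python
import json

def build_classification_entry(file_path, duration, video_frames,
                               audio_frames, fake, original=None,
                               split="test", audio_channels=1):
    """Hypothetical helper: build one metadata entry following the
    field descriptions above. For classification, a fake video is
    treated as a single fake segment spanning the whole duration."""
    return {
        "file": file_path,
        "n_fakes": 1 if fake else 0,
        "fake_periods": [[0, duration]] if fake else [],
        "duration": duration,
        "original": original,       # null (None) for a real video
        "modify_video": fake,       # assumption: both modalities modified
        "modify_audio": fake,
        "split": split,
        "video_frames": video_frames,
        "audio_channels": audio_channels,
        "audio_frames": audio_frames,
    }

# Example values for a hypothetical 4.2 s clip at 25 fps / 16 kHz audio.
entry = build_classification_entry(
    "test/fake_000.mp4", duration=4.2, video_frames=105,
    audio_frames=67200, fake=True, original="test/real_000.mp4")

with open("metadata.json", "w") as f:
    json.dump([entry], f, indent=2)
```

For a real video, pass `fake=False` so that `n_fakes` is 0, `fake_periods` is empty, and `original` stays null.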
I need help with how to preprocess the video and generate the meta file, so I can use the batfd model for single-video inference and perform the classification.
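For the audio-side fields (`audio_channels`, `audio_frames`, `duration`), the standard-library `wave` module is enough once the audio track has been extracted from the video. A self-contained sketch, with a synthetic 1-second 16 kHz mono file standing in for the real extracted track:

```python
import wave

SR = 16000  # assumption: 16 kHz mono audio

# Create a tiny synthetic wav (1 second of silence) as a stand-in
# for the audio track you would extract from the video.
with wave.open("sample.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)        # 16-bit samples
    w.setframerate(SR)
    w.writeframes(b"\x00\x00" * SR)

# Read back the properties the metadata JSON needs.
with wave.open("sample.wav", "rb") as w:
    audio_channels = w.getnchannels()            # 1 = mono, 2 = stereo
    audio_frames = w.getnframes()                # length of the waveform array
    duration = audio_frames / w.getframerate()   # seconds
```

For `video_frames`, the analogous properties can be read from the video file itself (for instance via OpenCV's `cv2.CAP_PROP_FRAME_COUNT` and `cv2.CAP_PROP_FPS` on a `cv2.VideoCapture`), but whether those counts match what the model expects after VoxCeleb2-style cropping is something to verify against the repo's own preprocessing.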