Flagging multi-speaker video from arbitrary dataset

hyunkeup / DeepFake-Video-Detection

DeepFake Detector

MIT License


Flagging multi-speaker video from arbitrary dataset #10

Open aromanusc opened 5 months ago

aromanusc commented 5 months ago

The task is to build a utility script that will identify videos from a dataset that contain more than one active face/speaker. The output from the script should be a .csv file containing:

, The file will then be used by our dataloaders to discard videos with more than one face/speaker in the scene. At least for the first model training we are doing, only one face should be present. The script should be modular enough that we can batch-process any given dataset.
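A rough sketch of how such a batch utility could be structured, assuming the face/speaker detector is passed in as a plain callable so it can be swapped per dataset. The names `iter_videos`, `flag_multi_speaker`, and the CSV column names `video`, `speaker_count`, and `multi_speaker` are hypothetical, chosen only for illustration; no real detector is included here.

```python
import csv
from pathlib import Path
from typing import Callable, Iterator

# Assumed set of video extensions; adjust for the dataset at hand.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}


def iter_videos(dataset_root: Path) -> Iterator[Path]:
    """Yield every video file under dataset_root, in sorted order."""
    for path in sorted(dataset_root.rglob("*")):
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS:
            yield path


def flag_multi_speaker(
    dataset_root: Path,
    count_speakers: Callable[[Path], int],
    output_csv: Path,
) -> int:
    """Write one CSV row per video and return how many videos were flagged.

    count_speakers is a hypothetical detector hook: it takes a video path
    and returns the number of active faces/speakers found in it.
    """
    rows = []
    for video in iter_videos(dataset_root):
        n = count_speakers(video)
        relative_name = video.relative_to(dataset_root).as_posix()
        rows.append((relative_name, n, int(n > 1)))

    with open(output_csv, "w", newline="") as fh:
        writer = csv.writer(fh)
        # Hypothetical column names, not taken from the issue.
        writer.writerow(["video", "speaker_count", "multi_speaker"])
        writer.writerows(rows)

    return sum(flag for _, _, flag in rows)
```

Keeping the detector behind a callable means the traversal and CSV writing can be tested with a stub, and a real face detector can be plugged in later without touching the batch logic.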

cy3021561 commented 5 months ago

Steve found out that the Deepfake Detection Challenge Dataset already ships with a JSON file of labels. We thought it would be good to just add another key-value (, ) pair to it.

Something like this:

Does that make sense? Alternatively, I could output a separate CSV.

hyunkeup commented 5 months ago

We already have metadata.json; @cy3021561 and I discussed adding the number of people to it.

aromanusc commented 5 months ago

Oh, I see! Good call, that is great. Sure, let's add speaker_count there to avoid having another metadata file floating around.
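A minimal sketch of that approach, assuming metadata.json maps each video file name to a dict of fields (the exact layout is an assumption here, as are the helper names `add_speaker_counts`, `single_speaker_videos`, and `update_metadata_file`). Only the `speaker_count` key comes from the discussion above.

```python
import json
from pathlib import Path
from typing import Dict, List


def add_speaker_counts(
    metadata: Dict[str, dict], counts: Dict[str, int]
) -> Dict[str, dict]:
    """Return a copy of metadata with speaker_count added where known."""
    updated = {}
    for name, entry in metadata.items():
        new_entry = dict(entry)  # shallow copy; the input is left untouched
        if name in counts:
            new_entry["speaker_count"] = counts[name]
        updated[name] = new_entry
    return updated


def single_speaker_videos(metadata: Dict[str, dict]) -> List[str]:
    """List videos whose speaker_count is exactly 1.

    Entries with no speaker_count are excluded, since they have not
    been checked yet.
    """
    return [
        name
        for name, entry in metadata.items()
        if entry.get("speaker_count") == 1
    ]


def update_metadata_file(path: Path, counts: Dict[str, int]) -> None:
    """Rewrite the JSON file at path in place with speaker_count added."""
    metadata = json.loads(path.read_text())
    path.write_text(json.dumps(add_speaker_counts(metadata, counts), indent=2))
```

A dataloader could then call `single_speaker_videos` on the loaded metadata and skip everything else, which keeps the filtering rule in one place.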