about 1/10 data in mouth_roi_train is lost

facebookresearch / VisualVoice

Audio-Visual Speech Separation with Cross-Modal Consistency

about 1/10 data in mouth_roi_train is lost #10

Closed by wxystudio 3 years ago

wxystudio commented 3 years ago

I assumed the number of files in mp4, audio, and mouth_roi would be identical. When I checked, there are 1092003 videos in mp4/train and audio/train, but 84730 of them are missing from mouth_roi/train: there is no .h5 file of the same name in mouth_roi/train. For example:

video_path: mp4/train/id04262/96JSsr9Q00k/00009.mp4
mouthroi_path: mouth_roi/train/id04262/96JSsr9Q00k/00009.h5
audio_path: audio/train/id04262/96JSsr9Q00k/00009.wav

video_path: mp4/train/id04262/PX8fGdzDlEs/00011.mp4
mouthroi_path: mouth_roi/train/id04262/PX8fGdzDlEs/00011.h5
audio_path: audio/train/id04262/PX8fGdzDlEs/00011.wav
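
A check like this could be scripted roughly as follows. This is only a minimal sketch, assuming the directory layout shown in the paths above (speaker id / video id / clip number under each root); the helper names `relative_stems` and `find_missing` are hypothetical and not part of the VisualVoice code.

```python
from pathlib import Path


def relative_stems(root, suffix):
    """Collect clip keys such as 'id04262/96JSsr9Q00k/00009' for files under root."""
    root = Path(root)
    return {
        p.relative_to(root).with_suffix("").as_posix()
        for p in root.rglob(f"*{suffix}")
    }


def find_missing(video_root, roi_root):
    """Return clips that have an .mp4 under video_root but no .h5 under roi_root."""
    videos = relative_stems(video_root, ".mp4")
    rois = relative_stems(roi_root, ".h5")
    return sorted(videos - rois)


if __name__ == "__main__":
    # Assumed roots, taken from the example paths in this comment.
    missing = find_missing("mp4/train", "mouth_roi/train")
    print(len(missing), "clips have no mouth ROI file")
```

Comparing relative paths without their extension avoids depending on the order in which the files are listed.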

wxystudio commented 3 years ago

Sorry, I re-checked and found that 285 clips are actually missing from the train set. The video and audio train sets contain 1092009 files each, the mouth ROI train set contains 1007279, and the mouth ROI seen_heard_test set contains 84445, which leaves 285 clips that can't be found anywhere.
Could you help me with this? Or do you just ignore them?
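
To separate the clips that simply live in another split from the ones that are truly absent, something like the sketch below could be used. It assumes the mouth ROI splits are sibling directories with the same internal layout (for example `mouth_roi/train` and `mouth_roi/seen_heard_test`); the function name `split_by_location` is hypothetical.

```python
from pathlib import Path


def split_by_location(missing_keys, roi_roots, suffix=".h5"):
    """Look up each clip key under every candidate root.

    Returns (found, unresolved): found maps a key to the first root that
    contains it, unresolved lists the keys that exist under none of the roots.
    """
    found = {}
    unresolved = []
    for key in missing_keys:
        for root in roi_roots:
            if (Path(root) / (key + suffix)).is_file():
                found[key] = str(root)
                break
        else:
            # No root contained this clip.
            unresolved.append(key)
    return found, unresolved
```

Called with the clip keys missing from the train split and `["mouth_roi/seen_heard_test"]` as the roots, the `unresolved` list would correspond to the clips that can't be found anywhere.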

wxystudio commented 3 years ago

My mistake, sorry.

dengyuanjie commented 2 years ago

I have run into the same problem. Could you share how you solved it? @wxystudio

dengyuanjie commented 2 years ago

> Sorry, I re-checked and found that 285 clips are actually missing from the train set. The video and audio train sets contain 1092009 files each, the mouth ROI train set contains 1007279, and the mouth ROI seen_heard_test set contains 84445, which leaves 285 clips that can't be found anywhere.
> Could you help me with this? Or do you just ignore them?

When reading the dataset, I found that VoxCeleb2 has 1128246 clips in total across its training and test data, while mouth_roi has 1127961 in total. Comparing the two, 285 clips are missing. This seems to be the same issue you described. How did you solve it? @wxystudio
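
For reference, the counts quoted in this thread reconcile to the same gap either way. The snippet below only restates that arithmetic; the variable names are illustrative and the figures are the ones reported in the comments above.

```python
# Counts reported in the thread.
video_train = 1_092_009          # video/audio clips in the train set
roi_train = 1_007_279            # mouth ROI files in the train split
roi_seen_heard_test = 84_445     # mouth ROI files in seen_heard_test

missing_from_train = video_train - roi_train              # 84730
unaccounted = missing_from_train - roi_seen_heard_test    # 285

voxceleb2_total = 1_128_246      # train + test clips in VoxCeleb2
roi_total = 1_127_961            # all mouth ROI files
gap = voxceleb2_total - roi_total                         # 285

print(missing_from_train, unaccounted, gap)
```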