wxystudio closed this issue 3 years ago
Sorry, I re-checked and found that 285 samples are missing from the train set. The total number of samples in the video and audio train set is 1092009, the mouth train set has 1007279 samples, and the mouth seen_heard_test set has 84445, which leaves 285 samples that cannot be found anywhere.
Could you help me with this? Or do you just ignore them?
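For anyone checking the numbers, the gap can be reproduced with simple arithmetic. This is only a sketch using the counts quoted above; the variable names are made up for illustration.

```python
# Counts as reported in this thread (not re-measured here).
total_train = 1092009            # video/audio train samples
mouth_train = 1007279            # mouth ROI train samples
mouth_seen_heard_test = 84445    # mouth ROI seen_heard_test samples

# Samples that appear in neither mouth ROI split.
unaccounted = total_train - (mouth_train + mouth_seen_heard_test)
print(unaccounted)  # 285
```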
It was my own mistake, sorry.
I have run into the same problem. Could you share how you solved it? @wxystudio
While reading the dataset, I found that the VoxCeleb2 training and test data total 1128246 samples, while mouth_roi contains 1127961. Comparing the two, 285 samples are missing. This looks like the same problem you described. How did you solve it? @wxystudio
I think the number of files in the mp4, audio, and mouth_roi train sets should be identical. When I checked, there are 1092003 videos in mp4/train and audio/train, but 84730 of them are missing from mouth_roi/train: there is no .h5 file of the same name in mouth_roi/train, for example:
video_path: mp4/train/id04262/96JSsr9Q00k/00009.mp4
mouthroi_path: mouth_roi/train/id04262/96JSsr9Q00k/00009.h5
audio_path: audio/train/id04262/96JSsr9Q00k/00009.wav

video_path: mp4/train/id04262/PX8fGdzDlEs/00011.mp4
mouthroi_path: mouth_roi/train/id04262/PX8fGdzDlEs/00011.h5
audio_path: audio/train/id04262/PX8fGdzDlEs/00011.wav
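
In case it helps others hitting this, here is a rough sketch of how one might list the videos that have no matching .h5 file. It assumes the directory layout shown in the paths above (mp4/train/<id>/<video>/<clip>.mp4 mirrored under mouth_roi/train with a .h5 extension). The function names `relative_stems` and `find_missing` are hypothetical helpers, not part of the repository.

```python
from pathlib import Path


def relative_stems(root, suffix):
    """Collect paths under root ending in suffix, relative to root, without the extension."""
    root = Path(root)
    return {
        p.relative_to(root).with_suffix("").as_posix()
        for p in root.rglob(f"*{suffix}")
    }


def find_missing(video_root, mouthroi_root):
    """Return sorted clip stems that have an .mp4 but no same-named .h5."""
    videos = relative_stems(video_root, ".mp4")
    rois = relative_stems(mouthroi_root, ".h5")
    return sorted(videos - rois)
```

Calling `find_missing("mp4/train", "mouth_roi/train")` would then return stems such as `id04262/96JSsr9Q00k/00009` for clips without a mouth ROI file. These can be skipped when building the training list, or checked against the other mouth_roi splits.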