clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition
MIT License

Issue with training Common Voice Dataset #170

Closed: dimuthuanuraj closed this issue 8 months ago

dimuthuanuraj commented 1 year ago

Hi all,

I am retraining the model on the Common Voice dataset. Before doing so, I restructured the whole dataset to follow the VoxCeleb layout.

I have also recreated the training list from the CV corpus, as follows:

id30001 id30001/audio/common_voice_ta_26650298.wav
id30001 id30001/audio/common_voice_ta_26650300.wav
id30001 id30001/audio/common_voice_ta_26650301.wav
id30001 id30001/audio/common_voice_ta_26650302.wav
id30001 id30001/audio/common_voice_ta_26650308.wav
id30001 id30001/audio/common_voice_ta_26650309.wav
id30001 id30001/audio/common_voice_ta_26650310.wav
id30001 id30001/audio/common_voice_ta_26650311.wav
id30001 id30001/audio/common_voice_ta_26650313.wav

But when I start the training process, I get the following error. Can anyone help with that?

Error:

  File "trainSpeakerNet_119_T_CV_corpus.py", line 314, in <module>
    main()
  File "trainSpeakerNet_119_T_CV_corpus.py", line 310, in main
    main_worker(0, None, args)
  File "trainSpeakerNet_119_T_CV_corpus.py", line 254, in main_worker
    loss, traineer = trainer.train_network(train_loader, verbose=(args.gpu == 0))
  File "/mnt/ricproject4/anuraj_works/git_repo/voxtrainer_VI/voxceleb_trainer/SpeakerNet.py", line 92, in train_network
    for data, data_label in loader:
  File "/home/anuraj/anaconda3/envs/basej/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/anuraj/anaconda3/envs/basej/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "/home/anuraj/anaconda3/envs/basej/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/home/anuraj/anaconda3/envs/basej/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
soundfile.LibsndfileError:

dimuthuanuraj commented 1 year ago

I found two problems that caused the error above:

  1. Some WAV files were missing.
  2. The first and second columns of the txt file were separated by a tab (\t); they should be separated by a single space (" ").
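
Both problems can be caught before training starts. The following is a minimal sketch, not part of voxceleb_trainer: the function name check_train_list and its arguments are hypothetical, and it assumes each list line has the form "speaker_id relative/path.wav" with a single space between the columns, and that paths are relative to the dataset root and contain no spaces.

```python
import os


def check_train_list(list_path, data_root):
    """Return (line_number, message) tuples for problems in a training list.

    Assumed format (hypothetical helper, based on the list shown above):
    "<speaker_id> <relative/path.wav>", separated by a single space.
    """
    problems = []
    with open(list_path, encoding="utf-8") as f:
        for lineno, raw in enumerate(f, start=1):
            line = raw.rstrip("\r\n")
            if not line.strip():
                continue  # skip blank lines
            if "\t" in line:
                problems.append(
                    (lineno, "tab separator found; expected a single space")
                )
            # Split on any whitespace so the file check still runs
            # even when the separator is wrong.
            fields = line.split()
            if len(fields) != 2:
                problems.append(
                    (lineno, f"expected 2 columns, found {len(fields)}")
                )
                continue
            wav_path = os.path.join(data_root, fields[1])
            if not os.path.isfile(wav_path):
                problems.append((lineno, f"missing file: {fields[1]}"))
    return problems
```

Calling check_train_list("train_list.txt", "/path/to/dataset") (both paths are placeholders) returns an empty list when every line is space-separated and every listed file exists; otherwise each entry names the offending line number and the reason.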

Thank you all.