clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition
MIT License

issues with using triplet as loss function to train the model #25

Closed pengcheng-tech closed 4 years ago

pengcheng-tech commented 4 years ago

Hi,

I am trying to train a model using the command below, but the program seems to get stuck. I am new to DNNs; could you please give me some hints about the parameters? I took the audio file entries for only the first 100 speakers from train.list and test.list and saved them as train_list_100speakers.txt and veri_test_100speakers.txt, which are the lists used in the command below.

From the paper, I understood that I should set --nSpeakers to 2 and --batch_size to 100 to get the batch size of 200 used in the paper.

Below is the training command I used:

python ./trainSpeakerNet.py --model ResNetSE34L --encoder SAP --trainfunc triplet --optimizer adam --save_path data/exp3 --nSpeakers 2 --batch_size 100 --max_frames 200 --scale 30 --margin 0.3 --train_list ../voxceleb_data/train_list_100speakers.txt --test_list ../voxceleb_data/veri_test_100speakers.txt --train_path ../voxceleb_data/voxceleb2 --test_path ../voxceleb_data/voxceleb1 > nohup_triplet.txt 2>&1

However, I get many errors like the one below.

2020-05-28 11:52:45 24 Training ResNetSE34L with LR 0.000902...
Processing (400/490) Loss 0.247639 EER/T1 51.500% - 350.12 Hz Q:(0/10)
2020-05-28 11:52:47 LR 0.000902, TEER 51.50, TLOSS 0.247639

2020-05-28 11:52:47 25 Training ResNetSE34L with LR 0.000902...
Processing (400/499) Loss 0.259803 EER/T1 55.000% - 349.58 Hz Q:(0/10)
Exception in thread Thread-10:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.7/threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "/external/voxceleb/voxceleb_trainer/DatasetLoader.py", line 103, in dataLoaderThread
    feat.append(loadWAV(self.data_list[ij][ii], self.max_frames, evalmode=False));
IndexError: list index out of range

Could you please let me know where the index error comes from? I guess it is related to --batch_size, but I don't have a clue how to solve it.

Thanks in advance

jlian2 commented 4 years ago

I guess you should decrease the batch size, e.g. to 20?

joonson commented 4 years ago

> I guess you should decrease the batch size, e.g. to 20?

Yes, the number of classes (speakers) in the training list should be greater than the batch size.
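
For anyone hitting the same IndexError, a quick sanity check on the training list before launching training might look like the sketch below. It is not part of voxceleb_trainer: count_speakers and check_batch_size are hypothetical helper names, and it assumes each line of the list has the form "<speaker_id> <wav_path>". It also assumes that only speakers with at least as many utterances as --nSpeakers can fill a batch slot, which may not match the loader exactly.

```python
from collections import Counter


def count_speakers(list_lines):
    """Count utterances per speaker.

    Assumption: each non-empty line is '<speaker_id> <wav_path>'.
    """
    counts = Counter()
    for line in list_lines:
        parts = line.split()
        if parts:
            counts[parts[0]] += 1
    return counts


def check_batch_size(list_lines, batch_size, utts_per_speaker):
    """Return (ok, n_usable), where ok is True if the number of usable
    speakers is strictly greater than batch_size.

    Assumption: a speaker is 'usable' only if it has at least
    utts_per_speaker utterances (the value passed as --nSpeakers).
    """
    counts = count_speakers(list_lines)
    n_usable = sum(1 for n in counts.values() if n >= utts_per_speaker)
    return n_usable > batch_size, n_usable


if __name__ == "__main__":
    # Made-up list entries, for illustration only.
    lines = [
        "spkA spkA/x/00001.wav",
        "spkA spkA/x/00002.wav",
        "spkB spkB/y/00001.wav",
        "spkB spkB/y/00002.wav",
    ]
    ok, n_usable = check_batch_size(lines, batch_size=2, utts_per_speaker=2)
    print(f"usable speakers: {n_usable}, batch size OK: {ok}")
```

To run it on a real list, read the file with open(path).read().splitlines() and pass the same values used for --batch_size and --nSpeakers. With a 100-speaker list and --batch_size 100, this check would report that the batch size is too large.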