clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition
MIT License

Sorry, I have two problems #2

Closed wuqiangch closed 4 years ago

wuqiangch commented 4 years ago

@joonson Hi, I have two questions below:

  1. Do you run it on multiple GPUs? I tried, but I get a segmentation fault. I can run it on a single GPU, but that is too slow.
  2. gsize_dict = {'proto':args.nSpeakers, 'triplet':2, 'contrastive':2, 'softmax':1, 'amsoftmax':1, 'aamsoftmax':1, 'ge2e':args.nSpeakers, 'angleproto':args.nSpeakers}. In the paper, nSpeakers = 2 gives the best result for 'proto'/'angleproto', and nSpeakers = 3 for 'ge2e'. Does the optimal nSpeakers change with the dataset?
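For reference, the gsize_dict quoted above maps each loss function to the number of utterances drawn per speaker in a minibatch. A minimal plain-Python sketch of that lookup (the helper name and the batching comments are invented for illustration; only the dictionary itself comes from the repo):

```python
def group_size(loss_name, n_speakers):
    """Utterances drawn per speaker in a minibatch for a given loss.

    Classification losses (softmax variants) need only one utterance
    per speaker; pair-based losses need exactly two; metric-learning
    losses with support/query splits use the configurable nSpeakers.
    """
    gsize_dict = {
        'proto': n_speakers,        # support utterances + 1 query per class
        'triplet': 2,
        'contrastive': 2,
        'softmax': 1,
        'amsoftmax': 1,
        'aamsoftmax': 1,
        'ge2e': n_speakers,
        'angleproto': n_speakers,
    }
    return gsize_dict[loss_name]

# With nSpeakers = 2, 'proto' draws 2 utterances per speaker
# (1 support + 1 query), while 'triplet' always draws 2.
```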
joonson commented 4 years ago
  1. We were unable to replicate the same performance with PyTorch DataParallel, so it is not recommended.
  2. For 'proto' / 'angleproto', the best value of nSpeakers depends on the test scenario. For example, if there are 3 enrollment clips per speaker in the deployment scenario, nSpeakers = 3 + 1 = 4 would usually work best. This may also depend on the data.
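The rule of thumb above (group size = number of enrollment clips + 1 query) can be illustrated with a toy prototypical scoring step. Everything below is a made-up sketch: the 2-D embeddings are arbitrary, and the cosine/prototype helpers are not part of the repo.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def prototype(embs):
    """Class prototype: the mean of the enrollment embeddings."""
    n = len(embs)
    return [sum(e[i] for e in embs) / n for i in range(len(embs[0]))]

# 3 enrollment clips per speaker -> train with gsize = 3 + 1
# (3 support utterances form the prototype, 1 query is scored against it).
enroll = [[1.0, 0.1], [0.9, 0.0], [1.1, -0.1]]
query = [1.0, 0.05]

proto = prototype(enroll)
score = cosine(proto, query)  # higher score = more likely the same speaker
```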
llearner commented 4 years ago

Did you resolve the "Segmentation fault" problem? I ran into the same issue.

joonson commented 4 years ago

I have not tried multi-GPU training since then, as single-GPU training worked fine.