HarryVolek / PyTorch_Speaker_Verification

PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
BSD 3-Clause "New" or "Revised" License

How to train d-vector model for using on diarization with my own data? #67

Open mesut92 opened 4 years ago

mesut92 commented 4 years ago

Hi Harry; I want to use d-vectors for diarization with 8 kHz data. I have 9000 speakers and train on NIST data (around 400 GB), but my loss saturates around 5 at epoch 250 (should I train for more epochs?). I cannot get good enough diarization performance with this model. Do you have any suggestions? Best regards, thanks. Mesut
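For reference, this is how I understand the GE2E softmax loss I am training with, as a minimal PyTorch sketch based on the paper (it is not the exact code in this repo; `N`, `M`, `w`, `b` are placeholders for the batch layout and the learnable scaling parameters):

```python
import torch
import torch.nn.functional as F

def ge2e_softmax_loss(embeddings, w, b):
    """GE2E softmax loss from Wan et al. (2018).

    embeddings: (N, M, D) tensor of L2-normalized d-vectors,
                N speakers per batch, M utterances per speaker.
    w, b:       learnable scalars (w is kept positive in the paper).
    """
    N, M, _ = embeddings.shape

    # Speaker centroids over all M utterances: (N, D)
    centroids = embeddings.mean(dim=1)
    # Centroid of each speaker excluding the utterance itself,
    # used when comparing an utterance to its own speaker: (N, M, D)
    centroids_excl = (embeddings.sum(dim=1, keepdim=True) - embeddings) / (M - 1)

    # Cosine similarity of every utterance to every centroid: (N, M, N)
    sim = F.cosine_similarity(
        embeddings.unsqueeze(2),              # (N, M, 1, D)
        centroids.unsqueeze(0).unsqueeze(0),  # (1, 1, N, D)
        dim=-1,
    )
    # For the "own speaker" entries, use the exclusive centroid instead.
    same = F.cosine_similarity(embeddings, centroids_excl, dim=-1)  # (N, M)
    own = torch.eye(N, dtype=torch.bool, device=sim.device)
    own = own.unsqueeze(1).expand(N, M, N)
    sim = torch.where(own, same.unsqueeze(2).expand(N, M, N), sim)

    # Scaled similarity matrix S_{ji,k} = w * cos(e_ji, c_k) + b
    sim = w * sim + b

    # Softmax loss: -S_{ji,j} + log sum_k exp(S_{ji,k}), averaged here
    # (the paper sums over utterances, which only changes a constant factor).
    target = torch.arange(N, device=sim.device).repeat_interleave(M)
    return F.cross_entropy(sim.reshape(N * M, N), target)
```

I mainly want to sanity-check what a loss value of around 5 means for my batch layout, since the floor of this loss depends on the number of speakers N per batch.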

Gaurav470 commented 3 years ago

Hi @mesut92, I trained my speaker verification model on 5K speakers, used it to extract d-vector embeddings, and trained the UIS-RNN model on those embeddings. Then I created embeddings for the wav files I needed predictions for, but I only get a single speaker for every wav file, even though I am sure they contain multiple speakers. Thanks in advance.
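Here is a minimal sketch of how I am wiring the d-vectors into UIS-RNN (assuming the google/uis-rnn package; the 256-dim embedding size and the random arrays are just placeholders to show the shapes I am passing). One thing I am double-checking on my side is that `predict` gets one d-vector per sliding-window segment rather than a single embedding per file, since a single observation can only ever produce a single label:

```python
import numpy as np
import uisrnn  # google/uis-rnn package

# Default model / training / inference arguments.
model_args, training_args, inference_args = uisrnn.parse_arguments()
model_args.observation_dim = 256  # must match the d-vector dimension

model = uisrnn.UISRNN(model_args)

# Placeholder data just to show the expected shapes: in practice these
# come from the speaker encoder, one d-vector per sliding-window segment.
train_sequence = np.random.rand(500, 256).astype(float)   # (segments, dim)
train_cluster_id = np.array(['spk_%d' % (i % 4) for i in range(500)])

model.fit(train_sequence, train_cluster_id, training_args)

# Inference on one wav file: again one d-vector per segment,
# not a single averaged embedding for the whole file.
test_sequence = np.random.rand(120, 256).astype(float)
predicted_labels = model.predict(test_sequence, inference_args)
print(predicted_labels)  # one predicted speaker id per segment
```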

asr-lord commented 3 years ago

> Hi @mesut92, I trained my speaker verification model on 5K speakers, used it to extract d-vector embeddings, and trained the UIS-RNN model on those embeddings. Then I created embeddings for the wav files I needed predictions for, but I only get a single speaker for every wav file, even though I am sure they contain multiple speakers. Thanks in advance.

I have the same problem; how did you solve it, @Gaurav470?