google / uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
https://arxiv.org/abs/1810.04719
Apache License 2.0

[Question] Testing on untrained speakers #40

Closed Aurora11111 closed 5 years ago

Aurora11111 commented 5 years ago

Describe the question

When I tested on my own datasets, whose speakers were not part of the training data, the results were really bad. Can you give me some suggestions?

My background

Have I read the README.md file?

Have I searched for similar questions from closed issues?

Have I tried to find the answers in the paper Fully Supervised Speaker Diarization?

Have I tried to find the answers in the reference Speaker Diarization with LSTM?

Have I tried to find the answers in the reference Generalized End-to-End Loss for Speaker Verification?

MuruganR96 commented 5 years ago

PyTorch TIMIT d-vector embeddings for my own datasets are not giving good results with the UIS-RNN architecture. How can we improve the accuracy? Any suggestions for me as well?

@Aurora11111 sir, I am also facing this issue. :+1:

wq2012 commented 5 years ago

For training the d-vector model, I would suggest using all possible speaker-labelled datasets you could find, including:

The first 4 add up to almost 10K different speakers. Although that is still much smaller than what we use internally (100K+ speakers), it would produce reasonably good results. (In our experience, a d-vector model trained with fewer than 3~5K speakers usually performs badly.)
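As a rough illustration of the speaker-count rule of thumb above, here is a minimal sketch that counts distinct speakers across several combined datasets before training a d-vector model. Everything in it is an assumption made for illustration: the function names, the `MIN_SPEAKERS` threshold (taken from the lower end of the 3~5K range), and the idea of prefixing speaker IDs with the dataset name, which assumes the datasets do not share speakers.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumption: lower end of the "3~5K speakers" rule of thumb quoted above.
MIN_SPEAKERS = 3000


def count_unique_speakers(datasets: Dict[str, Iterable[str]]) -> int:
    """Count distinct speakers across datasets.

    Speaker IDs are namespaced by dataset name, so "s1" in one corpus and
    "s1" in another are treated as two different people (an assumption).
    """
    unique: Set[Tuple[str, str]] = set()
    for dataset_name, speaker_ids in datasets.items():
        for speaker_id in speaker_ids:
            unique.add((dataset_name, speaker_id))
    return len(unique)


def has_enough_speakers(datasets: Dict[str, Iterable[str]],
                        min_speakers: int = MIN_SPEAKERS) -> bool:
    """Return True if the combined datasets reach the speaker threshold."""
    return count_unique_speakers(datasets) >= min_speakers


if __name__ == "__main__":
    # Hypothetical toy data: one speaker label per utterance.
    toy = {
        "corpus_a": ["s1", "s2", "s1"],
        "corpus_b": ["s1"],
    }
    print(count_unique_speakers(toy))
    print(has_enough_speakers(toy))
```

If some of your datasets do share speakers, the namespacing would overcount, so you would need a shared speaker ID scheme instead.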

Then for training UIS-RNN, you could consider using:

The last one should not cover the same speakers as your testing set, but should have the same acoustic environment and dialogue style as your testing set. UIS-RNN is for supervised diarization, meaning you cannot expect it to work when trained in one domain and tested in a totally different domain: https://www.youtube.com/watch?v=pGkqwRPzx9U&t=7m56s
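To check the "no shared speakers" condition mentioned above, a small sanity check along these lines may help. It is only a sketch: the helper name `overlapping_speakers` is hypothetical, and it assumes your labels are stored as one sequence of speaker IDs per recording, with IDs that are consistent across recordings. It does not check the acoustic environment or dialogue style, which you would have to judge from how the data was collected.

```python
from typing import Iterable, List


def overlapping_speakers(train_labels: Iterable[Iterable[str]],
                         test_labels: Iterable[Iterable[str]]) -> List[str]:
    """Return the sorted speaker IDs that occur in both training and test labels.

    Each argument is a collection of per-recording label sequences
    (an assumed layout, one speaker ID per segment).
    """
    train_speakers = {spk for sequence in train_labels for spk in sequence}
    test_speakers = {spk for sequence in test_labels for spk in sequence}
    return sorted(train_speakers & test_speakers)


if __name__ == "__main__":
    # Hypothetical toy labels for two training recordings and one test recording.
    train = [["spk1", "spk2", "spk1"], ["spk3"]]
    test = [["spk4", "spk2"]]
    leaked = overlapping_speakers(train, test)
    if leaked:
        print("Speakers present in both sets:", leaked)
    else:
        print("Training and test speakers are disjoint.")
```

An empty result means the two sets are speaker-disjoint under that labelling assumption.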

Aurora11111 commented 5 years ago

@wq2012 Thanks for your suggestions, I will give it a try.