google / uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
https://arxiv.org/abs/1810.04719
Apache License 2.0

[Question] Would it be possible to publish trained models #39

Closed by rnunziata 5 years ago

rnunziata commented 5 years ago

Would it be possible to publish trained models for other languages, such as Japanese and Mandarin?

Would such models be useful for general speaker isolation on sequences of unknown speakers that are not in the training set?

How robust is this model to ambient background sound, such as traffic noise?

wq2012 commented 5 years ago

It's impossible for us to publish pre-trained UIS-RNN models, because the speaker recognition models that we use are not open source.

Whether a model is robust to noise depends on whether you include such noise in the training data when you train your speaker recognition model and your UIS-RNN model.
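One common way to include such noise is to mix a noise recording into each clean training utterance at a chosen signal-to-noise ratio before extracting features. The snippet below is only a minimal sketch of that idea and is not part of uis-rnn or of the pipeline described in the paper. The function name `mix_at_snr` and the synthetic tone and white-noise signals are hypothetical placeholders for real speech and traffic-noise recordings.

```python
import math
import random


def mix_at_snr(clean, noise, snr_db):
    """Return clean plus scaled noise so the mixture has the requested SNR in dB."""
    if not clean:
        raise ValueError("clean signal is empty")
    if not noise:
        raise ValueError("noise signal is empty")
    # Loop the noise clip so it covers the whole clean signal.
    repeats = -(-len(clean) // len(noise))  # ceiling division
    noise = (noise * repeats)[:len(clean)]
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(x * x for x in noise) / len(noise)
    if noise_power == 0.0:
        # Silent noise clip: nothing to add.
        return list(clean)
    # Solve clean_power / (gain**2 * noise_power) == 10 ** (snr_db / 10) for gain.
    gain = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


# Hypothetical stand-ins for real recordings: a 440 Hz tone and white noise.
rng = random.Random(0)
sample_rate = 16000
clean = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
noisy = mix_at_snr(clean, noise, snr_db=10.0)
```

The noisy waveforms would then go through the same feature extraction as the clean ones, so that both the speaker recognition model and the UIS-RNN model see noisy examples during training. Which SNR values and noise types to use is a choice that depends on the target environment.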

taylorlu commented 5 years ago

You can refer to my project, Speaker-Diarization, which integrates the vgg-speaker-recognition algorithm.

wq2012 commented 5 years ago

@taylorlu Thanks for sharing your work. Consider sending a PR to add it to https://github.com/wq2012/awesome-diarization