google / uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
https://arxiv.org/abs/1810.04719
Apache License 2.0

Dataset #21

Closed by DenisSouth 5 years ago

DenisSouth commented 5 years ago

Which dataset should I use for training the network?

wq2012 commented 5 years ago

It depends on what you want to work on.

You can use any dataset that satisfies the definition of supervised clustering: you must be able to extract sequences of features and associate each feature with a ground-truth label. Features can be speaker embeddings, face embeddings, etc.

One example is NIST SRE 2000 CALLHOME for speaker diarization. For any dataset, however, you need to process it yourself to extract the features and align them with the labels. This library only provides the API for the clustering part.
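To make the "align the features with labels" step concrete, here is a rough sketch using only NumPy. It assumes you already have one embedding per fixed-length window plus ground-truth segments as `(start, end, speaker)` tuples. The helper name `align_features_with_labels`, the window timing, and the toy data are all hypothetical and are not part of this library. The output layout (a 2-D float array plus a same-length list of string labels) is my assumption about a convenient format, so check the README for the exact input format the training API expects.

```python
import numpy as np


def align_features_with_labels(features, window_times, segments):
    """Pair each feature window with the speaker label of its segment.

    features: (N, D) float array, one embedding per window (hypothetical).
    window_times: (N,) array of window center times in seconds.
    segments: list of (start_sec, end_sec, speaker_label) ground-truth tuples.
    Windows that fall outside every labeled segment are dropped.
    Returns (kept_features, labels) with matching lengths.
    """
    kept_rows, labels = [], []
    for row, t in zip(features, window_times):
        for start, end, speaker in segments:
            if start <= t < end:
                kept_rows.append(row)
                labels.append(speaker)
                break  # assumes non-overlapping segments; first match wins
    if not kept_rows:
        return np.empty((0, features.shape[1])), []
    return np.stack(kept_rows), labels


# Toy data: six 4-dimensional "embeddings" with centers 0.5 s apart.
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4))
window_times = np.array([0.25, 0.75, 1.25, 1.75, 2.25, 2.75])
# Ground truth with an unlabeled gap between 2.0 s and 2.5 s.
segments = [(0.0, 1.0, "spk_A"), (1.0, 2.0, "spk_B"), (2.5, 3.0, "spk_A")]

sequence, cluster_ids = align_features_with_labels(
    features, window_times, segments)
print(sequence.shape, cluster_ids)
```

The window centered at 2.25 s falls in the unlabeled gap, so it is dropped and the remaining five windows keep their speaker labels in time order. Real data would also need a policy for overlapping speech, which this sketch ignores.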

More details are in the README.md file and the paper on arXiv.