google / uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
https://arxiv.org/abs/1810.04719
Apache License 2.0
1.56k stars, 319 forks

How to convert audio data into test data of algorithm for testing #46

Closed zyc1310517843 closed 5 years ago

zyc1310517843 commented 5 years ago

Describe the question

My background

Have I read the README.md file?

Have I searched for similar questions from closed issues?

Have I tried to find the answers in the paper Fully Supervised Speaker Diarization?

Hello, I have read README.md. I want to convert my own audio data into the training and test data required by the model. How can I do this? I have also tried the third-party methods listed in README.md, but they are written for specific datasets such as TIMIT and do not run successfully on our own audio data. Thank you very much for your guidance.

wq2012 commented 5 years ago

You can use any continuous speaker embedding technique to do that.

I listed a few here: https://github.com/wq2012/awesome-diarization#speaker-embedding

Not all of them support continuous embeddings, so you will need to do some research on your own here.
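
To make this more concrete, here is a minimal sketch of how one labeled recording could be turned into a sequence of window-level embeddings plus a matching list of speaker labels. It assumes you already have speaker-turn annotations as (start, end, speaker) tuples. The `embed` callable is a hypothetical placeholder for whichever speaker embedding model you choose, and the helper names, window length, and hop size are illustrative choices, not part of uis-rnn.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# One annotated speaker turn: (start_sec, end_sec, speaker_label).
Segment = Tuple[float, float, str]


def sliding_windows(duration: float, window: float,
                    hop: float) -> List[Tuple[float, float]]:
    """Return (start, end) times of fixed-length windows that fit in the audio."""
    windows = []
    index = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while index * hop + window <= duration + 1e-9:
        start = index * hop
        windows.append((start, start + window))
        index += 1
    return windows


def dominant_speaker(start: float, end: float,
                     segments: Sequence[Segment]) -> Optional[str]:
    """Return the speaker with the most overlap, or None for non-speech."""
    overlap = {}
    for seg_start, seg_end, speaker in segments:
        shared = min(end, seg_end) - max(start, seg_start)
        if shared > 0:
            overlap[speaker] = overlap.get(speaker, 0.0) + shared
    if not overlap:
        return None
    return max(overlap, key=overlap.get)


def build_sequence(utterance_id: str, duration: float,
                   segments: Sequence[Segment],
                   embed: Callable[[float, float], List[float]],
                   window: float = 1.0, hop: float = 0.5):
    """Build (embeddings, labels) for one recording.

    `embed(start, end)` is a HYPOTHETICAL stand-in for a real speaker
    embedding model applied to the audio between start and end.
    """
    sequence, cluster_ids = [], []
    for start, end in sliding_windows(duration, window, hop):
        speaker = dominant_speaker(start, end, segments)
        if speaker is None:
            continue  # skip windows that contain no annotated speech
        sequence.append(embed(start, end))
        # Prefix with the utterance ID so labels stay unique across recordings.
        cluster_ids.append(f"{utterance_id}_{speaker}")
    return sequence, cluster_ids


def fake_embed(start: float, end: float) -> List[float]:
    """Placeholder "embedding" so the sketch runs without a model."""
    return [start, end]


turns = [(0.0, 1.2, "A"), (1.2, 3.0, "B")]
sequence, cluster_ids = build_sequence("utt1", 3.0, turns, fake_embed)
print(len(sequence), cluster_ids)
```

For real use, replace `fake_embed` with a call into your embedding model and convert the result with `numpy.array(sequence, dtype=float)` before training. As far as I understand the README, `fit()` takes a 2-D float array of observations and a 1-D list of string labels, and labels should be distinct across utterances, which is why the utterance ID is used as a prefix here.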