Closed (chrisspen closed this 5 years ago)
https://www.youtube.com/watch?v=pGkqwRPzx9U&t=23m19s
You need to find another library to compute speaker embeddings.
@chrisspen Have you found out how to prepare training data for uis-rnn, for example what data format is required?
Describe the question
How do you take raw audio files annotated with speaker labels and convert them into a form that can be used by uis-rnn? There's no documentation for creating your own training data from raw audio files. The toy training and test data appear to be numpy arrays, but there's no description of what these arrays represent.
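For anyone landing here with the same question, here is a rough sketch of one way the conversion could look. It assumes what I understand the README to describe: a 2-dim float array of concatenated per-window embeddings, plus a 1-dim array of string labels of the same length. Everything named below (`placeholder_embedding`, `utterance_to_training_arrays`, the 0.5 s window, the 4-dim vector, the `utt0_A` label scheme) is made up for illustration. In particular, the placeholder is not a real speaker embedding; you would swap in a d-vector model from another library.

```python
import numpy as np

EMBED_DIM = 4  # hypothetical size; real speaker embeddings are much larger


def placeholder_embedding(window, dim=EMBED_DIM):
    """Stand-in only: RMS energy of `dim` equal chunks of the window.

    Replace this with a real speaker-embedding (d-vector) model.
    """
    chunks = np.array_split(window.astype(np.float64), dim)
    return np.array([np.sqrt(np.mean(c ** 2)) for c in chunks])


def utterance_to_training_arrays(audio, sample_rate, segments, utt_id,
                                 window_sec=0.5):
    """Turn one labeled utterance into (sequence, cluster_id) arrays.

    segments: list of (start_sec, end_sec, speaker_label) annotations.
    Each non-overlapping window inside a segment yields one embedding row
    and one label string prefixed with the utterance id.
    """
    win = int(window_sec * sample_rate)
    rows, labels = [], []
    for start, end, speaker in segments:
        first = int(start * sample_rate)
        last = int(end * sample_rate)
        for pos in range(first, last - win + 1, win):
            rows.append(placeholder_embedding(audio[pos:pos + win]))
            labels.append(f"{utt_id}_{speaker}")
    return np.array(rows, dtype=float), np.array(labels)


# Toy input: 3 seconds of noise standing in for a real recording.
sample_rate = 16000
rng = np.random.default_rng(0)
audio = rng.standard_normal(3 * sample_rate)
segments = [(0.0, 1.0, "A"), (1.0, 2.5, "B"), (2.5, 3.0, "A")]

train_sequence, train_cluster_id = utterance_to_training_arrays(
    audio, sample_rate, segments, utt_id="utt0")
print(train_sequence.shape)  # one row per 0.5 s window
print(train_cluster_id)
```

With several utterances you would build one pair of arrays per utterance and join them with `np.concatenate`, keeping the utterance prefix in each label so that speaker "A" in one recording is not confused with "A" in another. Please check the exact expected shapes and label conventions against the README before training, since this is only my reading of it.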
My background
Have I read the README.md file?
Have I searched for similar questions from closed issues?
Have I tried to find the answers in the paper Fully Supervised Speaker Diarization?
Have I tried to find the answers in the reference Speaker Diarization with LSTM?
Have I tried to find the answers in the reference Generalized End-to-End Loss for Speaker Verification?