-
I just ran a quick test with a phone-call wave file, about 2.5 minutes long with only 2 speakers in total. However, with the pretrained model in this project, it returns 3 speakers, and many slices contai…
-
Hi, I was trying to generate embeddings from a very small subset of the VoxCeleb dataset (around 200 MB). The process created a training_data.npz file (around 2 GB), which was then loaded in the training proce…
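One common cause of this kind of blow-up is that embeddings are stored as float64 (NumPy's default) in an uncompressed archive. A hedged sketch of shrinking the file, assuming the demo's `train_sequence` / `train_cluster_id` key convention and hypothetical sizes (10k 256-dim d-vectors):

```python
import os
import tempfile
import numpy as np

# Hypothetical data: 10k 256-dim d-vectors; np.random.randn gives float64.
embeddings = np.random.randn(10_000, 256)
labels = np.zeros(10_000, dtype=np.int32)

out_path = os.path.join(tempfile.mkdtemp(), "training_data.npz")
# Cast to float32 (halves the raw size) and let savez_compressed deflate it;
# key names follow the demo's train_sequence / train_cluster_id convention.
np.savez_compressed(out_path,
                    train_sequence=embeddings.astype(np.float32),
                    train_cluster_id=labels)
```

Whether float32 is acceptable depends on the downstream model, but d-vector pipelines generally tolerate it.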
-
Hello, this code is written for the TIMIT data set; if I switch to my own audio data, it does not work. How can I use my own audio data as the data set? Thank you very much for your guidanc…
-
Hi, thanks for the awesome tool!
I am trying to do multiprocessing in a class that uses PyTorch.
This is what I am doing:
from pathos.multiprocessing import ProcessingPool as Pool
im…
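The usual pitfall here is that the stock pickler cannot serialize bound methods of a class holding a model (pathos works around this with dill). A stdlib-only sketch of the same pattern, keeping the worker at module level and the heavy model out of what gets sent to the pool (`embed_utterance` and `Embedder` are hypothetical stand-ins, not part of this repo):

```python
from concurrent.futures import ProcessPoolExecutor

def embed_utterance(samples):
    """Hypothetical worker standing in for a per-utterance embedding step.

    Defined at module level so the stock multiprocessing pickler can
    serialize it; bound methods of a class holding a PyTorch model
    generally cannot be pickled this way."""
    return sum(samples) / len(samples)

class Embedder:
    """Sketch of the class from the question, with the model kept out
    of the objects shipped to the worker processes."""
    def embed_all(self, utterances):
        with ProcessPoolExecutor(max_workers=2) as pool:
            return list(pool.map(embed_utterance, utterances))

results = Embedder().embed_all([[1.0, 2.0, 3.0], [4.0, 6.0]])
```

If the workers must run CUDA code, `torch.multiprocessing` with the `spawn` start method is usually needed on top of this; that is a separate constraint from the pickling one shown here.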
-
UIS-RNN is an online algorithm, but the current `predict()` API of this library is not.
If people want to deploy this library to a production environment for online use cases, an `online_predict()`…
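Until such an API exists, one stopgap is a wrapper that re-decodes a growing buffer on every new embedding. This is only a sketch: it assumes a model exposing `predict(sequence, args) -> labels` like the current batch API, and it is O(T) work per frame, whereas a true `online_predict()` would carry the beam state forward between calls:

```python
import numpy as np

class OnlineDiarizer:
    """Sketch of an online wrapper over a batch predict() API.

    `model` is assumed to expose predict(sequence, args) -> list of
    integer labels, like the current batch API. Each push() re-decodes
    the full buffer, so this is a stopgap, not a real online decoder."""

    def __init__(self, model, args):
        self.model = model
        self.args = args
        self.buffer = []

    def push(self, embedding):
        self.buffer.append(embedding)
        labels = self.model.predict(np.stack(self.buffer), self.args)
        return labels[-1]  # label assigned to the newest frame

# Usage with a hypothetical stand-in model (labels by sign of dim 0):
class _FakeModel:
    def predict(self, seq, args):
        return [int(x[0] > 0) for x in seq]

d = OnlineDiarizer(_FakeModel(), args=None)
out = [d.push(np.array([v, 0.0])) for v in (1.0, -1.0, 2.0)]
```

Note that re-decoding can relabel earlier frames on later calls, which a production online API would also need to expose or suppress.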
-
Transcription is good, but the diarisation speaker labels are sometimes wrong: speaker 0 is mapped as speaker 1 later in the file, and vice versa.
I am using Indian English conversation as the audio input. It's conversa…
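Worth noting: diarization labels are only defined up to a permutation, so "speaker 0" vs "speaker 1" is arbitrary and is normally resolved against a reference before scoring. A minimal sketch of that mapping step (brute force over permutations, which is fine for 2-3 speakers; the Hungarian algorithm scales better):

```python
from itertools import permutations

def best_label_mapping(ref, hyp):
    """Remap hyp's speaker labels by the permutation that best agrees
    with ref, since diarization labels are permutation-invariant.
    Brute force over label permutations; fine for a handful of speakers."""
    speakers = sorted(set(hyp))
    best, best_hits = list(hyp), -1
    for perm in permutations(speakers):
        mapping = dict(zip(speakers, perm))
        remapped = [mapping[s] for s in hyp]
        hits = sum(r == h for r, h in zip(ref, remapped))
        if hits > best_hits:
            best, best_hits = remapped, hits
    return best

ref = [0, 0, 1, 1, 0]
hyp = [1, 1, 0, 0, 1]          # same segmentation, labels swapped
print(best_label_mapping(ref, hyp))  # [0, 0, 1, 1, 0]
```

If the labels flip *mid-file* rather than globally, that is a genuine clustering error rather than a permutation artifact, and no relabeling fixes it.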
-
Hi Harry;
I want to use d-vectors for diarization with 8 kHz data. I have 9000 speakers. However, my loss saturates around 5 (at 250 epochs). Should I train for more epochs? I use NIST data (it's around…
-
Currently in this open source version, crp_alpha is passed in as an argument.
We need to add support for estimating it from the training data.
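For context, a rough sketch of what such an estimator could look like, assuming a plain Chinese restaurant process likelihood over the training cluster-id sequences and a simple grid search. This is an assumption about the estimator, not the repo's planned implementation (which may tie alpha to speaker-change points specifically):

```python
import math

def crp_log_likelihood(sequences, alpha):
    """Log-likelihood of integer label sequences under a plain CRP
    with concentration alpha: observation i joins an existing cluster c
    with prob n_c/(i+alpha), or a new cluster with prob alpha/(i+alpha)."""
    ll = 0.0
    for seq in sequences:
        counts = {}
        for i, label in enumerate(seq):
            if i > 0:
                if label in counts:
                    ll += math.log(counts[label] / (i + alpha))
                else:
                    ll += math.log(alpha / (i + alpha))
            counts[label] = counts.get(label, 0) + 1
    return ll

def estimate_crp_alpha(sequences, grid=None):
    """Hedged sketch: pick alpha by grid search on the CRP likelihood.
    Sequences dominated by few speakers push alpha down; sequences that
    keep introducing new speakers push it up."""
    grid = grid or [k / 10 for k in range(1, 101)]
    return max(grid, key=lambda a: crp_log_likelihood(sequences, a))
```

A closed-form or gradient-based MLE would be cleaner, but the grid search makes the behavior easy to sanity-check.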
-
How long should speakerDiarization.py take to run on a typical system with a GPU?
I ran the speakerDiarization.py example and it segmented the file correctly. However, it's very slow. It takes abou…