taylorlu / Speaker-Diarization

speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
Apache License 2.0
455 stars, 124 forks

Speaker-Diarization for 2 person conversation #55

Open ArvindSharma18 opened 2 years ago

ArvindSharma18 commented 2 years ago

@taylorlu, thank you for your work on this repo! I have a question: when I run speaker diarization on a .wav file with 2 speakers, the output contains 4 different speakers. Is there any way to change the number of speakers without starting from scratch? Any help would be appreciated.

taylorlu commented 2 years ago

No, the model does not support specifying the number of speakers. The result depends entirely on the trained models (speaker and uisrnn) you provide.

ArvindSharma18 commented 2 years ago

Thanks for your quick response.

ArvindSharma18 commented 2 years ago

Hi, I tried to generate my own embeddings using generate_embeddings.py with TensorFlow 1.x, PyTorch 1.3, and Keras 2.2.4, but I cannot run it because of compatibility issues. Could you tell me which versions are suitable for running the code?
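
When reporting compatibility problems like this, it helps to list the exact versions installed. The following is a minimal sketch, not part of the repo. It assumes the packages were installed under the distribution names `tensorflow`, `torch`, and `keras`, and the helper name `installed_versions` is hypothetical. On Python 3.7 and earlier it falls back to the `importlib_metadata` backport, which is assumed to be installed.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Assumption: the importlib_metadata backport is installed on older Pythons.
    import importlib_metadata as metadata


def installed_versions(packages):
    """Return a dict mapping each distribution name to its version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match your environment.
    for name, version in installed_versions(["tensorflow", "torch", "keras"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting the printed lines into the issue, along with the full error message, makes it easier for others to suggest a working combination.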