-
My task combines both speaker diarization and speaker identification.
Since speaker embeddings are extracted during diarization anyway, it would be fantastic if the user could extract speaker embed…
-
Does anyone have an example Python script that uses one of the x-vector extraction models developed here to extract embeddings? I've gone through some of the repo and have not found any such thing.
…
-
> More precisely, we use activations from the last layer of neural network as speaker embeddings. We aggregate the sigmoid outputs by summing all outputs class-wise over the whole audio excerpt to obt…
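
The aggregation described in the quote above can be sketched in a few lines. This is a minimal illustration that covers only the class-wise summation step. It assumes the network yields one vector of pre-sigmoid logits per frame. The names `sigmoid`, `aggregate_embedding`, and `frames` are hypothetical and do not come from the quoted work, and any later processing (the quote is cut off) is not shown.

```python
import math


def sigmoid(x):
    """Logistic function applied to a single logit."""
    return 1.0 / (1.0 + math.exp(-x))


def aggregate_embedding(frame_logits):
    """Sum sigmoid activations class-wise over all frames of an excerpt.

    frame_logits: list of frames, each a list of per-class logits
    (assumed layout, for illustration only).
    """
    if not frame_logits:
        raise ValueError("need at least one frame")
    num_classes = len(frame_logits[0])
    totals = [0.0] * num_classes
    for frame in frame_logits:
        if len(frame) != num_classes:
            raise ValueError("all frames must have the same number of classes")
        for k, logit in enumerate(frame):
            totals[k] += sigmoid(logit)
    return totals


# Toy input: two frames, two classes.
frames = [[0.0, 2.0], [0.0, -2.0]]
embedding = aggregate_embedding(frames)
print(embedding)
```

The result has one entry per output class, so its dimensionality equals the number of classes in the last layer, regardless of how long the excerpt is.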
-
I have a WhisperX Python script for transcribing meetings, but the speaker diarization for German is really bad, unfortunately.
After some research I came across the fine-tuned German segmentation…
-
Hi, I have the same question as https://github.com/microsoft/SpeechT5/issues/16#issuecomment-1516257038. My training dataset is Chinese, so can I use speechbrain/spkrec-xvect-voxceleb to extract speak…
-
I want to train a new model on another dataset, but I can't find a way to generate a new spk2info.dict.
-
Hello,
Thank you for your work on WavLM.
I am trying to reproduce the results, but I am having some difficulties.
First of all, I don't understand exactly the difference between scores displayed in differen…
-
Adding here some implementation improvements that I need to make, courtesy of comments from @r9y9:
- [XX] Change F0 to log-F0 (and continuous)
- [] Use original speaker embedding during training,
- …
-
Hi, is there a way to utilize multiple reference audios to capture more characteristics?
I'm not too familiar with how it works under the hood, but is some stacking or averaging possible to implement for…
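
For reference, the averaging idea raised in this question could look roughly like the sketch below. It assumes each reference audio has already been turned into a fixed-size speaker embedding, represented here as a plain list of floats. Whether this project exposes such vectors is not confirmed, and the names `l2_normalize`, `average_embeddings`, and `refs` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def average_embeddings(embeddings):
    """Average several speaker embeddings into one unit-length vector.

    Each embedding is normalized first so that no single reference
    dominates the mean purely because of its magnitude.
    """
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same dimension")
    normalized = [l2_normalize(e) for e in embeddings]
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


# Toy example with two 2-dimensional "embeddings".
refs = [[3.0, 0.0], [0.0, 4.0]]
combined = average_embeddings(refs)
print(combined)
```

Averaging keeps the embedding dimension unchanged, so the combined vector can be passed wherever a single reference embedding is accepted. Stacking would change the input shape and would generally require a model built for it.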
-
How can I change the text-to-speech model? Is it possible to use another .bin, such as voxpopuli for Italian, or one we trained ourselves? I tried adding the voxpopuli.bin file to the public dire…