microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
MIT License

SpeechT5: extracting Chinese speaker embedding #50

Open QQ-777777 opened 1 year ago

QQ-777777 commented 1 year ago

Hi, I have the same question as https://github.com/microsoft/SpeechT5/issues/16#issuecomment-1516257038. My training dataset is Chinese, so can I use speechbrain/spkrec-xvect-voxceleb to extract speaker embeddings for pre-training?

mechanicalsea commented 1 year ago

Usually, a language mismatch degrades speaker-embedding performance, but embeddings extracted with speechbrain/spkrec-xvect-voxceleb can still be used to represent speakers in Chinese data.

If you want the speaker embeddings to represent the speakers in a Chinese dataset better, we recommend retraining or adapting the speaker model on your own data.
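One rough way to check whether the VoxCeleb-trained embeddings are good enough on your Chinese data before retraining is to compare cosine similarity between utterances of the same speaker and utterances of different speakers. The sketch below is only an illustration: the function names (`cosine_similarity`, `speaker_separation`) and the toy vectors are made up for this example, and real embeddings would come from the speaker model.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as unrelated to everything
    return dot / (norm_a * norm_b)


def speaker_separation(embeddings):
    """Return (mean same-speaker similarity, mean different-speaker similarity).

    `embeddings` is a list of (speaker_id, vector) pairs. This is a
    hypothetical helper for illustration, not part of SpeechT5 or SpeechBrain.
    """
    same, diff = [], []
    for (spk1, vec1), (spk2, vec2) in combinations(embeddings, 2):
        bucket = same if spk1 == spk2 else diff
        bucket.append(cosine_similarity(vec1, vec2))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(same), mean(diff)


# Toy 2-dimensional vectors standing in for real speaker embeddings.
toy = [
    ("spk_a", [1.0, 0.1]),
    ("spk_a", [0.9, 0.2]),
    ("spk_b", [0.1, 1.0]),
    ("spk_b", [0.2, 0.9]),
]
same_mean, diff_mean = speaker_separation(toy)
print(f"same-speaker: {same_mean:.3f}, different-speaker: {diff_mean:.3f}")
```

If the same-speaker mean is not clearly higher than the different-speaker mean on your own recordings, that suggests the language mismatch is hurting and adapting the speaker model may be worth the effort.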

StephennFernandes commented 1 year ago

@mechanicalsea What about extracting speaker embeddings on a multilingual dataset? I am trying to build a multilingual version of SpeechT5 (mSpeechT5).

mechanicalsea commented 1 year ago

You can take advantage of a pre-trained speaker model, such as the ECAPA-TDNN model from SpeechBrain (https://github.com/speechbrain/speechbrain); one of the SpeechBrain models is used in our script for extracting speaker embeddings. Alternatively, you can train a multilingual / language-independent speaker recognition model.
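As a minimal sketch of what extracting an embedding with a pre-trained SpeechBrain model could look like (this is not the repository's own script): it assumes `speechbrain` and `torchaudio` are installed, that the `speechbrain/spkrec-ecapa-voxceleb` checkpoint is the one you want, and that the model expects 16 kHz mono audio. The helper names `l2_normalize` and `extract_embedding` are made up for this example, and the L2 normalization step is an assumption rather than a requirement.

```python
import math


def l2_normalize(vec):
    """Scale a vector (list of floats) to unit L2 norm; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def extract_embedding(wav_path, source="speechbrain/spkrec-ecapa-voxceleb"):
    """Return one L2-normalized speaker embedding for an audio file.

    Heavy dependencies are imported lazily so the module can be loaded
    without them. Pass source="speechbrain/spkrec-xvect-voxceleb" to use
    the x-vector model discussed in this thread instead.
    """
    import torchaudio

    try:
        # Import location in SpeechBrain 1.0 and later.
        from speechbrain.inference.speaker import EncoderClassifier
    except ImportError:
        # Import location in older SpeechBrain releases.
        from speechbrain.pretrained import EncoderClassifier

    classifier = EncoderClassifier.from_hparams(source=source)

    signal, sample_rate = torchaudio.load(wav_path)  # shape: [channels, time]
    signal = signal.mean(dim=0, keepdim=True)        # downmix to mono
    if sample_rate != 16000:
        # Assumption: the checkpoint was trained on 16 kHz audio.
        signal = torchaudio.functional.resample(signal, sample_rate, 16000)

    embedding = classifier.encode_batch(signal)      # shape: [1, 1, dim]
    return l2_normalize(embedding.squeeze().tolist())
```

Note that different speaker models may produce embeddings of different sizes (as far as I know, the ECAPA checkpoint gives 192-dimensional vectors, while the x-vector model gives 512-dimensional ones), so swapping the speaker model may also require changing the speaker-embedding dimension expected by the TTS model.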

StephennFernandes commented 1 year ago

@mechanicalsea thanks!

Srija616 commented 11 months ago

Hi @StephennFernandes, I wanted to know whether you were able to get the speechbrain/spkrec-xvect-voxceleb embeddings working in your multilingual setting. Is the synthesized speech of good quality, without any mechanical artifacts?

marmot508 commented 8 months ago

Hi Stephenn, have you made any progress? I am interested in doing the same thing, but for Chinese. Does anyone want to join a shared effort?