Open acul3 opened 1 month ago
Hey @acul3, thanks for opening this issue!
Multilinguality is something we'll try to actively support in the coming weeks.
Let me know how your efforts go! I plan to write more extensively about multilingual fine-tuning in a few weeks.
@ylacombe thanks for your reply and for confirming my point.
I think one challenge in training Parler-TTS is that we need the speaker's gender for each audio sample, because it is used to create the prompt (CMIIW).
My current effort is to train a gender classifier on labeled data in my language (Common Voice is a good start), and then use it to label my unlabeled data (maybe keeping only predictions with at least 0.9 confidence).
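To make the confidence filter concrete, here is a minimal sketch of the selection step only. It assumes the classifier has already produced per-class probabilities for each clip; the function name `pseudo_label`, the clip names, and the probabilities are all hypothetical, and the 0.9 threshold is just the value suggested above, not a tuned one.

```python
from typing import Dict, List, Tuple


def pseudo_label(
    predictions: Dict[str, Dict[str, float]],
    threshold: float = 0.9,
) -> Tuple[Dict[str, str], List[str]]:
    """Keep clips whose top class probability is at least `threshold`.

    `predictions` maps a clip id to per-class probabilities produced by
    some gender classifier (hypothetical here; training it is out of scope).
    """
    accepted: Dict[str, str] = {}
    rejected: List[str] = []
    for clip_id, probs in predictions.items():
        # Pick the most likely class and its probability.
        label, confidence = max(probs.items(), key=lambda kv: kv[1])
        if confidence >= threshold:
            accepted[clip_id] = label
        else:
            rejected.append(clip_id)
    return accepted, rejected


# Made-up classifier outputs, for illustration only.
example = {
    "clip_001.wav": {"male": 0.97, "female": 0.03},
    "clip_002.wav": {"male": 0.55, "female": 0.45},
    "clip_003.wav": {"male": 0.08, "female": 0.92},
}
accepted, rejected = pseudo_label(example)
print(accepted)
print(rejected)
```

Clips that fall below the threshold are returned separately so they can be dropped or reviewed by hand instead of being given an unreliable label.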
If you have other options, feel free to suggest them. Thanks once again!
Congrats on releasing Parler-TTS v2, @sanchit-gandhi, @ylacombe, and everyone involved!
I am exploring how to reproduce the training for multiple languages, and I have a few questions about training it multilingually:
Is it necessary (or worth it) to change the text encoder to support multilingual speech? Parler-TTS uses Flan-T5; do you have a recommended multilingual encoder to start with?
And what about EnCodec/DAC? I believe changing it is not really necessary, since EnCodec/DAC work at a low level (CMIIW). Or would I have to train or fine-tune a speech tokenizer such as FACodec or SpeechTokenizer?
Will adding noisier data make the output more robust?
I am planning to use 8k hours of data in my language (Malay) and mix it with ~20k hours of English data (a subset of MLS English, GigaSpeech, and LibriTTS).
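For a rough sense of the resulting language balance, here is a small sketch that turns the hour counts above into per-language sampling weights. The helper `mixture_weights` and its `alpha` exponent are my own assumptions, not anything from the Parler-TTS training code: `alpha=1.0` samples proportionally to hours, while `alpha<1.0` is a common way to upsample the smaller language.

```python
from typing import Dict


def mixture_weights(hours: Dict[str, float], alpha: float = 1.0) -> Dict[str, float]:
    """Return sampling weights proportional to hours ** alpha, normalized to sum to 1.

    alpha = 1.0 keeps the natural proportions; alpha < 1.0 flattens them,
    giving the smaller dataset a larger share (an assumption, not a
    recommendation from the thread).
    """
    scaled = {name: h ** alpha for name, h in hours.items()}
    total = sum(scaled.values())
    return {name: s / total for name, s in scaled.items()}


# Hour counts taken from the plan above (English is approximate).
hours = {"malay": 8_000, "english": 20_000}

proportional = mixture_weights(hours)           # natural proportions
flattened = mixture_weights(hours, alpha=0.5)   # upsample the smaller set

for name in hours:
    print(f"{name}: {proportional[name]:.3f} (alpha=1.0), {flattened[name]:.3f} (alpha=0.5)")
```

With proportional sampling, Malay makes up 8k / 28k of the mix, a bit under 29% of the audio hours.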
Thank you!