v-nhandt21 / ViSV2TTS

Vietnamese Voice Cloning System using Speaker Verification training on multispeaker VITS

Step-by-step adding foreign words to ViSV2TTS #10

Closed drlor2k closed 1 week ago

drlor2k commented 2 weeks ago

Hello @v-nhandt21, how are you? I have a few questions and would appreciate your guidance.

Problem:

  1. Goal: the model can pronounce some English words in the training set.
  2. Premise: I trained the model on a pure Vietnamese dataset (model A).
  3. My approach: I will fine-tune model A into model B on a small dataset (about 2 hours) that mixes English and Vietnamese, in order to achieve the goal in item 1.

Questions:

  1. Is the above approach feasible?
  2. What steps do I need to take for this approach?

I found that Viphoneme on GitHub has English support, but this repo doesn't.

Thanks for taking the time to reply!

v-nhandt21 commented 3 days ago

Hi @drlor2k,

To handle foreign languages, as far as I know there are two main approaches:

P.S.: you can take a look at this for more approaches: https://github.com/tuanh123789/AdaSpeech/tree/main/G2P

P.S. 2: there may also be multilingual end-to-end models that can speak all of these languages without phoneme control by using a unified tokenizer :))
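
As a rough illustration of what phoneme-level control over foreign words can look like in a text front end, here is a minimal sketch in plain Python. It is not code from ViSV2TTS or from the G2P folder linked above, and it is not necessarily either of the approaches mentioned earlier. It assumes a hand-made lexicon that respells known English words with Vietnamese-style syllables, so an existing Vietnamese G2P could process them unchanged. The names `FOREIGN_LEXICON` and `respell_foreign_words`, and the lexicon entries themselves, are hypothetical.

```python
import re
import unicodedata

# Hypothetical lexicon: lowercase English word -> Vietnamese-style respelling.
# The entries are illustrative assumptions, not taken from any repository.
FOREIGN_LEXICON = {
    "online": "on lai",
    "email": "i meo",
    "video": "vi deo",
}

# \w matches Unicode letters, so precomposed Vietnamese characters stay
# inside a single word token.
WORD_RE = re.compile(r"\w+")


def respell_foreign_words(text, lexicon=FOREIGN_LEXICON):
    """Replace words found in the lexicon; leave all other text untouched."""
    # Normalize to NFC so diacritics are precomposed and do not split words.
    text = unicodedata.normalize("NFC", text)

    def _swap(match):
        word = match.group(0)
        # Case-insensitive lookup; unknown words are returned as-is.
        return lexicon.get(word.lower(), word)

    return WORD_RE.sub(_swap, text)


if __name__ == "__main__":
    print(respell_foreign_words("Tôi xem Video online mỗi ngày."))
```

The output of this step would then be passed to the usual Vietnamese text-to-phoneme stage. A lookup table like this only covers words that were added by hand, so out-of-lexicon English words would still need a separate fallback.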