karim23657 / Persian-tts-coqui

Persian/Farsi text-to-speech (TTS) training using Coqui TTS
MIT License
96 stars, 17 forks

Hi, Multi Speaker Tutorial #3

Open Veria70 opened 1 year ago

Veria70 commented 1 year ago

Hi Karim, how do I use the multi-speaker VITS Train.py (Kamtera/persian-tts-multispeaker-vits) to train a multi-speaker model or to fine-tune one? Could you help me? Best regards.

karim23657 commented 1 year ago

Hi, please have a look at https://huggingface.co/Kamtera/persian-tts-multispeaker-vits, which has the training code I used and the trained weights. I have also added these notebooks to the repo: https://github.com/karim23657/Persian-tts-coqui/tree/main/recepies/vits/multispeaker

I followed these tutorials:
readme -> https://github.com/Edresson/YourTTS#reproducibility
my code was inspired by -> https://github.com/coqui-ai/TTS/blob/dev/recipes/vctk/yourtts/train_yourtts.py

*Notice: in my code I used a custom dataset loader called mozilla_with_speaker. Here is my fork of the TTS package, at the place where I edited it: https://github.com/karim23657/TTS/blob/3ba73bf488504bc689e3d6d954e1b5220cbad577/TTS/tts/datasets/formatters.py#L16
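For readers who have not written a Coqui TTS dataset formatter before, here is a minimal sketch of what a speaker-aware loader could look like. It is not the code from the linked fork. The function name `mozilla_with_speaker_sketch` and the three-column `text|wav_file|speaker` metadata layout are assumptions made for illustration; only the returned dictionary keys (`text`, `audio_file`, `speaker_name`, `root_path`) and the `wavs/` subfolder follow the convention used by the stock `mozilla` formatter in Coqui TTS.

```python
import os
import tempfile


def mozilla_with_speaker_sketch(root_path, meta_file, **kwargs):
    """Hypothetical formatter, NOT the one in the linked fork.

    Assumes each metadata line looks like `text|wav_file|speaker`
    and that audio files live under `<root_path>/wavs/`.
    """
    items = []
    with open(os.path.join(root_path, meta_file), "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            cols = line.split("|")
            text = cols[0].strip()
            wav_name = cols[1].strip()
            speaker = cols[2].strip()  # assumed third column: speaker id
            items.append(
                {
                    "text": text,
                    "audio_file": os.path.join(root_path, "wavs", wav_name),
                    "speaker_name": speaker,
                    "root_path": root_path,
                }
            )
    return items


if __name__ == "__main__":
    # Tiny demo on a throwaway metadata file with two made-up speakers.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "metadata.txt"), "w", encoding="utf-8") as f:
            f.write("salam|a.wav|spk1\nkhodahafez|b.wav|spk2\n")
        for item in mozilla_with_speaker_sketch(root, "metadata.txt"):
            print(item["speaker_name"], item["text"])
```

Recent upstream Coqui TTS versions appear to accept a custom function through the `formatter` argument of `load_tts_samples`, which may make a fork unnecessary; check this against the version you have installed.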

I would be very thankful if you shared your work and models with us. Feel free to ask any questions.

Veria70 commented 1 year ago

Thank you so much. I will check them. I have a Tesla A100 40GB, and your training code and config help me a lot. I will share my model as soon as a checkpoint has been produced on my system. Best regards.

kfatehi commented 1 year ago

@Veria70 I tried running inference on Karim's pretrained multispeaker model and all 3 voices produced bad wav files. I don't know if it's my fault. I have reverted to using his pretrained "female vits" model which works great. I am very excited to try the result of your training!

Veria70 commented 1 year ago

@kfatehi Hi Keyvan, the wav files are not bad; I know what the problem with that voice is. The problem is the accent in the dataset. You need to train with the Mozilla Common Voice dataset. Good luck.