erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, narrator, model finetuning, custom models, and wav file maintenance. It can also be used with 3rd-party software via JSON calls.
GNU Affero General Public License v3.0

How can I add the Persian language to AllTalk TTS? #258

Closed lumos675 closed 2 weeks ago

lumos675 commented 2 weeks ago

I am trying to add the Persian language with my own voice to AllTalk TTS, but I have no idea where to begin. Any help would be highly appreciated.

erew123 commented 2 weeks ago

Hi @lumos675, I'm currently traveling, so I don't have easy access at the moment to type a lengthy, detailed reply and find all the resources I want to point you towards. I've been asked this before, but cannot for the life of me find my replies on here (it's possible people have deleted their GitHub accounts, which may have deleted my replies to them as well).

Long story short, however, training a new language is more involved than training a new voice alone, as the model has to learn a whole new language.

There are some specific resources on the Coqui forum about adding/training new languages, and examples of the datasets needed to do so. I believe you would also need to set up a custom tokenizer for a currently unsupported language (and I believe Persian is not currently supported within the model).
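As a rough sketch of what that means in practice (plain Python, not an AllTalk or Coqui API): the snippet below checks a language code against the list of languages I believe the XTTS v2 model ships with, and counts the characters used in a set of transcripts, which is roughly the inventory a custom tokenizer would have to cover. The language list, the function names and the sample sentences are all assumptions for illustration, so check the model card before relying on them.

```python
from collections import Counter

# Language codes believed to be covered by the XTTS v2 model.
# This list is an assumption; check the model card for the current one.
XTTS_V2_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko", "hi",
}


def is_supported(lang_code: str) -> bool:
    """Return True if the code is in the assumed XTTS v2 language list."""
    return lang_code.lower() in XTTS_V2_LANGUAGES


def character_inventory(transcripts):
    """Count every non-whitespace character across the transcripts."""
    counts = Counter()
    for line in transcripts:
        counts.update(ch for ch in line if not ch.isspace())
    return counts


# Hypothetical sample transcripts; a real dataset would be far larger.
sample = ["سلام دنیا", "صبح بخیر"]

print("Persian supported:", is_supported("fa"))
print("Distinct characters in sample:", len(character_inventory(sample)))
```

If the language is not in the list, the character inventory gives a first idea of which symbols the existing vocabulary would need to be extended with.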

You can find some resources about general training of the XTTS model here: https://docs.coqui.ai/en/latest/index.html

and more specific discussions here: https://github.com/coqui-ai/TTS/discussions

I believe Polish was one language someone did a full training session of, and it is discussed in the forum there. As I am traveling, though, I can't search for those threads currently. If you want to have a look through the Coqui forums, you may well find some information and answers there. I'm not an expert on training new languages, though I know it is possible and does require more hours of training than a normal voice training.

Hope that helps. I'll close the ticket. If you want me to try to hunt down these other links from the forums, you're welcome to reply again, but it will be a few days before I am back from traveling.

Thanks