myshell-ai / OpenVoice

Instant voice cloning by MIT and MyShell.
https://research.myshell.ai/open-voice
MIT License

Example for custom model training #5

Closed. Selectorrr closed this issue 10 months ago.

Selectorrr commented 11 months ago

Hi @Zengyi-Qin, the paper looks great. Unfortunately, the pre-trained model only works with English, although the examples include other languages as well, which is misleading. I tried adding a new language by modifying the code (adding tags and a text-to-phoneme converter) and even managed to synthesize audio, but the result only sounds a bit like the prompt. Are you planning to provide a way (for example, a training example) to train a custom model, so that the community can add their own languages and train the model on their own datasets?

egoist-sx commented 10 months ago

Will add examples soon.

Zengyi-Qin commented 10 months ago

The tone color converter now supports multiple languages, no matter whether the language exists in the MSML training set. Please see demo_part2.ipynb. For the base speaker model, the community can directly train their VITS to add a new language. We will also provide a Chinese base speaker model soon.

Zengyi-Qin commented 10 months ago

Since there is no follow-up question from Selectorrr, the issue is marked temporarily closed.

sneedger commented 10 months ago

> Since there is no follow-up question from Selectorrr, the issue is marked temporarily closed.

How/where do I get Italian for text-to-speech?

ordigital commented 9 months ago

> The tone color converter now supports multiple languages, no matter whether the language exists in the MSML training set. Please see demo_part2.ipynb. For the base speaker model, the community can directly train their VITS to add a new language. We will also provide a Chinese base speaker model soon.

Are there any instructions on how to «directly train VITS to add a new language» so that the resulting checkpoint is usable through api.py, just like the existing ones (EN, ZH)? What tools were used to train and generate the EN and ZH models?

ordigital commented 9 months ago

> > Since there is no follow-up question from Selectorrr, the issue is marked temporarily closed.

> How/where do I get Italian for text-to-speech?

At the moment, the easiest way is to use any TTS with a default voice speaking your language as the base for cloning, instead of the paid OpenAI TTS suggested in demo_part2.ipynb. I've tested it with Coqui TTS using xtts2 (with a different base speaker than the target speaker for OpenVoice), tts_models/pl/mai_female/vits, and mms-tts with the Polish checkpoint, but in every case the OpenVoice cloning results were very poor and not even close to xtts2 voice cloning.
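
A rough sketch of that workflow, in case it helps anyone trying it. It assumes Coqui TTS is installed as the `TTS` package, that OpenVoice is importable as the `openvoice` package (older checkouts expose `api.py` and `se_extractor.py` at the top level instead), and that the converter checkpoint sits in `checkpoints/converter`. The function names `synthesize_base`, `convert_tone_color` and `clone_voice`, and the output file names, are made up for illustration. Treat it as a starting point under those assumptions, not as the project's official recipe.

```python
from pathlib import Path


def synthesize_base(text, out_path, model_name="tts_models/pl/mai_female/vits"):
    """Step 1: speak `text` with a stock Coqui TTS voice (the base audio)."""
    from TTS.api import TTS  # imported lazily; needs the Coqui `TTS` package

    TTS(model_name).tts_to_file(text=text, file_path=out_path)
    return out_path


def convert_tone_color(base_wav, reference_wav, out_path,
                       ckpt_dir="checkpoints/converter", device="cpu"):
    """Step 2: move the base audio toward the reference speaker's tone color."""
    # Assumption: OpenVoice is installed as the `openvoice` package and the
    # converter checkpoint has already been downloaded to `ckpt_dir`.
    from openvoice import se_extractor
    from openvoice.api import ToneColorConverter

    converter = ToneColorConverter(f"{ckpt_dir}/config.json", device=device)
    converter.load_ckpt(f"{ckpt_dir}/checkpoint.pth")
    # An external base voice has no precomputed embedding, so both the
    # source and target embeddings are extracted from the audio files.
    source_se, _ = se_extractor.get_se(base_wav, converter, vad=True)
    target_se, _ = se_extractor.get_se(reference_wav, converter, vad=True)
    converter.convert(audio_src_path=base_wav, src_se=source_se,
                      tgt_se=target_se, output_path=out_path)
    return out_path


def clone_voice(text, reference_wav, work_dir,
                synthesize=synthesize_base, convert=convert_tone_color):
    """Run both steps; the two callables can be swapped for another TTS."""
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    base_wav = synthesize(text, str(work / "base.wav"))
    return convert(base_wav, reference_wav, str(work / "cloned.wav"))
```

Calling `clone_voice("Dzień dobry", "my_voice.wav", "outputs")` would write `outputs/base.wav` and `outputs/cloned.wav`. For another language, pass a different `synthesize` callable or a different Coqui model name. Quality will still depend on how well the converter copes with that base voice.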

Selectorrr commented 9 months ago

> > > Since there is no follow-up question from Selectorrr, the issue is marked temporarily closed.

> > How/where do I get Italian for text-to-speech?

> At the moment, the easiest way is to use any TTS with a default voice speaking your language as the base for cloning, instead of the paid OpenAI TTS suggested in demo_part2.ipynb. I've tested it with Coqui TTS using xtts2 (with a different base speaker than the target speaker for OpenVoice), tts_models/pl/mai_female/vits, and mms-tts with the Polish checkpoint, but in every case the OpenVoice cloning results were very poor and not even close to xtts2 voice cloning.

Thanks for sharing your experience, it saved me a lot of time

chazo1994 commented 4 months ago

Could you share the code for training or fine-tuning the OpenVoice v2 model for an unseen language?

ismailgokhanuslu commented 4 months ago

Can you please share the code/example/guide for training and adding a new base model for a new language?

ismailgokhanuslu commented 3 months ago

Any chance of getting some instructions for adding a new language to speak?