Open gjin10969 opened 11 months ago
https://github.com/rhasspy/piper/blob/master/notebooks/piper_multilingual_training_notebook.ipynb This notebook has no option to train from anything other than an English checkpoint, but you can adapt the code. You will also need to build an LJSpeech-format dataset for Japanese, and you could fine-tune from a Chinese checkpoint, which may be more similar to your language: https://huggingface.co/datasets/rhasspy/piper-checkpoints/tree/main/zh/zh_CN/huayan/medium Fine-tuning from the Chinese model may be faster than from other languages' checkpoints. That said, the Portuguese models were fine-tuned from English checkpoints and are very good. I also tried fine-tuning another Portuguese model from an English checkpoint, and it worked pretty well.
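For the dataset step, here is a minimal sketch of writing an LJSpeech-style `metadata.csv`. It assumes the common single-speaker layout of a `wavs/` folder plus one `id|text` line per clip; check the notebook for the exact layout it expects. The function name `write_ljspeech_metadata` and the sample clip ids are made up for illustration.

```python
from pathlib import Path


def write_ljspeech_metadata(transcripts, dataset_dir):
    """Write an LJSpeech-style metadata.csv (one `id|text` line per clip).

    transcripts: dict mapping a wav file stem (no extension) to its text.
    dataset_dir: dataset root; audio is assumed to live in dataset_dir/wavs.
    NOTE: the `id|text` layout is an assumption, not confirmed by this thread.
    """
    dataset_dir = Path(dataset_dir)
    (dataset_dir / "wavs").mkdir(parents=True, exist_ok=True)
    metadata_path = dataset_dir / "metadata.csv"

    with metadata_path.open("w", encoding="utf-8", newline="") as f:
        for stem, text in sorted(transcripts.items()):
            # Collapse newlines and repeated whitespace into single spaces.
            clean = " ".join(text.split())
            if "|" in clean:
                raise ValueError(f"'|' is the field separator, found in {stem}")
            f.write(f"{stem}|{clean}\n")

    return metadata_path


if __name__ == "__main__":
    # Hypothetical clip ids and text, for illustration only.
    sample = {"jp_0001": "こんにちは、世界。", "jp_0002": "今日は  いい天気です。"}
    print(write_ljspeech_metadata(sample, "my_japanese_dataset"))
```

The wav files themselves still have to be copied into `wavs/` with names matching the ids.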
@BornSaint @synesthesiam I'm interested in extending this codebase to Japanese. Will a Japanese phoneme dictionary be used? If so, could you give a short explanation of how to work on it? Also, what is the minimum number of hours of data needed to adapt another language's pretrained model to Japanese and get results close to the English model in naturalness? We have a phoneme dictionary (G2P dict) and an original dataset of more than 10 hours of high-quality recordings. In any case, I'll read the code first.
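When comparing dataset sizes in hours, it can help to measure the recordings directly. A minimal sketch using only the Python standard library, assuming uncompressed PCM `.wav` files in a single folder (the folder name `wavs` and the function name `total_hours` are placeholders):

```python
import wave
from pathlib import Path


def total_hours(wav_dir):
    """Return the summed duration, in hours, of all PCM .wav files in wav_dir."""
    total_seconds = 0.0
    for wav_path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(wav_path), "rb") as w:
            # Duration of one clip = number of frames / frames per second.
            total_seconds += w.getnframes() / w.getframerate()
    return total_seconds / 3600.0


if __name__ == "__main__":
    # "wavs" is a placeholder path; point it at your own recordings.
    print(f"{total_hours('wavs'):.2f} hours of audio")
```

The standard `wave` module only reads PCM WAV, so compressed or float-encoded files would need converting first.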
@BornSaint picking this thread back up. I tried to fine-tune a Chinese model using a Chinese voice, but the phonetics are off. I provided ~500 wav files. I assume that's related more to piper-phonetics than to the core piper lib? There are other libraries that provide better Chinese phonetics. Is there a way I can use those instead of espeak-ng / piper-phonetics?
There is a misconception that Chinese is similar to Japanese just because both are Asian languages. The languages and their pronunciation are completely different. In terms of pronunciation and phonetic characteristics, Spanish, Portuguese, and Turkish are said to be closer to Japanese.
Is there Japanese training available for fine-tuning the model?