PlayVoice / vits_chinese

Best-practice TTS based on BERT and VITS, with some NaturalSpeech features from Microsoft; supports ONNX streaming output.
https://huggingface.co/spaces/maxmax20160403/vits_chinese
MIT License
1.16k stars, 168 forks

training for custom dataset #28

huydang2106 closed 1 year ago

huydang2106 commented 1 year ago

Have you tried datasets in other languages? Did the model work well? How much data (in total duration) and how many epochs do I need to get good results? Thanks.

MaxMax2016 commented 1 year ago

I have only trained on Chinese, using about ten hours of Baker data. The kl_loss is about 1.0, kl_loss_r is about 1.7, and mel is about 18. bert_lose

huydang2106 commented 1 year ago

Will you support training on other datasets? I found that the current code is quite hard-coded for the Baker dataset.

MaxMax2016 commented 1 year ago

This is a study project, and I do NOT have enough GPU resources to train on AISHELL3 or other datasets. The Baker dataset is a baseline for Chinese TTS.

huydang2106 commented 1 year ago

OK, thanks for your answer. I will try with my custom dataset.