Open Koeru opened 1 year ago
The bad intonation is caused by rough phoneme annotation, and correcting that would take a lot of time. So I don't recommend using this repository to fine-tune on the voices of real speakers.
You could use so-vits-svc or RVC to train a voice conversion model, then feed it audio generated by Microsoft TTS. That gives you TTS in the target voice by a different route, and the result will be much better.
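To make the two-stage idea concrete, here is a minimal sketch of how the pieces could be chained. It does not call Microsoft TTS, so-vits-svc, or RVC directly: `TwoStageTTS`, `synthesize`, and `convert` are hypothetical names, and you would have to wrap your actual TTS service and trained voice conversion model behind those two callables yourself.

```python
from dataclasses import dataclass
from typing import Callable

# Both stage types are hypothetical placeholders, not real library APIs.
# Wrap your own TTS service and your trained voice conversion model
# (for example so-vits-svc or RVC) so that they match these signatures.
Synthesizer = Callable[[str], bytes]  # text -> audio bytes in the TTS voice
Converter = Callable[[bytes], bytes]  # audio bytes -> audio bytes in the target voice


@dataclass
class TwoStageTTS:
    """Chain a generic TTS stage with a voice conversion stage."""

    synthesize: Synthesizer
    convert: Converter

    def speak(self, text: str) -> bytes:
        if not text.strip():
            raise ValueError("text must not be empty")
        # Stage 1: the TTS engine supplies pronunciation and intonation.
        source_audio = self.synthesize(text)
        # Stage 2: the conversion model swaps in the target speaker's timbre.
        return self.convert(source_audio)
```

The point of the split is that intonation comes from the TTS stage, so the fine-tuned model only has to handle timbre and never depends on phoneme annotation of your own recordings.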
Thank you very much!!
Hello! Thank you very much for your advice. I would like to ask one more question, if you don't mind🙇
I've used the method above, and the output is very stable! However, are there other ways to make the intonation more natural and put emotion into the speech, like this VITS model does?
We are recording our own voices with voice actors, and would like to use them for a wider range of character speech.
Thank you
I'm sorry I can't provide better advice. Perhaps you could try asking at https://github.com/VOICEVOX/voicevox
Got it. Thank you very much anyway
Can the pre-trained model be fine-tuned for Chinese? Thanks!
Hi, I tried to fine-tune the model with my own voice.
I recorded 100 audio samples for fine-tuning.
It generates my voice, but the intonation is not as good as with the original pretrained models.
Do you have any tips for preparing data or fine-tuning?