CorentinJ / Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Does this project support Chinese? #1244

Open qinli-jian opened 10 months ago

qinli-jian commented 10 months ago

Does this project support Chinese?

moaazelashaal commented 3 months ago

Did you find any solution for running this in Chinese? @qinli-jian

SkylineYang commented 3 months ago

I tried running inference on Chinese sentences with RTVC, but the output sounded like the speakers were just whispering: no intelligible speech at all, not even electrical noise. Maybe this algorithm doesn't support Chinese.

moaazelashaal commented 3 months ago

@SkylineYang so, did you try any other algorithms, any voice cloning that works with Chinese?

SkylineYang commented 3 months ago

https://github.com/CMsmartvoice/One-Shot-Voice-Cloning can be used for Chinese voice cloning, but the quality is poor, with loud electrical noise. https://github.com/Plachtaa/VITS-fast-fine-tuning can also be used like one-shot voice cloning, but you need to provide about 10-20 audio clips and train your own model for 100 epochs (in 10 minutes), and the quality is good enough.

moaazelashaal commented 3 months ago

Okay, I will try both of them, and maybe we can get in touch through WeChat (微信) in the future if I run into a problem.

SkylineYang commented 3 months ago

I'm working on my graduation project and am also using these two. Maybe I'll find a project that works well later on.

qinli-jian commented 3 months ago

"Did you find any solution for running this in Chinese? @qinli-jian"

No, I haven't.

qinli-jian commented 3 months ago

"https://github.com/CMsmartvoice/One-Shot-Voice-Cloning can be used for Chinese voice cloning, but the quality is poor, with loud electrical noise. https://github.com/Plachtaa/VITS-fast-fine-tuning can also be used like one-shot voice cloning, but you need to provide about 10-20 audio clips and train your own model for 100 epochs (in 10 minutes), and the quality is good enough."

great!

moaazelashaal commented 3 months ago

@SkylineYang what about this one? I found it on your profile: https://github.com/SkylineYang/TensorFlowTTS

SkylineYang commented 3 months ago

It's difficult to use and has a lot of models in it. I wouldn't recommend it to rookies (like me).