Amount of training data for Chinese-English models

Plachtaa / VITS-fast-fine-tuning

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion

Apache License 2.0

Open Yuan-ManX opened 1 year ago

Yuan-ManX commented 1 year ago

For a typical Chinese-English TTS model, roughly how much speech data does training need in order to get reasonably good results?

AnyaCoder commented 1 year ago

It depends on the quality of the audio source and of the annotations.

Yuan-ManX commented 1 year ago

Assuming the audio quality and the annotations are both fairly good.

AnyaCoder commented 1 year ago

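For this project, I used 250+ pure-Chinese audio clips of about 10 s each, plus a dozen or so English sentences. I trained on the Chinese data alone for 200 epochs, then trained for another 100 epochs with the English data, and the results were decent.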
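
For a sense of scale, the Chinese portion described above comes to roughly 40 minutes of audio. The sketch below is not part of this repo: it does that arithmetic and shows one way to measure a dataset of your own. The helper names `clip_seconds` and `dataset_minutes` are hypothetical, and the code assumes the clips are uncompressed PCM `.wav` files sitting directly in one folder.

```python
import wave
from pathlib import Path


def clip_seconds(path):
    """Return the duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def dataset_minutes(folder):
    """Sum the durations of all .wav files directly inside `folder`, in minutes."""
    return sum(clip_seconds(p) for p in Path(folder).glob("*.wav")) / 60


# Rough figure from the comment: about 250 clips of about 10 s each.
# Both numbers are approximations taken from the comment, not measurements.
approx_minutes = 250 * 10 / 60
print(f"{approx_minutes:.1f} minutes")  # 41.7 minutes
```

Calling `dataset_minutes("path/to/your/clips")` on your own folder gives a number to compare against that rough 40-minute figure. Total duration says nothing about annotation quality, which the earlier reply names as the main factor.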

Kraehe029 commented 10 months ago

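I'd like to ask: the more I train, the worse the results get. Now even a simple Japanese "hello" comes out as an unintelligible utterance more than ten seconds long. Is that normal? (I'm currently at epoch 90.)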