Plachtaa / VITS-fast-fine-tuning

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
Apache License 2.0
4.69k stars 705 forks

When training clips exceed 5 s, the model's inferred speech has half-pronounced syllables and a very fast speaking rate. #487

Open create-li opened 11 months ago

create-li commented 11 months ago

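When the training audio is cut into 5 s clips, everything works normally. But when it is cut into clips longer than 5 s, such as 9 s or 10 s, the speech produced at inference after training has half-pronounced syllables and is extremely fast. Could the author explain what causes this?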

VincentVanNF commented 10 months ago

Same question. In particular, the last word gets swallowed entirely, even though the printed phonetic transcription is complete.