When training clips are longer than 5 s, the inferred speech has half-pronounced syllables and a very fast speaking rate

Plachtaa / VITS-fast-fine-tuning

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion

Apache License 2.0

4.69k stars, 705 forks

Open create-li opened 11 months ago

create-li commented 11 months ago

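When the training audio is cut into 5 s clips, the results are normal. But when it is cut into clips longer than 5 s (9 s, 10 s, and so on), the speech inferred after training pronounces only half of each syllable and is extremely fast. Could the author explain what causes this?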
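Since 5 s clips are reported to train normally, one possible workaround is to re-split longer clips so that no piece exceeds 5 s. The sketch below only computes the cut boundaries. The helper name `split_points` and the 5 s default are illustrative assumptions and not part of VITS-fast-fine-tuning. Cutting at fixed times can land in the middle of a word, so in practice the boundaries would probably need to be moved to nearby silences.

```python
import math


def split_points(duration_s: float, max_len_s: float = 5.0) -> list[tuple[float, float]]:
    """Return (start, end) times that cover a clip with equal pieces no longer than max_len_s.

    Hypothetical helper for illustration only; it does not read or write audio.
    """
    if duration_s <= 0 or max_len_s <= 0:
        raise ValueError("duration_s and max_len_s must be positive")
    # Smallest number of pieces such that each piece fits within max_len_s.
    n_pieces = math.ceil(duration_s / max_len_s)
    step = duration_s / n_pieces
    return [(i * step, min((i + 1) * step, duration_s)) for i in range(n_pieces)]


if __name__ == "__main__":
    # A 9 s clip is split into two equal pieces instead of 5 s + 4 s.
    print(split_points(9.0))
```

Equal-length pieces are used here instead of a fixed 5 s stride so that a clip does not end with a very short leftover segment.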

VincentVanNF commented 10 months ago

Same question. In particular, the last word gets swallowed entirely, even though the printed phonemes are complete.