lifeiteng / vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Apache License 2.0
1.99k stars 320 forks

I trained a Chinese model. When synthesizing long speech, the quality deteriorates and pronunciation errors can appear. Why is this? #155

Closed: zhiCharon closed this issue 1 year ago

lifeiteng commented 1 year ago

Inference should be consistent with training.
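
One possible reading of this advice is that the text passed to the model at inference time should be no longer than the utterances it saw during training. Under that assumption, a long passage could be split at sentence boundaries and each piece synthesized separately. The sketch below is not part of this repository: `split_for_synthesis` and `MAX_CHARS` are hypothetical names, and the limit of 40 characters is a placeholder that would need to be chosen from the actual training data.

```python
import re

# Hypothetical limit: pick a value close to the longest transcript
# seen during training. 40 is only a placeholder.
MAX_CHARS = 40

# Sentence-ending punctuation, Chinese and Latin.
_SENTENCE = re.compile(r"[^。！？!?.；;]+[。！？!?.；;]*")


def split_for_synthesis(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text at sentence boundaries and pack whole sentences into chunks.

    Each chunk holds at most max_chars characters, except when a single
    sentence is itself longer than max_chars; such a sentence is kept
    intact as its own chunk.
    """
    sentences = [s.strip() for s in _SENTENCE.findall(text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current += sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    long_text = "今天天气很好。我们去公园散步吧！你觉得怎么样？"
    for chunk in split_for_synthesis(long_text, max_chars=20):
        print(chunk)
```

Each chunk would then be synthesized on its own and the resulting audio concatenated. Sentences are joined without a separator, which suits Chinese text; text in a language that uses spaces between sentences would need a space added when packing.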