yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
MIT License
4.38k stars, 340 forks

Issue with improper pauses and random bursts of noise #233

Open king-dahmanus opened 2 months ago

king-dahmanus commented 2 months ago

Hello there, StyleTTS 2 devs. It's a great model and you really did a good job. I mainly use it through the Hugging Face demo, and I've run into two issues. First, it pauses at the hyphen (-) character: for example, it reads "white-clothed" as "White. Clothed". Could this be fixed? Second, it sometimes produces random bursts of distorted noise and skips words. Is there a way to fix this, and is it an issue with the pretrained model or with the architecture itself? Thanks and regards
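
For anyone hitting the hyphen pause in the meantime, a possible workaround is to preprocess the input text before synthesis. The sketch below is untested against the model and assumes the pause comes from how the hyphen is handled on the text side. `join_hyphenated_words` is a hypothetical helper, not part of StyleTTS 2. It replaces a hyphen that sits directly between two letters with a space and leaves spaced dashes and numeric ranges alone.

```python
import re

# Hypothetical helper for illustration only; not part of the StyleTTS 2 codebase.
# Matches a hyphen that has an ASCII letter immediately before and after it.
_INTRA_WORD_HYPHEN = re.compile(r"(?<=[A-Za-z])-(?=[A-Za-z])")


def join_hyphenated_words(text: str) -> str:
    """Replace a hyphen between two letters with a single space."""
    return _INTRA_WORD_HYPHEN.sub(" ", text)


if __name__ == "__main__":
    # The cleaned string would then be passed to the TTS front end.
    print(join_hyphenated_words("a white-clothed figure"))
```

Whether a space or simply deleting the hyphen sounds more natural probably depends on the word, so this is a starting point rather than a fix. It does not address the noise bursts.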