Hi. I found this TTS and I'm currently testing it out, but I noticed that it very often cuts off too early.
Sometimes it drops half a word, sometimes whole words at the end. It seems to get worse with longer texts.
Is there any way to improve this? I already played around with max_new_token, top_k, top_p, etc., but had no luck.
It's a limitation of the LLaMA model itself. Since the model is relatively small, it cannot handle sentences that are too long or too short (fewer than 5 words). Generating about 15-30 s of audio per call works best.
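One workaround along those lines is to split long input into sentence-sized chunks before synthesis, so each TTS call stays in the model's comfortable range. Below is a minimal sketch of such a chunker; the word limits (`min_words=5`, roughly matching the too-short threshold above, and `max_words=60` as a rough proxy for ~30 s of speech) are assumptions you would tune for your voice and speaking rate, and `chunk_text` is a hypothetical helper, not part of this repo:

```python
import re

def chunk_text(text, min_words=5, max_words=60):
    """Split text into sentence-based chunks whose word counts stay
    roughly within [min_words, max_words], so each chunk can be fed
    to the TTS model separately."""
    # Naive sentence split on ., !, ? followed by whitespace.
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", text.strip())
                 if s.strip()]
    chunks, current = [], []
    for sent in sentences:
        # Would adding this sentence push the chunk over the limit?
        if current and len(" ".join(current + [sent]).split()) > max_words:
            chunks.append(" ".join(current))
            current = [sent]
        else:
            current.append(sent)
    if current:
        # Merge a trailing too-short chunk into the previous one,
        # since very short inputs also trip up the model.
        if chunks and len(" ".join(current).split()) < min_words:
            chunks[-1] += " " + " ".join(current)
        else:
            chunks.append(" ".join(current))
    return chunks
```

You would then synthesize each chunk in a loop and concatenate the resulting audio segments, which also sidesteps the end-of-text truncation on long inputs.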