When the LLM streams its output and TTS also streams through GPT-SoVITS, only the last sentence is played. What should I do?

dizhenx commented 2 weeks ago

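The LLM streams its output over websockets, and each output sentence is passed to the human interface. GPT-SoVITS is also streaming, but only the last sentence is played. The earlier sentences seem to be overwritten by newly arriving streamed output before they have a chance to play, so out of a long passage only the last streamed sentence is synthesized by TTS. How can this be solved? Does GPT-SoVITS need some setting, or should the human interface buffer the LLM's streamed sentences in a queue and send them to TTS in order?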
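The buffering idea raised in the question can be sketched as follows. This is a minimal illustration, not code from metahuman-stream or GPT-SoVITS: the class name `SentenceBuffer`, its methods, and the chosen punctuation set are all assumptions. It accumulates streamed chunks, cuts them at sentence-ending punctuation, and keeps complete sentences in a FIFO queue so they could be sent to TTS in order.

```python
import re
from collections import deque

# Sentence-ending punctuation (assumed set). The ASCII period is left out
# on purpose so that decimals such as "3.14" are not split.
_SENTENCE_END = re.compile(r"[。！？!?\n]")


class SentenceBuffer:
    """Hypothetical helper: turns streamed text chunks into ordered sentences."""

    def __init__(self):
        self._pending = ""    # text received so far without a sentence end
        self.queue = deque()  # complete sentences, oldest first

    def feed(self, chunk):
        """Add one streamed chunk and enqueue every sentence it completes."""
        self._pending += chunk
        while True:
            match = _SENTENCE_END.search(self._pending)
            if match is None:
                break
            sentence = self._pending[:match.end()].strip()
            self._pending = self._pending[match.end():]
            if sentence:
                self.queue.append(sentence)

    def flush(self):
        """Call when the LLM stream ends to enqueue any trailing text."""
        tail = self._pending.strip()
        self._pending = ""
        if tail:
            self.queue.append(tail)
```

A consumer would then pop sentences from the left of `queue` and submit them one at a time, so that a sentence is only sent after the previous one has been handed off.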

lipku commented 2 weeks ago

Don't use interrupt mode. In the front-end page, set interrupt:false.
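One plausible reading of why this flag matters: in interrupt mode each new request discards speech that is still waiting to play, so with sentence-by-sentence streaming only the last sentence survives. The toy model below illustrates that reading. It is an assumption about the behavior, not the project's actual code, and `SpeechQueue`, `put`, and `simulate` are hypothetical names.

```python
from collections import deque


class SpeechQueue:
    """Toy model of a playback queue (hypothetical, for illustration only)."""

    def __init__(self):
        self.pending = deque()  # sentences waiting to be synthesized/played

    def put(self, text, interrupt):
        # Assumed semantics: interrupt=True drops everything still waiting.
        if interrupt:
            self.pending.clear()
        self.pending.append(text)


def simulate(sentences, interrupt):
    """Submit sentences back to back, faster than they could be played."""
    q = SpeechQueue()
    for s in sentences:
        q.put(s, interrupt)
    return list(q.pending)
```

Under these assumptions, `simulate(["a", "b", "c"], True)` leaves only `"c"` in the queue, while `simulate(["a", "b", "c"], False)` keeps all three sentences in order, which matches the symptom described in the question.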

jmanhype commented 4 days ago

My issue is that it only repeats the REF_FILE / REF_TEXT. My SoVITS server receives the call from metahuman-stream, but it only generates the reference file audio over and over for each call.

ThornbirdZhang commented 2 days ago

@jmanhype Are you using gpt-sovits? Check its logs and see.

jmanhype commented 2 days ago

Apologies, I explored further and found that if you leave the reference text blank, it will then use the TTS text.

lipku / metahuman-stream #256