FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
https://funaudiollm.github.io/
Apache License 2.0
6.01k stars 643 forks

Why does non-streaming output use yield? #331

Open jsntcheng opened 2 months ago

jsntcheng commented 2 months ago

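The recent update seems fairly large and appears to add streaming output. After reading the code, though, something seems off: even with stream set to False, the inference functions still return a generator. That doesn't seem reasonable.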

I noticed that none of the inference functions in cosyvoice/cli/cosyvoice.py check stream. Did something go wrong during a code merge? 🤔
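For context on the Python side of this question, here is a minimal sketch of why checking the flag alone would not change the return type. The `inference` function below is a hypothetical stand-in, not CosyVoice's actual code: any function whose body contains `yield` is a generator function, so calling it returns a generator even when the branch taken never yields.

```python
import inspect


def inference(text, stream=False):
    """Hypothetical stand-in for an inference function (not CosyVoice code)."""
    if stream:
        # Streaming branch: hand back one piece at a time.
        for word in text.split():
            yield word
    else:
        # In a generator function, `return value` does not hand the value
        # to the caller; it only ends the iteration.
        return text


result = inference("hello world", stream=False)
print(inspect.isgenerator(result))  # the caller still receives a generator
print(list(result))                 # iterating it produces no items
```

Returning a plain value for stream=False would therefore need two separate functions, or a non-generator wrapper that dispatches to them.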

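Also, because of this change, I found that the speed-adjustment part of webui.py has been removed. After all, the inference result is now a generator, so the speed_change function it used before no longer works on it directly 😄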

It's not hard to work around; I just don't quite understand the design:

    output = cosyvoice.inference_zero_shot(target_words, ref_words, prompt_speech_16k)
    tts_speeches = []
    for i in output:
        tts_speeches.append(i['tts_speech'])
    # Adjust the audio speed
    audio_data, sample_rate = speed_change(torch.concat(tts_speeches, dim=1), 22050, str(target_speed))

cpken commented 2 months ago

You can implement speed adjustment yourself with the torchaudio package. Take a look at the official documentation; it's quite easy.
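One way this suggestion might look in practice is sketched below. It assumes a torchaudio build with sox support that still ships `torchaudio.sox_effects.apply_effects_tensor` (newer releases may deprecate or remove it), and the helper names `build_speed_effects` and `change_speed` are made up for illustration. The sox `tempo` effect changes speed without shifting pitch, and `rate` keeps the output at the original sample rate.

```python
from typing import List


def build_speed_effects(speed: float, sample_rate: int) -> List[List[str]]:
    """Build a sox effect chain (hypothetical helper, not part of CosyVoice)."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return [
        ["tempo", str(speed)],       # > 1 speeds up, < 1 slows down
        ["rate", str(sample_rate)],  # resample back to the original rate
    ]


def change_speed(waveform, sample_rate: int, speed: float):
    """Apply the effect chain to a (channels, samples) tensor.

    Assumption: torchaudio is installed with sox support. The import is
    done lazily so the helper above can be used without torchaudio.
    """
    from torchaudio import sox_effects

    effects = build_speed_effects(speed, sample_rate)
    # Returns a (waveform, sample_rate) tuple.
    return sox_effects.apply_effects_tensor(waveform, sample_rate, effects)
```

With the generator-based API, the chunks would first be concatenated (for example with `torch.concat(tts_speeches, dim=1)`) and the result passed to `change_speed`.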

aluminumbox commented 2 months ago

Because we want to unify the API interface for both streaming and non-streaming modes.
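The design described in this reply can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: `synthesize` and `synthesize_all` are hypothetical names, and lists of floats stand in for audio tensors. Only the `'tts_speech'` key is taken from the snippet earlier in this thread.

```python
from typing import Dict, Iterator, List


def synthesize(text: str, stream: bool = False) -> Iterator[Dict[str, List[float]]]:
    """Hypothetical unified interface: always a generator of result dicts."""
    # Fake "audio": one sample per character, standing in for model output.
    chunks = [[float(ord(ch))] for ch in text]
    if stream:
        # Streaming mode: yield each chunk as soon as it is ready.
        for chunk in chunks:
            yield {"tts_speech": chunk}
    else:
        # Non-streaming mode: yield the whole utterance as a single item.
        merged = [sample for chunk in chunks for sample in chunk]
        yield {"tts_speech": merged}


def synthesize_all(text: str, stream: bool = False) -> List[float]:
    """Convenience wrapper for callers who want one complete result."""
    samples: List[float] = []
    for out in synthesize(text, stream=stream):
        samples.extend(out["tts_speech"])
    return samples
```

Callers use the same loop in both modes, at the cost of one extra iteration step when streaming is off. Looping rather than calling `next()` once is the safer choice, since a real implementation may yield more than one item even in non-streaming mode.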

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 30 days with no activity.