FunAudioLLM / CosyVoice

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
https://funaudiollm.github.io/
Apache License 2.0

Instruct mode synthesizes the instruct text in the final audio #159

Open shuaijiang opened 1 month ago

shuaijiang commented 1 month ago

I tried instruct mode to synthesize audio according to the instruct text, but the output audio also contains the instruct text itself, spoken aloud.

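from cosyvoice.cli.cosyvoice import CosyVoice
import torchaudio
cosyvoice = CosyVoice('iic/CosyVoice-300M-Instruct')
# Arguments: text to speak ("In the face of challenges, he showed extraordinary courage and wisdom."),
# speaker ID ("Chinese male"), and instruct text ("female voice, fast speaking rate").
output = cosyvoice.inference_instruct('在面对挑战时,他展现了非凡的勇气与智慧。', '中文男', '女声,快语速')
# Save the synthesized waveform at a 22050 Hz sample rate.
torchaudio.save('instruct.wav', output['tts_speech'], 22050)

aluminumbox commented 1 month ago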

I ran the same code and could not reproduce the problem. Please try again.

shuaijiang commented 1 month ago

Thanks, I get the point. Does the instruct text not support Chinese?
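
The thread leaves open whether Chinese instruct text is supported. As a minimal sketch for anyone reproducing this, the check below flags an instruct string that contains Chinese characters before it is passed to inference_instruct, so the case described above is easy to spot. The helper contains_cjk is hypothetical and not part of CosyVoice, and the assumption that an English description behaves differently is not confirmed anywhere in this thread.

```python
import re

# Hypothetical helper, not part of CosyVoice: matches any character in the
# CJK Unified Ideographs block (U+4E00 to U+9FFF).
_CJK_PATTERN = re.compile(r'[\u4e00-\u9fff]')


def contains_cjk(text: str) -> bool:
    """Return True if the text contains at least one CJK ideograph."""
    return _CJK_PATTERN.search(text) is not None


# The instruct text from the original report.
instruct_text = '女声,快语速'

if contains_cjk(instruct_text):
    # Assumption: an English description may avoid the issue; unverified here.
    print('Instruct text contains Chinese characters; consider trying an English description.')
```

The check only inspects the string, so it can run before the model is loaded.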