FunAudioLLM / CosyVoice

Multilingual large voice generation model with full-stack support for inference, training, and deployment.
https://funaudiollm.github.io/
Apache License 2.0
2.79k stars, 262 forks

Is synthesis in other languages supported? #10

Closed. ZeeChao closed this issue 2 weeks ago

ZeeChao commented 3 weeks ago

From the README, it looks like only Chinese and English TTS are supported. Are other languages supported? If so, how do I specify the language?

ZhihaoDU commented 3 weeks ago

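Five languages are supported: Chinese, English, Japanese, Cantonese, and Korean. Prepend a Language ID to the text you want to synthesize, for example: Chinese: <|zh|>, English: <|en|>, Japanese: <|jp|>, Cantonese: <|yue|>, Korean: <|ko|>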
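As an illustration of the reply above, here is a minimal sketch of prepending a Language ID to the text. The tags are the five listed in that reply; the `LANGUAGE_IDS` table, its key names, and the `with_language_id` helper are hypothetical and not part of CosyVoice.

```python
# Language ID tags as listed in the reply above.
# The dictionary and its key names are illustrative, not part of CosyVoice.
LANGUAGE_IDS = {
    "chinese": "<|zh|>",
    "english": "<|en|>",
    "japanese": "<|jp|>",
    "cantonese": "<|yue|>",
    "korean": "<|ko|>",
}


def with_language_id(text: str, language: str) -> str:
    """Return `text` with the Language ID tag for `language` prepended."""
    try:
        tag = LANGUAGE_IDS[language.lower()]
    except KeyError:
        raise ValueError(f"unsupported language: {language!r}") from None
    return tag + text


# The tagged string is what would be passed as the text to synthesize.
print(with_language_id("こんにちは", "japanese"))
```

The helper only builds the string; the result would then be passed to the model's inference call in place of the untagged text.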

abc8350712 commented 2 weeks ago

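```python
# Imports as shown in the CosyVoice README
import torchaudio
from cosyvoice.cli.cosyvoice import CosyVoice
from cosyvoice.utils.file_utils import load_wav

cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M')
# Reference recording for zero-shot cloning, loaded at 16 kHz
prompt_speech_16k = load_wav('dzq_ref.wav', 16000)
# Arguments: text to synthesize, transcript of the reference audio, reference audio
output = cosyvoice.inference_zero_shot('收到好友从远方寄来的生日礼物,那份意外的惊喜与深深的祝福让我心中充满了甜蜜的快乐,笑容如花儿般绽放。', '这是一段参考音频,请录制我的声音', prompt_speech_16k)
torchaudio.save('zero_shot.wav', output['tts_speech'], 22050)
```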

@ZhihaoDU How should I change this? I want to specify Mandarin Chinese, not Cantonese.

aluminumbox commented 2 weeks ago

Use <|zh|> for Chinese; see the README.
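A rough sketch of what that change might look like for the zero-shot call in the question, assuming the tag goes at the start of the text to synthesize (the first argument) and not in the reference transcript. The wrapper `zero_shot_in_mandarin` and the constant `ZH_TAG` are hypothetical names; only `inference_zero_shot` and its three arguments come from the thread.

```python
ZH_TAG = "<|zh|>"  # Language ID for Chinese, as given in this thread


def zero_shot_in_mandarin(model, tts_text, prompt_text, prompt_speech_16k):
    """Call model.inference_zero_shot with the Chinese tag prepended to tts_text.

    `model` is expected to be a CosyVoice instance; this wrapper is a
    hypothetical helper, not part of the library.
    """
    if not tts_text.startswith(ZH_TAG):
        tts_text = ZH_TAG + tts_text
    return model.inference_zero_shot(tts_text, prompt_text, prompt_speech_16k)


# Usage with the objects from the question (requires CosyVoice to be installed):
# output = zero_shot_in_mandarin(cosyvoice, '收到好友从远方寄来的生日礼物,...',
#                                '这是一段参考音频,请录制我的声音', prompt_speech_16k)
# torchaudio.save('zero_shot.wav', output['tts_speech'], 22050)
```

The reference transcript and audio are passed through unchanged; only the text to synthesize receives the tag.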