FunAudioLLM / CosyVoice

A multilingual large voice generation model with full-stack support for inference, training, and deployment.
https://funaudiollm.github.io/
Apache License 2.0
2.79k stars, 262 forks

Running the Docker version: can one model support all 4 modes? #72

Closed: lonngxiang closed this issue 2 weeks ago

lonngxiang commented 2 weeks ago

The run command below only loads the CosyVoice-300M base model. Can the client then use each of the four modes, sft|zero_shot|cross_lingual|instruct? The earlier documentation says that CosyVoice-300M-sft is for TTS, CosyVoice-300M does voice cloning directly, and CosyVoice-300M-instruct adds control over speaking style.

docker run -d --runtime=nvidia -p 50000:50000 cosyvoice:v1.0 /bin/bash -c "cd /opt/CosyVoice/CosyVoice/runtime/python && python3 server.py --port 50000 --max_conc 4 --model_dir iic/CosyVoice-300M && sleep infinity"

python3 client.py --port 50000 --mode <sft|zero_shot|cross_lingual|instruct>

aluminumbox commented 2 weeks ago

No, the mode must be compatible with the loaded model. For sft, use CosyVoice-300M-sft; for zero_shot and cross_lingual, use CosyVoice-300M; for instruct, use CosyVoice-300M-instruct. The command is just an example. Please see the README for details.
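To make the pairing in the reply concrete, here is a minimal Python sketch that maps each client mode to the model it needs and builds a matching server command. It is not part of the CosyVoice code base: the helper names (`model_for_mode`, `is_compatible`, `server_command`) are hypothetical, and the assumption that every model lives under the `iic/` prefix is extrapolated from the single `--model_dir iic/CosyVoice-300M` value in the question. Check the README for the exact model IDs.

```python
# Mode-to-model pairing as stated in the maintainer's reply.
MODE_TO_MODEL = {
    "sft": "CosyVoice-300M-sft",
    "zero_shot": "CosyVoice-300M",
    "cross_lingual": "CosyVoice-300M",
    "instruct": "CosyVoice-300M-instruct",
}


def model_for_mode(mode: str) -> str:
    """Return the model name that the given client mode requires."""
    try:
        return MODE_TO_MODEL[mode]
    except KeyError:
        valid = "|".join(MODE_TO_MODEL)
        raise ValueError(f"unknown mode {mode!r}, expected one of {valid}") from None


def is_compatible(mode: str, loaded_model: str) -> bool:
    """Check whether a client mode matches the model the server has loaded.

    The comparison ignores any "org/" prefix and letter case.
    """
    name = loaded_model.rsplit("/", 1)[-1]
    return model_for_mode(mode).lower() == name.lower()


def server_command(mode: str, port: int = 50000, max_conc: int = 4) -> str:
    """Build the server.py invocation for a mode.

    Flags are copied from the command in the question; the "iic/" prefix
    is an assumption for the sft and instruct models.
    """
    model = model_for_mode(mode)
    return (
        f"python3 server.py --port {port} --max_conc {max_conc} "
        f"--model_dir iic/{model}"
    )


if __name__ == "__main__":
    for mode in MODE_TO_MODEL:
        print(f"{mode:14s} {server_command(mode)}")
```

With the server from the question (loaded with `iic/CosyVoice-300M`), `is_compatible` is true only for `zero_shot` and `cross_lingual`; serving the other two modes means starting the server with a different `--model_dir`.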