FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
https://funaudiollm.github.io/
Apache License 2.0
4.73k stars 476 forks

Will the recipe for training the base model's flow-matching from scratch be open-sourced? #292

Open vincentspeech opened 4 weeks ago

vincentspeech commented 4 weeks ago

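Thanks for open-sourcing this excellent project. I noticed that for the base model's flow-matching, only the SFT training config and recipe seem to have been released. Will the recipe for training from scratch on 17wh (170k hours) be open-sourced?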

aluminumbox commented 4 weeks ago

The training config for 17wh is cosyvoice.yaml; you can train from scratch if you have enough data.

JohnHerry commented 3 weeks ago

How should the flow config be adjusted when training with 16K samples? Also, is the x-vector used when training the flow an average over the speaker's samples, or a per-utterance one extracted from the target mel?
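
To make the two options in the question concrete, here is a minimal sketch of a speaker-level (averaged) embedding versus an utterance-level (instance) embedding. It does not reflect what CosyVoice actually does; the function names, the toy 3-dimensional vectors, and the L2 normalization step are all assumptions for illustration only.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def speaker_level_embedding(utt_embeddings):
    """Option 1 (hypothetical): average every utterance embedding of a speaker."""
    dim = len(utt_embeddings[0])
    count = len(utt_embeddings)
    mean = [sum(e[i] for e in utt_embeddings) / count for i in range(dim)]
    return l2_normalize(mean)


def utterance_level_embedding(utt_embeddings, index):
    """Option 2 (hypothetical): use only the embedding of the target utterance."""
    return l2_normalize(utt_embeddings[index])


# Toy data: two utterances from the same speaker (assumed values).
spk_utts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

spk_emb = speaker_level_embedding(spk_utts)       # shared by all utterances
utt_emb = utterance_level_embedding(spk_utts, 0)  # specific to utterance 0
```

With the averaged variant, every utterance of a speaker is conditioned on the same vector; with the instance variant, the conditioning vector changes per training sample.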