Will the recipe for training the base model's flow-matching from scratch be open-sourced?

FunAudioLLM / CosyVoice

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

https://funaudiollm.github.io/

Apache License 2.0

4.73k stars, 476 forks

Will the recipe for training the base model's flow-matching from scratch be open-sourced? #292

Open vincentspeech opened 4 weeks ago

vincentspeech commented 4 weeks ago

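Thanks for open-sourcing this excellent project. I noticed that for the base model's flow-matching, only the SFT training config and recipe seem to have been released. Will the recipe for training from scratch on 17wh (170,000 hours) be open-sourced?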

aluminumbox commented 4 weeks ago

The training config for 17wh is cosyvoice.yaml. You can train from scratch if you have enough data.

JohnHerry commented 3 weeks ago

How should the flow config be adjusted when training with 16K samples? Also, is the x-vector used when training the flow an average over the speaker's samples, or is it an instance-level one extracted from the target mel?
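
To make the two options in the x-vector question concrete, here is a minimal sketch of the difference between an utterance-level embedding and a speaker-averaged one. It is not taken from the CosyVoice code and does not say which option the flow training uses. The function name `speaker_average`, the toy 2-dimensional vectors, and the `utt2spk` mapping are all hypothetical and only there for illustration.

```python
from collections import defaultdict


def speaker_average(utt_embeddings, utt2spk):
    """Average utterance-level embeddings per speaker (hypothetical helper)."""
    grouped = defaultdict(list)
    for utt, emb in utt_embeddings.items():
        grouped[utt2spk[utt]].append(emb)
    # Mean over utterances, computed dimension by dimension.
    return {
        spk: [sum(dim) / len(embs) for dim in zip(*embs)]
        for spk, embs in grouped.items()
    }


# Toy utterance-level embeddings (real x-vectors have far more dimensions).
utt_embeddings = {
    "utt1": [1.0, 0.0],
    "utt2": [0.0, 1.0],
    "utt3": [2.0, 2.0],
}
utt2spk = {"utt1": "spkA", "utt2": "spkA", "utt3": "spkB"}

# Option 1: instance-level, one embedding per target utterance.
instance_level = utt_embeddings["utt1"]

# Option 2: speaker-level, the mean over all utterances of that speaker.
spk_embeddings = speaker_average(utt_embeddings, utt2spk)
speaker_level = spk_embeddings[utt2spk["utt1"]]
```

With the instance-level option the conditioning vector changes for every utterance. With the speaker-level option all utterances of `spkA` share one vector.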