FunAudioLLM / CosyVoice

Multilingual large voice generation model with full-stack support for inference, training, and deployment.
https://funaudiollm.github.io/
Apache License 2.0
6.31k stars, 673 forks

training --model hift from scratch #226

Open taalua opened 3 months ago

taalua commented 3 months ago

Hi,

Thank you for the great work. I am trying to run training using conf/cosyvoice.fromscratch.yaml with --model hift, in order to train the generator with a different number of mel filters, but I get an error at: info_dict['loss_dict'] = model(batch, device)

I am wondering whether it is possible to fine-tune hift or train it from scratch. Is there any way to do this?

Thanks

aluminumbox commented 3 months ago

You can use AcademiCodec to train your hift model. It is non-trivial to add GAN training to our repo, so it is not supported yet.

taalua commented 3 months ago

Hi, could you give more information on how f0 is used in training? In the HiFTGenerator training, did you take the MSE loss between pred_f0 from f0_predictor and the f0 from pyworld.harvest or dio? Thanks
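
For reference, the loss described in this question could be sketched roughly as below. This only illustrates what is being asked; the thread does not confirm whether CosyVoice's HiFT training used it. The helper names `extract_f0` and `f0_mse`, the 10 ms frame period, and the trimming of both contours to a common length are assumptions made for this sketch. It uses NumPy for clarity, with `pyworld` imported only inside the extraction helper.

```python
import numpy as np


def extract_f0(wav, sample_rate, frame_period_ms=10.0, method="harvest"):
    """Reference F0 contour in Hz (0.0 on unvoiced frames) via pyworld.

    Hypothetical helper: the frame period is an assumption and should match
    the hop size used by the F0 predictor.
    """
    import pyworld  # imported lazily so f0_mse below works without it

    x = np.ascontiguousarray(wav, dtype=np.float64)  # pyworld expects float64
    if method == "harvest":
        f0, _ = pyworld.harvest(x, sample_rate, frame_period=frame_period_ms)
    else:
        f0, t = pyworld.dio(x, sample_rate, frame_period=frame_period_ms)
        f0 = pyworld.stonemask(x, f0, t, sample_rate)  # refine the dio estimate
    return f0


def f0_mse(pred_f0, ref_f0):
    """Mean squared error between two F0 contours, trimmed to a common length."""
    pred = np.asarray(pred_f0, dtype=np.float64)
    ref = np.asarray(ref_f0, dtype=np.float64)
    n = min(len(pred), len(ref))  # frame counts can differ by a frame or two
    return float(np.mean((pred[:n] - ref[:n]) ** 2))
```

In a PyTorch training loop the same quantity would need a differentiable op such as `torch.nn.functional.mse_loss` applied to the predictor output, with the pyworld contour serving as a fixed target.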

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 30 days with no activity.