Hi,
I have about 85 hours of Chinese speech audio at 44,100 Hz, which I am using to fine-tune the en-us/lessac/medium.ckpt checkpoint, but the results are not good.
My loss_gen_all looks very high, while loss_disc_all looks normal.
Questions:
Sample Rate Conversion: Is it advisable to convert the sample rate from 44,100 Hz to 22,050 Hz before fine-tuning? Could this conversion be contributing to the high loss_gen_all?
Language Adaptation: Since I am fine-tuning an English model with Chinese data, are there specific configurations or adjustments you recommend to improve performance?
Model Compatibility: Are there any known issues or limitations when fine-tuning the en-us/lessac/medium.ckpt model with a non-English dataset?
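For reference, the conversion I am asking about would look roughly like the sketch below. It is only an illustration under assumptions: the helper names (resample_factors, wav_sample_rate, downsample) are hypothetical, the 22,050 Hz target is taken from my question above rather than from the checkpoint's config, and the actual resampling step assumes SciPy is installed.

```python
import math
import wave


def resample_factors(src_rate: int, dst_rate: int) -> tuple[int, int]:
    """Return the smallest integer (up, down) pair with src_rate * up / down == dst_rate."""
    g = math.gcd(src_rate, dst_rate)
    return dst_rate // g, src_rate // g


def wav_sample_rate(path: str) -> int:
    """Read the sample rate from a WAV header without loading the audio."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def downsample(samples, src_rate: int, dst_rate: int):
    """Polyphase resampling with a built-in anti-aliasing filter.

    Assumption: SciPy is available; it is imported lazily so the rest of
    this sketch runs without it.
    """
    from scipy.signal import resample_poly

    up, down = resample_factors(src_rate, dst_rate)
    return resample_poly(samples, up, down)


# 44,100 Hz -> 22,050 Hz is an exact 2:1 decimation.
up, down = resample_factors(44100, 22050)
print(up, down)
```

Running wav_sample_rate over the dataset first would confirm whether every file really is 44,100 Hz before anything is converted.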
Any guidance or suggestions you could provide would be greatly appreciated.
Thank you for your time and assistance.