Open hdnh2006 opened 1 year ago
I found the cloned voice in your attached file very nice and clear. In which respect do you find the synthesized voice poor and robotic?
This is probably because your recorded audio is in stereo while the cloned voice is in mono. If the input prompt contains more than one channel, the script builds the prompt from only the first channel, which is probably why you found that the voice does not resemble your own.
I wonder if VALLEX still needs more training steps. It does not always reproduce the reference voice realistically.
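To make the stereo/mono point concrete, here is a minimal sketch of the difference between keeping only the first channel and averaging all channels into mono. It is not the project's actual code: the function names `first_channel` and `downmix_to_mono` are hypothetical, and it assumes interleaved 16-bit PCM samples held in a standard-library `array`.

```python
from array import array


def first_channel(interleaved, n_channels):
    """Keep only channel 0 of interleaved PCM samples (hypothetical helper)."""
    return array(interleaved.typecode, interleaved[0::n_channels])


def downmix_to_mono(interleaved, n_channels):
    """Average all channels of each frame into one mono sample (hypothetical helper)."""
    n_frames = len(interleaved) // n_channels
    mono = array(interleaved.typecode)
    for i in range(n_frames):
        frame = interleaved[i * n_channels:(i + 1) * n_channels]
        # Integer average; floor division keeps the result in the sample range.
        mono.append(sum(frame) // n_channels)
    return mono


# Two stereo frames: (L=100, R=300) and (L=-200, R=0)
stereo = array("h", [100, 300, -200, 0])
left_only = first_channel(stereo, 2)    # drops the right channel entirely
averaged = downmix_to_mono(stereo, 2)   # mixes both channels
```

If the two channels of a recording differ noticeably, converting it to mono before using it as a prompt may be worth trying, though whether that changes the cloning quality is not confirmed here.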
Hello, thanks for the amazing job you have done.
I have tried your model with my own voice and I am getting poor results. I have attached both audio files (please don't clone my voice) so you can get an idea of what is happening.
These are the commands I am running:
Is there any information on how to improve the results? henry.zip