RVC-Project / Retrieval-based-Voice-Conversion-WebUI

Easily train a good VC model with voice data <= 10 mins!
MIT License

GANs #2012

Open litsa-the-dancer opened 4 weeks ago

litsa-the-dancer commented 4 weeks ago

I'll keep this short. I think it would be a good idea to replace the HiFi-GAN vocoder with BigVGAN for better fidelity and mel representation, or at the very least with Vocos. It would make sense to include this change in RVC v3. @RVC-Boss

yxlllc commented 3 weeks ago

We have tried it, but BigVGAN training is a bit difficult and the improvement is not obvious.

litsa-the-dancer commented 3 weeks ago

> We have tried it, but BigVGAN training is a bit difficult and the improvement is not obvious.

I'm pretty sure it can in fact yield better results and help with sibilance and breathing. Not to mention, it will probably help the GAN adapt to the data more efficiently, with fewer hiccups. I'm not saying it will fix everything just like that, but it's still worth noting that the better our data representation is, the better the model should, in theory, perform on that data. The quality itself may not get a major boost, but it can still help. Also, making it available for public use is the best way to really test whether the feature is actually viable.
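
To make the "out for public use" idea concrete, here is a rough sketch of how an alternative vocoder could ship as an opt-in setting while HiFi-GAN stays the default. Everything in it is hypothetical and is not RVC's actual code: the names `register_vocoder`, `SynthConfig`, and `get_vocoder` are made up for illustration, and the two "vocoders" are stubs that return placeholder numbers instead of running a real model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical simplification: a "vocoder" is any callable that maps
# mel frames to waveform samples. Real vocoders are neural networks
# operating on tensors, not lists of floats.
Mel = List[List[float]]
Vocoder = Callable[[Mel], List[float]]

_VOCODERS: Dict[str, Vocoder] = {}


def register_vocoder(name: str) -> Callable[[Vocoder], Vocoder]:
    """Register a vocoder under a name so it can be selected from config."""
    def decorator(fn: Vocoder) -> Vocoder:
        _VOCODERS[name] = fn
        return fn
    return decorator


@register_vocoder("hifigan")
def hifigan_stub(mel: Mel) -> List[float]:
    # Placeholder only: emits the mean of each frame as one "sample".
    return [sum(frame) / len(frame) for frame in mel]


@register_vocoder("bigvgan")
def bigvgan_stub(mel: Mel) -> List[float]:
    # Placeholder only: emits the max of each frame as one "sample".
    return [max(frame) for frame in mel]


@dataclass
class SynthConfig:
    # Assumption: the existing default stays untouched, so anyone who
    # does not opt in gets exactly the current behaviour.
    vocoder: str = "hifigan"


def get_vocoder(config: SynthConfig) -> Vocoder:
    """Look up the configured vocoder, failing loudly on unknown names."""
    try:
        return _VOCODERS[config.vocoder]
    except KeyError:
        known = ", ".join(sorted(_VOCODERS))
        raise ValueError(
            f"unknown vocoder {config.vocoder!r}; available: {known}"
        ) from None


mel = [[0.0, 1.0, 2.0], [3.0, 3.0, 3.0]]
default_audio = get_vocoder(SynthConfig())(mel)
experimental_audio = get_vocoder(SynthConfig(vocoder="bigvgan"))(mel)
```

With something along these lines, people could render the same input through both back ends and compare sibilance and breathing themselves, without the default pipeline changing for anyone else.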