SociallyIneptWeeb / AICoverGen

A WebUI to create song covers with any RVC v2 trained AI voice from YouTube videos or audio files.
MIT License
1.1k stars 259 forks

[Questions] Can we add the ability to adjust vocal cleanup aggression? #36

Closed MarshalSan closed 1 year ago

MarshalSan commented 1 year ago

With certain songs, or songs from certain artists, the AI seems to clip when the singer starts screaming at high pitches or when vocal effects are involved.

Could we add the ability to select presets for these cases? Or is there a trick for handling them better?

The song that brought this to my attention was Mafumafu's "I want to be a girl". At 1:20 into the song, the AI just struggles.

SociallyIneptWeeb commented 1 year ago

For some voice models, the dataset used to train them does not include such high pitches. That means the model won't know how to sound when it is given such a high pitch as input during inference. It's a limitation of some voice models and their datasets.
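
To make the point about pitch range concrete, here is a minimal sketch of how one might check whether a song's highest note falls outside the range a model was trained on, and how many whole octaves it would have to be lowered to fit. The function names and the example frequencies are hypothetical and chosen only for illustration. Nothing here is part of AICoverGen or RVC, and the thread itself does not propose transposition as a fix.

```python
import math


def semitones_between(f_from_hz: float, f_to_hz: float) -> float:
    """Signed distance in semitones from f_from_hz to f_to_hz (12 per octave)."""
    return 12.0 * math.log2(f_to_hz / f_from_hz)


def octave_shift_to_fit(song_peak_hz: float, model_max_hz: float) -> int:
    """Smallest whole-octave downward shift (a value <= 0) that places the
    song's highest note at or below the model's highest trained note."""
    excess = semitones_between(model_max_hz, song_peak_hz)
    if excess <= 0:
        return 0  # the song already sits inside the assumed range
    return -math.ceil(excess / 12.0)


# Hypothetical numbers: the song peaks near A5 (880 Hz) while the model's
# training data is assumed to top out near E5 (about 659.26 Hz).
shift = octave_shift_to_fit(880.0, 659.26)
print(shift)
```

In practice the highest note in a model's training data is rarely documented, so the second argument would have to be estimated, for example by listening to where the model's output starts to break down.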

MarshalSan commented 1 year ago

I've tried multiple different models, some of which I've also used for even higher-pitched songs. I'll keep trying with other models, but I still believe it has something to do with the vocal separation.

SociallyIneptWeeb commented 1 year ago

In that case, you can try downloading Ultimate Vocal Remover and using its best vocal separation model to obtain the cleanest vocals possible. After that, you can just pass that audio file into AICoverGen.

tanlam1703 commented 1 year ago

Can you add a few more options for changing the Pitch Extraction Method, for example mangio-crepe...

SociallyIneptWeeb commented 1 year ago

> Can you add a few more options that change the Pitch Extraction Method, for example mangio-crepe...

I have added a new commit that lets you choose between rmvpe and mangio-crepe, so you can try pulling the new changes and testing it out!

tanlam1703 commented 1 year ago

Thank you so much

accessyapps commented 8 months ago

I am reopening the issue since I have a different problem with vocal separation. It happens on this song, for example: https://www.youtube.com/watch?v=QZ5GzGYgWJw For some reason, on a lot of songs, the model does not remove the singing but only makes it quieter, and sometimes this doesn't work very well. On this song, for example, it sounds like the original singer and the new singer are singing together. It happens in the verse and the chorus. Is there a way to fix this?
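
One rough way to put a number on the "original singer is still audible" problem is to estimate how much of the separated vocal stem is still present in the instrumental stem. The sketch below is only an illustration under stated assumptions: both stems are time-aligned lists of samples at the same sample rate, the vocal stem is treated as a good enough reference, and `vocal_bleed_gain` is a hypothetical helper, not part of AICoverGen or Ultimate Vocal Remover. It fits a single least-squares gain, so it cannot capture bleed that varies over time or with frequency.

```python
import math


def vocal_bleed_gain(instrumental, vocals):
    """Least-squares gain g that minimises sum((instrumental - g * vocals)^2).

    A value near 0 suggests little vocal residue in the instrumental stem;
    a larger magnitude suggests more of the original vocal is left in.
    """
    energy = sum(v * v for v in vocals)
    if energy == 0.0:
        return 0.0  # silent reference: nothing to measure against
    return sum(i * v for i, v in zip(instrumental, vocals)) / energy


# Synthetic demo (all values are made up): a 440 Hz "vocal", a 110 Hz
# "backing track", and an instrumental stem with 30% of the vocal left in.
sample_rate = 8000
n = sample_rate  # one second of audio
vocals = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(n)]
backing = [0.5 * math.sin(2 * math.pi * 110 * t / sample_rate) for t in range(n)]
leaky = [b + 0.3 * v for b, v in zip(backing, vocals)]

gain = vocal_bleed_gain(leaky, vocals)
print(round(gain, 3))
```

Comparing this figure across separation models on the same song could help show whether a different model actually removes more of the original vocal, though listening to the stems remains the real test.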