JackismyShephard / ultimate-rvc

An app for creating song covers with any RVC v2 trained AI voice from YouTube videos or audio files.
MIT License
16 stars, 5 forks

[Issue with new version] Not able to create instrumental version, old version was working fine #66

Closed: nitinmukesh closed this issue 1 month ago

nitinmukesh commented 1 month ago

Hello,

I did a quick check using "Multi-step generation".

Step 1: vocals/instrumentals separation

The vocals are fine, but the instrumentals include vocals too. I can confirm it was working fine yesterday, before the update.

Input song (Copyright free) https://www.youtube.com/watch?v=7UmBGMktAoI

Vocal and Instrumental https://drive.google.com/file/d/13kscfz84qn9o4Jkaf0ByyP4x8Bttr-ha/view?usp=sharing

JackismyShephard commented 1 month ago

Is the problem that the instrumentals include traces of the vocals? Based on the tracks you linked, that seems to be the problem you are referring to.

This is not actually an implementation bug. The MDXNet model used for vocal separation is not perfect. The new version of ultimate-rvc runs the model with slightly different settings than the old version did. It is possible that the old settings worked better for this specific song.

In my experience so far, the new version of ultimate-rvc seems to be better at isolating vocals, but the resulting instrumentals are not of quite as good a quality as before. Notably, the audio level of the instrumentals seems to be lower than before.

That being said, having traces of vocals in the instrumentals is most often not a big problem for this app, as you are most likely going to be overlaying the converted vocals on top of the extracted instrumentals anyways later on.
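
To illustrate the overlay step mentioned above, here is a minimal sketch. It is not the app's actual code. It assumes both stems are already decoded to mono float samples in the range [-1, 1] at the same sample rate; the function name `overlay` and the gain parameters are hypothetical.

```python
def overlay(instrumental, vocals, inst_gain=1.0, vocal_gain=1.0):
    """Mix two mono tracks sample by sample (hypothetical helper)."""
    length = max(len(instrumental), len(vocals))
    mixed = []
    for i in range(length):
        # Treat the shorter track as silence once it runs out.
        a = instrumental[i] if i < len(instrumental) else 0.0
        b = vocals[i] if i < len(vocals) else 0.0
        sample = inst_gain * a + vocal_gain * b
        # Hard-clip to the valid range so the sum cannot overflow.
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed
```

Since the converted vocals are summed at full level on top of the instrumental, a faint residue of the original vocal in the instrumental stem tends to be much less audible in the final mix than when the instrumental is played alone.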

Finally, I should add that I plan to include settings for each extraction step, including the option of using different models.

JackismyShephard commented 1 month ago

@nitinmukesh Also, I would be curious to know your general thoughts on the quality of the audio output in the new version of ultimate-rvc. I debated a lot whether or not to do this upgrade, as some things are indeed worse than before. If the output audio quality is sufficiently bad I might revert to the old version, but it would have to be a very serious issue, as the new version of ultimate-rvc has many other benefits.

nitinmukesh commented 1 month ago

I guess the issue of vocals in the instrumental is because MDX models were used earlier and worked fine, whereas now only RVC models are used. Please correct me if I am wrong. I would request that you make it work like before, as having an instrumental-only version without vocals is an important feature. I loved that feature.

As for audio quality, I have not done much comparison, but it was slightly better earlier. I checked the same song with the earlier version and with the new version. I can't say the difference is noticeable.

JackismyShephard commented 1 month ago

@nitinmukesh You are wrong about only RVC being used now. MDXNet models are still being used. Their default settings are just a bit different. The problem with vocal residue in extracted instrumentals was also present before the update. It may be that the problem is bigger now. Did you test with many different songs? As I mentioned before, the new models with their current settings seem to produce vocal stems of better quality compared to before but also worse instrumental stems. I have a feeling this is a sort of tradeoff that is difficult to avoid.

I really appreciate all your testing. If you continue to experience problems with instrumentals, please let me know. I am trying to implement the best pipeline, but it is hard to know what works best in all cases.

nitinmukesh commented 1 month ago

No problem. I am ready to test any new features.

I still don't find any MDXNet models in the new version.

[Two images attached]

JackismyShephard commented 1 month ago

The MDXNet models are stored in models/audio_separator/. They will be downloaded on the fly as needed 😊
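
For anyone who wants to check which models have already been fetched, here is a small sketch. It only lists whatever files exist under the directory mentioned above; the helper name `list_downloaded_models` is hypothetical, and the path is assumed to be relative to the app's root folder.

```python
from pathlib import Path


def list_downloaded_models(model_dir="models/audio_separator"):
    """Return the file names currently present in the model folder.

    Hypothetical helper: an empty list simply means nothing has been
    downloaded yet (or the folder does not exist at this location).
    """
    root = Path(model_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_file())


if __name__ == "__main__":
    for name in list_downloaded_models():
        print(name)
```

An empty result before the first separation run would be consistent with the models being downloaded only when they are needed.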

nitinmukesh commented 1 month ago

> the mdxnet models are stored in models/audio_separator/ They will be downloaded on the fly as needed 😊

Got it, thanks.

Sorry for asking so many questions, and I hope this is the last one. Is there any way to get the instrumental version like before, i.e. without vocals?

P.S. I'm not a developer, just a user so some technical stuff may not make sense to me.

JackismyShephard commented 1 month ago

No worries, your questions are appreciated.

Currently you can't do anything to get better instrumental stems. But, if the quality of the instrumental stems you generate in general is not good enough (bleed through of vocals) and that is something important to you (and potentially other users), then I will look into improving the performance. First of all, I will try to fiddle around with some settings. If that does not improve things, then I might revert the app to the old version.

nitinmukesh commented 1 month ago

Appreciate you looking into this. It's not the quality; the quality is good. It's just that the instrumental version should not have vocals.

Thank you.

nitinmukesh commented 1 month ago

@JackismyShephard

I guess I was wrong about the instrumental version not having vocals. I just tried the original AICoverGen and it has vocals too.

Sorry for wasting your time.