Anjok07 / ultimatevocalremovergui

GUI for a Vocal Remover that uses Deep Neural Networks.
MIT License

The answer to the most asked question: What is the model which provides the best results? [Read this, very important info inside!] #344

Open thenormal opened 1 year ago

thenormal commented 1 year ago

Hello everyone.

I would like to address a question I've repeatedly seen asked both in this forum and in other ones. Given the number of available modules which have now been integrated into UVR, a lot of people are understandably confused as to which one may provide the best results. The question I see a lot, therefore, is the following:

"What is the best module which provides the best results? What setting should I use with it?" and its variations.

Before I give you the answer, let me introduce you to the following website: mvsep.com -- This is a website where you can upload a song of your choice and have it processed with any of the Stem Separation AI modules currently available. I encourage you to check it out, it's an amazing tool. Keep in mind that due to high traffic, you will likely have to wait in a queue for your songs to be processed.

The developers over at Mvsep launched a very interesting initiative months ago, called "Quality Checker". As I mentioned before, there are plenty of modules available, and Mvsep came up with a method to establish which of them offers the best results. This is done by downloading a standard database, having a given module process it, and then uploading the results to their site. Check it out here: https://mvsep.com/quality_checker/

The results and corresponding metrics are published on their website. You can check them here: https://mvsep.com/quality_checker/leaderboard.php -- This is called the "Leaderboard".

So, back to the question: Which module provides the best results? Well, you guessed it... The answer is provided by the Leaderboard itself. As you can see, there is no single module which offers the best results, but rather it is recommended to use a combination of modules. UVR has a function integrated within it called "Ensemble", which does exactly that: It processes a given song by utilizing one or more modules of your choice.
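To make the idea of an ensemble concrete, here is a minimal Python sketch. It assumes the simplest possible combination rule, a per-sample average of the same stem from each model. The function name and the toy numbers are made up for illustration, and UVR's own ensemble code may well work differently (it offers several algorithms).

```python
def average_ensemble(stems):
    """Average the same stem from several models, sample by sample.

    `stems` holds one mono waveform (a list of floats) per model.
    Outputs may differ slightly in length, so they are trimmed to the
    shortest one before averaging.
    """
    if not stems:
        raise ValueError("need at least one model output")
    shortest = min(len(s) for s in stems)
    return [sum(s[i] for s in stems) / len(stems) for i in range(shortest)]


# Toy stand-ins for the vocal stem produced by three different models.
model_a = [0.25, 0.5, -0.5, 1.0]
model_b = [0.5, 0.5, -0.25, 0.5]
model_c = [0.75, 0.5, -0.75, 0.0, 0.0]  # one extra sample, gets trimmed

vocals_avg = average_ensemble([model_a, model_b, model_c])
print(vocals_avg)  # [0.5, 0.5, -0.5, 0.5]
```

The point of averaging is that errors which differ from model to model tend to partly cancel, while the content all models agree on is kept.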

Now, back to the Leaderboard. At the time I'm writing this, the following combination achieves the highest score:

MDX-Net: kim vocal model fine tuned (old) + UVR-MDX-NET_Main_427 + Demucs: v4 | htdemucs_ft - Ensemble Algorithm: Avg/Avg - Shifts: 10 - Overlap: 0.25

You'll notice they have used three different modules here (Kim vocal, MDX Net Main 427, and the latest fine-tuned demucs v4). If you hover your mouse over the "?" corresponding to the combo on that page, it also shows you the UVR settings which were used to create it.
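A note on the "Shifts: 10" setting in that combo: in Demucs, shifts means the model is run on several randomly time-shifted copies of the input and the predictions are averaged (the "shift trick"), so processing takes roughly that many times longer. Below is a minimal Python sketch of the idea. The `separate` argument is a hypothetical stand-in for a real model, the shift range of 4 samples is only there to keep the toy readable, and this is not Demucs' or UVR's actual code.

```python
import random


def apply_with_shifts(separate, mix, shifts=10, max_shift=4, seed=0):
    """Run `separate` on randomly delayed copies of `mix` and average.

    `separate` is a hypothetical function that maps a waveform (list of
    floats) to a stem of the same length.
    """
    if shifts < 1:
        raise ValueError("shifts must be at least 1")
    rng = random.Random(seed)
    length = len(mix)
    # Leading silence gives room to delay the signal by up to max_shift.
    padded = [0.0] * max_shift + list(mix)
    total = [0.0] * length
    for _ in range(shifts):
        offset = rng.randint(0, max_shift)
        shifted = padded[offset:]           # mix delayed by max_shift - offset
        out = separate(shifted)
        aligned = out[max_shift - offset:]  # undo the delay on the output
        for i in range(length):
            total[i] += aligned[i]
    return [t / shifts for t in total]


def halve(waveform):
    """Toy 'model' that just scales the input."""
    return [0.5 * x for x in waveform]


result = apply_with_shifts(halve, [1.0, 2.0, 3.0, 4.0], shifts=10)
print(result)  # [0.5, 1.0, 1.5, 2.0]
```

With a real model the individual predictions differ slightly from shift to shift, and averaging them is what smooths the result.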

So, there you have it. You should check the Leaderboard page often to see which combo is getting the highest score, and then simply replicate it with UVR. Keep in mind that modules are constantly modified and/or trained, so it is likely the Leaderboard will change quite often.

Furthermore, you can contribute your own methodology (combo) and results: visit the Quality Checker page I mentioned above, download the database, process it with your own chosen modules, then upload the final results. I strongly encourage everyone to do so: the more tests, the more results.

As a final note, I want to thank @Anjok07 for his amazing job on UVR, which has now turned into a fantastic tool, and the best one at the world's disposal for creating stems. Thanks a lot for all of your hard work!

Anjok07 commented 1 year ago

Thank you so much for posting this!

Lukasz858585 commented 1 year ago

What is this model called "kim vocal model fine tuned" ? I don't see such a name in MDX-Net models

Anjok07 commented 1 year ago

What is this model called "kim vocal model fine tuned" ? I don't see such a name in MDX-Net models

It's available to download via the Download Center

thenormal commented 1 year ago

What is this model called "kim vocal model fine tuned" ? I don't see such a name in MDX-Net models

It's available to download via the Download Center

I'm guessing on UVR it's called Kim_Vocal_1?

czhou commented 1 year ago

didn't find any model like Kim_xxx

Mr-Negative commented 1 year ago

The info "?" icon doesn't work on the leaderboard in any of my browsers, or on my phone.

thenormal commented 1 year ago

The info "?" icon doesn't work on the leaderboard in any of my browsers, or on my phone.

Rather than clicking on the ? icon, hover your mouse pointer over it and keep it there for a few seconds. A pop-up message will appear with the UVR settings the combo was created with.

Mr-Negative commented 1 year ago

OK, this is weird UX-wise (especially on phones), but thank you very much for the answer, it works. Amazing job!

czhou commented 1 year ago

OK, I found it in the Download Center; however, it's not listed under 'Manual Download'.

ybhka2022 commented 1 year ago

What is this model called "kim vocal model fine tuned" ? I don't see such a name in MDX-Net models

It's available to download via the Download Center

Failed to stop during model download

ybhka2022 commented 1 year ago

Failed to stop during model download

bearwal commented 1 year ago

what is "shift: 10" ?

xdanielc commented 1 year ago

Are these tables aimed at getting a clean instrumental or a clean vocal?

I ask because I want a clean vocal. The instrumental that corresponds to a clean vocal might still have some voice left in it, whereas a very clean instrumental might correspond to an acapella that is not that clean.

thenormal commented 1 year ago

Are these tables aimed at getting a clean instrumental or a clean vocal?

I ask because I want a clean vocal. The instrumental that corresponds to a clean vocal might still have some voice left in it, whereas a very clean instrumental might correspond to an acapella that is not that clean.

Sort that list by SDR Vocals and find the model(s) with the highest score to experiment with
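For anyone wondering what the SDR number means: it is a signal-to-distortion ratio in dB, comparing a separated stem against the true stem, and higher is better. Here is a minimal Python sketch, assuming the common definition 10 * log10(energy of reference / energy of error). The Leaderboard's exact formula may differ in details, and the sample values below are made up.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Signal-to-distortion ratio in dB between a true stem and an estimate.

    Both inputs are mono waveforms (lists of floats) of equal length.
    `eps` avoids division by zero for silent or perfect signals.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal + eps) / (error + eps))


true_vocals = [1.0, -1.0, 1.0, -1.0]
close_estimate = [0.9, -0.9, 0.9, -0.9]   # small error, about 20 dB
rough_estimate = [0.5, -0.5, 0.5, -0.5]   # larger error, about 6 dB

print(round(sdr_db(true_vocals, close_estimate), 2))
print(round(sdr_db(true_vocals, rough_estimate), 2))
```

So "SDR Vocals" scores how close the extracted vocal is to the real one, which is the column to sort by if a clean vocal is what you are after.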

ybhka2022 commented 1 year ago

Kim_vocal_1: this model extracts an accompaniment that still has vocal residue

Lordmau5 commented 1 year ago

Kim_vocal_1: this model extracts an accompaniment that still has vocal residue

It still produces great results - the idea here is that you combine multiple models, so that the other ones can clean up that vocal residue.


Just tried it on a Kurzgesagt video and it definitely produced a MUCH better result than a single model, thank you so much for this guide, @thenormal ♥️

@Anjok07 would it be possible to pin this issue, so it doesn't get buried in the future as more issues are opened? Just an idea 😁

ThunderMite42 commented 1 year ago

I keep seeing these numbers "406+427" (and occasionally 292 and 496) in the algorithm name, but I can't find anything corresponding to said numbers in the download center or anywhere else on the site except for these pages (example). What do they mean?

thenormal commented 1 year ago

I keep seeing these numbers "406+427" (and occasionally 292 and 496) in the algorithm name, but I can't find anything corresponding to said numbers in the download center or anywhere else on the site except for these pages (example). What do they mean?

On UVR you can download additional models. Those numbers mean they used two or more models (in this case, 406 and 427) using the Ensemble function to create the instrumental. Join the dedicated Discord server for all the info: https://discord.gg/zZ9CbEn5HE

ThunderMite42 commented 1 year ago

I keep seeing these numbers "406+427" (and occasionally 292 and 496) in the algorithm name, but I can't find anything corresponding to said numbers in the download center or anywhere else on the site except for these pages (example). What do they mean?

On UVR you can download additional models. Those numbers mean they used two or more models (in this case, 406 and 427) using the Ensemble function to create the instrumental. Join the dedicated Discord server for all the info: https://discord.gg/zZ9CbEn5HE

I'm in the download center right now and nowhere do I see anything with those numbers in them, nor do I see anywhere to type them in. The only thing there that lets me type anything is the download code window, and putting in the numbers just yields a "code incorrect" dialog.

EDIT: Never mind, those are VIP models. I found the code in the server (albeit after way too much searching, since it's not in the FAQ anywhere).

ybhka2022 commented 1 year ago

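Are these tables aimed at getting a clean instrumental or a clean vocal? I ask because I want a clean vocal. The instrumental that corresponds to a clean vocal might still have some voice left in it, whereas a very clean instrumental might correspond to an acapella that is not that clean.

Sort that list by SDR Vocals and find the model(s) with the highest score to experiment with

MDX-Net: Where can I download 292, 496, 406, 427? I can't find them in the Download Center.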

thenormal commented 1 year ago

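Are these tables aimed at getting a clean instrumental or a clean vocal? I ask because I want a clean vocal. The instrumental that corresponds to a clean vocal might still have some voice left in it, whereas a very clean instrumental might correspond to an acapella that is not that clean.

Sort that list by SDR Vocals and find the model(s) with the highest score to experiment with

MDX-Net: Where can I download 292, 496, 406, 427? I can't find them in the Download Center.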

You will likely need a VIP code. Join the Discord server and ask there: https://discord.gg/zZ9CbEn5HE

ybhka2022 commented 1 year ago

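Are these tables aimed at getting a clean instrumental or a clean vocal? I ask because I want a clean vocal. The instrumental that corresponds to a clean vocal might still have some voice left in it, whereas a very clean instrumental might correspond to an acapella that is not that clean.

Sort that list by SDR Vocals and find the model(s) with the highest score to experiment with

MDX-Net: Where can I download 292, 496, 406, 427? I can't find them in the Download Center.

You will likely need a VIP code. Join the Discord server and ask there: https://discord.gg/zZ9CbEn5HE

Thank you very much for your help. The other models can be downloaded, but I don't see the 496 model.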

thenormal commented 1 year ago

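Are these tables aimed at getting a clean instrumental or a clean vocal? I ask because I want a clean vocal. The instrumental that corresponds to a clean vocal might still have some voice left in it, whereas a very clean instrumental might correspond to an acapella that is not that clean.

Sort that list by SDR Vocals and find the model(s) with the highest score to experiment with

MDX-Net: Where can I download 292, 496, 406, 427? I can't find them in the Download Center.

You will likely need a VIP code. Join the Discord server and ask there: https://discord.gg/zZ9CbEn5HE

Thank you very much for your help. The other models can be downloaded, but I don't see the 496 model.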

You may have to download it separately because it is probably not publicly available. Like I said, join the Discord server and ask there. Someone will help.

ybhka2022 commented 1 year ago

Thank you so much for posting this!

Hello: When will the MDX23C model be added? (screenshot attached: 微信截图_20230717202347)

0xdevalias commented 1 year ago

When will the MDX23C model be added?

From Discord:

image

See also:

jeengbe commented 9 months ago

Can we pin this issue?

DampexUS commented 5 months ago

Can someone explain this to me better, maybe with a screenshot or a video of the settings? I'm new to this stuff and I can't understand anything, please.