nomadkaraoke / python-audio-separator

Easy-to-use stem (e.g. instrumental/vocals) separation from the CLI or as a Python package, using a variety of amazing pre-trained models (primarily from UVR)
MIT License
502 stars 83 forks

New Roformer Models #145

Closed by Bebra777228 2 weeks ago

Bebra777228 commented 2 weeks ago

Four days ago, 3 new models were added in this release: all_public_uvr_models.

Will they be added to audio-separator?

beveradb commented 2 weeks ago

They already are!

audio-separator -l

    "MDXC": {
        "MDX23C Model VIP: MDX23C-InstVoc HQ 2": {
            "MDX23C-8KFFT-InstVoc_HQ_2.ckpt": "model_2_stem_full_band_8k.yaml"
        },
        "MDX23C Model VIP: MDX23C_D1581": {
            "MDX23C_D1581.ckpt": "model_2_stem_061321.yaml"
        },
        "MDX23C Model: MDX23C-InstVoc HQ": {
            "MDX23C-8KFFT-InstVoc_HQ.ckpt": "model_2_stem_full_band_8k.yaml"
        },
        "Roformer Model: BS-Roformer-De-Reverb": {
            "deverb_bs_roformer_8_384dim_10depth.ckpt": "deverb_bs_roformer_8_384dim_10depth_config.yaml"
        },
        "Roformer Model: BS-Roformer-Viperx-1053": {
            "model_bs_roformer_ep_937_sdr_10.5309.ckpt": "model_bs_roformer_ep_937_sdr_10.5309.yaml"
        },
        "Roformer Model: BS-Roformer-Viperx-1296": {
            "model_bs_roformer_ep_368_sdr_12.9628.ckpt": "model_bs_roformer_ep_368_sdr_12.9628.yaml"
        },
        "Roformer Model: BS-Roformer-Viperx-1297": {
            "model_bs_roformer_ep_317_sdr_12.9755.ckpt": "model_bs_roformer_ep_317_sdr_12.9755.yaml"
        },
        "Roformer Model: MB-Roformer-Inst-v1 by Kim": {
            "melband_roformer_inst_v1.ckpt": "config_melbandroformer_inst.yaml"
        },
        "Roformer Model: MB-Roformer-InstVoc-Duality-v1 by Unwa": {
            "melband_roformer_instvoc_duality_v1.ckpt": "config_melbandroformer_instvoc_duality.yaml"
        },
        "Roformer Model: MB-Roformer-InstVoc-Duality-v2 by Unwa": {
            "melband_roformer_instvox_duality_v2.ckpt": "config_melbandroformer_instvoc_duality.yaml"
        },
        "Roformer Model: Mel-Roformer-Crowd-Aufr33-Viperx": {
            "mel_band_roformer_crowd_aufr33_viperx_sdr_8.7144.ckpt": "mel_band_roformer_crowd_aufr33_viperx_sdr_8.7144_config.yaml"
        },
        "Roformer Model: Mel-Roformer-Denoise-Aufr33": {
            "denoise_mel_band_roformer_aufr33_sdr_27.9959.ckpt": "denoise_mel_band_roformer_aufr33_sdr_27.9959_config.yaml"
        },
        "Roformer Model: Mel-Roformer-Denoise-Aufr33-Aggr": {
            "denoise_mel_band_roformer_aufr33_aggr_sdr_27.9768.ckpt": "denoise_mel_band_roformer_aufr33_aggr_sdr_27.9768_config.yaml"
        },
        "Roformer Model: Mel-Roformer-Karaoke-Aufr33-Viperx": {
            "mel_band_roformer_karaoke_aufr33_viperx_sdr_10.1956.ckpt": "mel_band_roformer_karaoke_aufr33_viperx_sdr_10.1956_config.yaml"
        },
        "Roformer Model: Mel-Roformer-Viperx-1143": {
            "model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt": "model_mel_band_roformer_ep_3005_sdr_11.4360.yaml"
        }
    },

I built it to pull in the UVR config dynamically, so anything added to UVR automatically becomes available in audio-separator too. No action required here 😄

Thanks for the heads up though. I haven't tested those models yet, so I'll give them a try later on :)
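
For anyone who wants to pick a checkpoint out of that listing from a script, here is a minimal sketch. It is not part of audio-separator itself: `find_models` is a hypothetical helper, and it assumes the nested shape shown in the excerpt above (architecture, then friendly name, then checkpoint file mapped to config file). The JSON is trimmed to two entries copied from the listing.

```python
import json

# Two entries copied from the listing above, trimmed for brevity.
LISTING_JSON = """
{
    "MDXC": {
        "Roformer Model: BS-Roformer-Viperx-1297": {
            "model_bs_roformer_ep_317_sdr_12.9755.ckpt": "model_bs_roformer_ep_317_sdr_12.9755.yaml"
        },
        "Roformer Model: MB-Roformer-InstVoc-Duality-v2 by Unwa": {
            "melband_roformer_instvox_duality_v2.ckpt": "config_melbandroformer_instvoc_duality.yaml"
        }
    }
}
"""


def find_models(listing, keyword):
    """Hypothetical helper: map friendly name -> (checkpoint, config).

    Matches names containing `keyword`, case-insensitively. Assumes every
    entry is a dict of {checkpoint_file: config_file}, as in the excerpt.
    """
    matches = {}
    for _arch, models in listing.items():
        for name, files in models.items():
            if keyword.lower() in name.lower():
                for checkpoint, config in files.items():
                    matches[name] = (checkpoint, config)
    return matches


listing = json.loads(LISTING_JSON)
duality = find_models(listing, "Duality")
for name, (checkpoint, config) in duality.items():
    print(f"{name}: {checkpoint} (config: {config})")
```

The checkpoint filename is the part to pass when selecting a model; the friendly name is only a label.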

Bebra777228 commented 2 weeks ago

Cool! I'll try to run them now :)

beveradb commented 2 weeks ago

If they're good or offer something unique compared to the models I already recommend in this post, please share your findings 🙏

https://github.com/nomadkaraoke/python-audio-separator/discussions/133

Bebra777228 commented 2 weeks ago

I'm not the best at writing test results, but here's what I found out for myself. :)

I compared the following models:

First off, I didn't hear much of a difference between MB-Roformer-InstVoc-Duality-v1 by Unwa and MB-Roformer-InstVoc-Duality-v2 by Unwa, so I'll talk about MB-Roformer-InstVoc-Duality-v2 by Unwa.

MB-Roformer-InstVoc-Duality-v2 by Unwa sounds a bit better than BS-Roformer-Viperx-1297, with clearer vocal separation, but it leaves some background noise.

MB-Roformer-Inst-v1 by Kim leaves a lot of noise, with instrumental sounds in the vocals and vice versa.

My personal ratings on a 10-point scale:

BS-Roformer-Viperx-1297                    - 8/10
Mel-Roformer-Viperx-1143                   - 7/10
MB-Roformer-Inst-v1 by Kim                 - 5/10
MB-Roformer-InstVoc-Duality-v1 by Unwa     - 9/10
MB-Roformer-InstVoc-Duality-v2 by Unwa     - 9/10

I'd give MB-Roformer-InstVoc-Duality-v2 by Unwa a 10 out of 10 if it didn't leave background noise. I think it's a great model (for me personally)!


I tested these models on just one audio file, so the results might differ on other audio. One test isn't enough to draw definitive conclusions; further testing is needed for a more accurate and comprehensive picture of their performance.
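
If more tracks get tested, the per-track scores could be averaged per model. A minimal sketch of that, where `average_scores` is a hypothetical helper and the only data is the single set of ratings listed above (so the averages simply equal those ratings for now):

```python
from collections import defaultdict

# Ratings from the one track tested so far (out of 10), copied from the list above.
track_1 = {
    "BS-Roformer-Viperx-1297": 8,
    "Mel-Roformer-Viperx-1143": 7,
    "MB-Roformer-Inst-v1 by Kim": 5,
    "MB-Roformer-InstVoc-Duality-v1 by Unwa": 9,
    "MB-Roformer-InstVoc-Duality-v2 by Unwa": 9,
}


def average_scores(per_track_ratings):
    """Hypothetical helper: average each model's score over all tracks.

    `per_track_ratings` is a list of {model_name: score} dicts, one per track.
    A model missing from a track is simply skipped for that track.
    """
    scores = defaultdict(list)
    for ratings in per_track_ratings:
        for model, score in ratings.items():
            scores[model].append(score)
    return {model: sum(values) / len(values) for model, values in scores.items()}


# Append more dicts to this list as more tracks are rated.
averages = average_scores([track_1])

# Highest average first; ties keep the order in which models were listed.
for model, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{model:<42} {avg:.1f}/10")
```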

beveradb commented 2 weeks ago

Nice, thanks for sharing! I'll check out the Duality v2 model myself on some of my tracks later. If it's good enough to replace the current default model, I might swap it out.