nomadkaraoke / python-audio-separator

Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python package, using a variety of amazing pre-trained models (primarily from UVR)
MIT License

Supporting VR Architecture models like HP5 #37

Closed: KevinWang676 closed this issue 8 months ago

KevinWang676 commented 8 months ago

Hi, I wonder if you are going to support VR Architecture models like HP5 as well. Thanks!

zhzhongshi commented 8 months ago

#36

beveradb commented 8 months ago

Thanks @zhzhongshi - yep, I've literally been working on this all week and released audio-separator version 0.14 earlier today! 😅

Please give it a try and see if it works for you!

If you confirm it works, I'll close this issue 🙏

That said - I'm still working on documentation, tests and some packaging issues (conda build failed, sigh) but the package on PyPI should "just work".

FYI, there's a new CLI parameter, audio-separator --list_models, which prints all the models that are supported out of the box!

zhzhongshi commented 8 months ago

I tried the steps below, but it doesn't seem to work. Here is the log from a fresh virtual environment created with python -m venv .venv:

(.venv) C:\project\test\ds\uvr>pip install audio-separator[gpu] -U
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Collecting audio-separator[gpu]
  Using cached https://mirrors.aliyun.com/pypi/packages/15/16/50b7df8bab253509a3005c47baf415b2d22d5f05a2690355265f73e0ffc5/audio_separator-0.14.0-py3-none-any.whl (81 kB)
Collecting tqdm
  Using cached https://mirrors.aliyun.com/pypi/packages/00/e5/f12a80907d0884e6dff9c16d0c0114d81b8cd07dc3ae54c5e962cc83037e/tqdm-4.66.1-py3-none-any.whl (78 kB)
Collecting torch
  Downloading https://mirrors.aliyun.com/pypi/packages/c8/ed/f11e9eb1e21d7ea8fc82a9fd373f9ff2023a7ee9e47d07c9bc9efce46eca/torch-2.2.0-cp310-cp310-win_amd64.whl (198.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 198.6/198.6 MB 1.5 MB/s eta 0:00:00
Collecting onnx2torch>=1.5
  Using cached https://mirrors.aliyun.com/pypi/packages/80/aa/0e86c52f7be3f8938cfe39cfb0f68fce31bb12c90d65f1884e6371d8a5fb/onnx2torch-1.5.13-py3-none-any.whl (78 kB)
Collecting six>=1.16
  Using cached https://mirrors.aliyun.com/pypi/packages/d9/5a/e7c31adbe875f2abbb91bd84cf2dc52d792b5a01506781dbcf25c91daf11/six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting requests>=2
  Using cached https://mirrors.aliyun.com/pypi/packages/70/8e/0e2d847013cb52cd35b38c009bb167a1a26b2ce6cd6965bf26b47bc0bf44/requests-2.31.0-py3-none-any.whl (62 kB)
Collecting librosa>=0.9
  Using cached https://mirrors.aliyun.com/pypi/packages/e2/a2/4f639c1168d7aada749a896afb4892a831e2041bebdcf636aebfe9e86556/librosa-0.10.1-py3-none-any.whl (253 kB)
Collecting numpy>=1.23
  Downloading https://mirrors.aliyun.com/pypi/packages/be/b0/611101990ddac767e54e2d27d1f4576ae1662cca64e2d55ef0e62558ec26/numpy-1.26.3-cp310-cp310-win_amd64.whl (15.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.8/15.8 MB 1.5 MB/s eta 0:00:00
Collecting pydub>=0.25
  Using cached https://mirrors.aliyun.com/pypi/packages/a6/53/d78dc063216e62fc55f6b2eebb447f6a4b0a59f55c8406376f76bf959b08/pydub-0.25.1-py2.py3-none-any.whl (32 kB)
Collecting onnx>=1.14
  Using cached https://mirrors.aliyun.com/pypi/packages/1b/43/6e84bf35a9201fb344b1a98edf7095c4fa1cf1478dfb6638d7b77f5475e6/onnx-1.15.0-cp310-cp310-win_amd64.whl (14.3 MB)
Collecting onnxruntime-gpu
  Using cached https://mirrors.aliyun.com/pypi/packages/4d/10/707859693448a4acb9d9238034cee6c7d1eb79ca10b7930b3bcaf6ce231d/onnxruntime_gpu-1.17.0-cp310-cp310-win_amd64.whl (148.6 MB)
Collecting joblib>=0.14
  Downloading https://mirrors.aliyun.com/pypi/packages/10/40/d551139c85db202f1f384ba8bcf96aca2f329440a844f924c8a0040b6d02/joblib-1.3.2-py3-none-any.whl (302 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 302.2/302.2 kB 3.7 MB/s eta 0:00:00
Collecting soundfile>=0.12.1
  Using cached https://mirrors.aliyun.com/pypi/packages/50/ff/26a4ee48d0b66625a4e4028a055b9f25bc9d7c7b2d17d21a45137621a50d/soundfile-0.12.1-py2.py3-none-win_amd64.whl (1.0 MB)
Collecting soxr>=0.3.2
  Using cached https://mirrors.aliyun.com/pypi/packages/3c/e7/89951b917600d02f7389e760696c32b70a80a96301d0a018a70a317e4ecc/soxr-0.3.7-cp310-cp310-win_amd64.whl (184 kB)
Collecting scipy>=1.2.0
  Using cached https://mirrors.aliyun.com/pypi/packages/fd/a7/5f829b100d208c85163aecba93faf01d088d944fc91585338751d812f1e4/scipy-1.12.0-cp310-cp310-win_amd64.whl (46.2 MB)
Collecting scikit-learn>=0.20.0
  Using cached https://mirrors.aliyun.com/pypi/packages/bd/7e/52676c85bab788e0cb87b58e11ab53ba08e590c0db30642dd3222b702c73/scikit_learn-1.4.0-1-cp310-cp310-win_amd64.whl (10.6 MB)
Collecting msgpack>=1.0
  Using cached https://mirrors.aliyun.com/pypi/packages/4b/14/c62fbc8dff118f1558e43b9469d56a1f37bbb35febadc3163efaedd01500/msgpack-1.0.7-cp310-cp310-win_amd64.whl (222 kB)
Collecting numba>=0.51.0
  Using cached https://mirrors.aliyun.com/pypi/packages/a2/5c/8bf0c2cbd9d0d3b519213031a8b6fb71f7403b9e6ee4b4d16b74b9659bdf/numba-0.59.0-cp310-cp310-win_amd64.whl (2.7 MB)
Collecting pooch>=1.0
  Using cached https://mirrors.aliyun.com/pypi/packages/1a/a5/5174dac3957ac412e80a00f30b6507031fcab7000afc9ea0ac413bddcff2/pooch-1.8.0-py3-none-any.whl (62 kB)
Collecting decorator>=4.3.0
  Using cached https://mirrors.aliyun.com/pypi/packages/d5/50/83c593b07763e1161326b3b8c6686f0f4b0f24d5526546bee538c89837d6/decorator-5.1.1-py3-none-any.whl (9.1 kB)
Collecting typing-extensions>=4.1.1
  Downloading https://mirrors.aliyun.com/pypi/packages/b7/f4/6a90020cd2d93349b442bfcb657d0dc91eee65491600b2cb1d388bc98e6b/typing_extensions-4.9.0-py3-none-any.whl (32 kB)
Collecting lazy-loader>=0.1
  Using cached https://mirrors.aliyun.com/pypi/packages/a1/c3/65b3814e155836acacf720e5be3b5757130346670ac454fee29d3eda1381/lazy_loader-0.3-py3-none-any.whl (9.1 kB)
Collecting audioread>=2.1.9
  Using cached https://mirrors.aliyun.com/pypi/packages/57/8d/30aa32745af16af0a9a650115fbe81bde7c610ed5c21b381fca0196f3a7f/audioread-3.0.1-py3-none-any.whl (23 kB)
Collecting protobuf>=3.20.2
  Using cached https://mirrors.aliyun.com/pypi/packages/c1/00/c3ae19cabb36cfabc94ff0b102aac21b471c9f91a1357f8aafffb9efe8e0/protobuf-4.25.2-cp310-abi3-win_amd64.whl (413 kB)
Collecting torchvision>=0.9.0
  Downloading https://mirrors.aliyun.com/pypi/packages/7f/c9/10ca7837d786f2a96328ddf3a93767897d5e6eb04cf42b043778a771d04a/torchvision-0.17.0-cp310-cp310-win_amd64.whl (1.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 79.5 kB/s eta 0:00:00
Collecting certifi>=2017.4.17
  Using cached https://mirrors.aliyun.com/pypi/packages/ba/06/a07f096c664aeb9f01624f858c3add0a4e913d6c96257acb4fce61e7de14/certifi-2024.2.2-py3-none-any.whl (163 kB)
Collecting urllib3<3,>=1.21.1
  Using cached https://mirrors.aliyun.com/pypi/packages/88/75/311454fd3317aefe18415f04568edc20218453b709c63c58b9292c71be17/urllib3-2.2.0-py3-none-any.whl (120 kB)
Collecting charset-normalizer<4,>=2
  Downloading https://mirrors.aliyun.com/pypi/packages/a2/a0/4af29e22cb5942488cf45630cbdd7cefd908768e69bdd90280842e4e8529/charset_normalizer-3.3.2-cp310-cp310-win_amd64.whl (100 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.3/100.3 kB 62.7 kB/s eta 0:00:00
Collecting idna<4,>=2.5
  Using cached https://mirrors.aliyun.com/pypi/packages/c2/e7/a82b05cf63a603df6e68d59ae6a68bf5064484a0718ea5033660af4b54a9/idna-3.6-py3-none-any.whl (61 kB)
Collecting jinja2
  Using cached https://mirrors.aliyun.com/pypi/packages/30/6d/6de6be2d02603ab56e72997708809e8a5b0fbfee080735109b40a3564843/Jinja2-3.1.3-py3-none-any.whl (133 kB)
Collecting sympy
  Using cached https://mirrors.aliyun.com/pypi/packages/d2/05/e6600db80270777c4a64238a98d442f0fd07cc8915be2a1c16da7f2b9e74/sympy-1.12-py3-none-any.whl (5.7 MB)
Collecting filelock
  Using cached https://mirrors.aliyun.com/pypi/packages/81/54/84d42a0bee35edba99dee7b59a8d4970eccdd44b99fe728ed912106fc781/filelock-3.13.1-py3-none-any.whl (11 kB)
Collecting fsspec
  Using cached https://mirrors.aliyun.com/pypi/packages/ad/30/2281c062222dc39328843bd1ddd30ff3005ef8e30b2fd09c4d2792766061/fsspec-2024.2.0-py3-none-any.whl (170 kB)
Collecting networkx
  Using cached https://mirrors.aliyun.com/pypi/packages/d5/f0/8fbc882ca80cf077f1b246c0e3c3465f7f415439bdea6b899f6b19f61f70/networkx-3.2.1-py3-none-any.whl (1.6 MB)
Collecting coloredlogs
  Using cached https://mirrors.aliyun.com/pypi/packages/a7/06/3d6badcf13db419e25b07041d9c7b4a2c331d3f4e7134445ec5df57714cd/coloredlogs-15.0.1-py2.py3-none-any.whl (46 kB)
Collecting flatbuffers
  Using cached https://mirrors.aliyun.com/pypi/packages/6f/12/d5c79ee252793ffe845d58a913197bfa02ae9a0b5c9bc3dc4b58d477b9e7/flatbuffers-23.5.26-py2.py3-none-any.whl (26 kB)
Collecting packaging
  Downloading https://mirrors.aliyun.com/pypi/packages/ec/1a/610693ac4ee14fcdf2d9bf3c493370e4f2ef7ae2e19217d7a237ff42367d/packaging-23.2-py3-none-any.whl (53 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.0/53.0 kB 53.7 kB/s eta 0:00:00
Collecting colorama
  Using cached https://mirrors.aliyun.com/pypi/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Collecting llvmlite<0.43,>=0.42.0dev0
  Using cached https://mirrors.aliyun.com/pypi/packages/e0/a2/70e18cab31b707ff62c5dd4f5ed6ea88f553ba3a8e40ce99aefb8e056af1/llvmlite-0.42.0-cp310-cp310-win_amd64.whl (28.1 MB)
Collecting platformdirs>=2.5.0
  Using cached https://mirrors.aliyun.com/pypi/packages/55/72/4898c44ee9ea6f43396fbc23d9bfaf3d06e01b83698bdf2e4c919deceb7c/platformdirs-4.2.0-py3-none-any.whl (17 kB)
Collecting threadpoolctl>=2.0.0
  Downloading https://mirrors.aliyun.com/pypi/packages/81/12/fd4dea011af9d69e1cad05c75f3f7202cdcbeac9b712eea58ca779a72865/threadpoolctl-3.2.0-py3-none-any.whl (15 kB)
Collecting cffi>=1.0
  Using cached https://mirrors.aliyun.com/pypi/packages/be/3e/0b197d1bfbf386a90786b251dbf2634a15f2ea3d4e4070e99c7d1c7689cf/cffi-1.16.0-cp310-cp310-win_amd64.whl (181 kB)
Collecting pillow!=8.3.*,>=5.3.0
  Downloading https://mirrors.aliyun.com/pypi/packages/ef/d8/f97270d25a003435e408e6d1e38d8eddc9b3e2c7b646719f4b3a5293685d/pillow-10.2.0-cp310-cp310-win_amd64.whl (2.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.6/2.6 MB 154.2 kB/s eta 0:00:00
Collecting humanfriendly>=9.1
  Using cached https://mirrors.aliyun.com/pypi/packages/f0/0f/310fb31e39e2d734ccaa2c0fb981ee41f7bd5056ce9bc29b2248bd569169/humanfriendly-10.0-py2.py3-none-any.whl (86 kB)
Collecting MarkupSafe>=2.0
  Using cached https://mirrors.aliyun.com/pypi/packages/69/48/acbf292615c65f0604a0c6fc402ce6d8c991276e16c80c46a8f758fbd30c/MarkupSafe-2.1.5-cp310-cp310-win_amd64.whl (17 kB)
Collecting mpmath>=0.19
  Using cached https://mirrors.aliyun.com/pypi/packages/43/e3/7d92a15f894aa0c9c4b49b8ee9ac9850d6e63b03c9c32c0367a13ae62209/mpmath-1.3.0-py3-none-any.whl (536 kB)
Collecting pycparser
  Downloading https://mirrors.aliyun.com/pypi/packages/62/d5/5f610ebe421e85889f2e55e33b7f9a6795bd982198517d912eb1c76e1a53/pycparser-2.21-py2.py3-none-any.whl (118 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 118.7/118.7 kB 105.2 kB/s eta 0:00:00
Collecting pyreadline3
  Using cached https://mirrors.aliyun.com/pypi/packages/56/fc/a3c13ded7b3057680c8ae95a9b6cc83e63657c38e0005c400a5d018a33a7/pyreadline3-3.4.1-py3-none-any.whl (95 kB)
Installing collected packages: pyreadline3, pydub, mpmath, flatbuffers, urllib3, typing-extensions, threadpoolctl, sympy, six, pycparser, protobuf, platformdirs, pillow, packaging, numpy, networkx, msgpack, MarkupSafe, llvmlite, lazy-loader, joblib, idna, humanfriendly, fsspec, filelock, decorator, colorama, charset-normalizer, certifi, audioread, tqdm, soxr, scipy, requests, onnx, numba, jinja2, coloredlogs, cffi, torch, soundfile, scikit-learn, pooch, onnxruntime-gpu, torchvision, librosa, onnx2torch, audio-separator
Successfully installed MarkupSafe-2.1.5 audio-separator-0.14.0 audioread-3.0.1 certifi-2024.2.2 cffi-1.16.0 charset-normalizer-3.3.2 colorama-0.4.6 coloredlogs-15.0.1 decorator-5.1.1 filelock-3.13.1 flatbuffers-23.5.26 fsspec-2024.2.0 humanfriendly-10.0 idna-3.6 jinja2-3.1.3 joblib-1.3.2 lazy-loader-0.3 librosa-0.10.1 llvmlite-0.42.0 mpmath-1.3.0 msgpack-1.0.7 networkx-3.2.1 numba-0.59.0 numpy-1.26.3 onnx-1.15.0 onnx2torch-1.5.13 onnxruntime-gpu-1.17.0 packaging-23.2 pillow-10.2.0 platformdirs-4.2.0 pooch-1.8.0 protobuf-4.25.2 pycparser-2.21 pydub-0.25.1 pyreadline3-3.4.1 requests-2.31.0 scikit-learn-1.4.0 scipy-1.12.0 six-1.16.0 soundfile-0.12.1 soxr-0.3.7 sympy-1.12 threadpoolctl-3.2.0 torch-2.2.0 torchvision-0.17.0 tqdm-4.66.1 typing-extensions-4.9.0 urllib3-2.2.0

[notice] A new release of pip is available: 23.0.1 -> 23.3.2
[notice] To update, run: python.exe -m pip install --upgrade pip

It seems like there's a dependency missing.

(.venv) C:\project\test\ds\uvr>audio-separator --list_models
Traceback (most recent call last):
  File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\project\test\ds\uvr\.venv\Scripts\audio-separator.exe\__main__.py", line 7, in <module>
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\audio_separator\utils\cli.py", line 76, in main
    from audio_separator.separator import Separator
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\audio_separator\separator\__init__.py", line 1, in <module>
    from .separator import Separator
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\audio_separator\separator\separator.py", line 15, in <module>
    from audio_separator.separator.architectures import MDXSeparator, VRSeparator
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\audio_separator\separator\architectures\__init__.py", line 1, in <module>
    from .mdx_separator import MDXSeparator
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\audio_separator\separator\architectures\mdx_separator.py", line 10, in <module>
    from audio_separator.separator.uvr_lib_v5 import spec_utils
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\audio_separator\separator\uvr_lib_v5\spec_utils.py", line 31, in <module>
    from pyrubberband import pyrb
ModuleNotFoundError: No module named 'pyrubberband'
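
The traceback above ends in an import that the installed wheel did not pull in. As a rough sketch (not part of audio-separator), a missing module like this can be detected up front with the standard library's importlib.util.find_spec. The helper name missing_modules and the list of names checked are illustrative assumptions; only pyrubberband comes from the traceback.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "pyrubberband" is the module named in the traceback above; "json" and
    # "runpy" are standard-library modules, listed only to show the found case.
    for name in missing_modules(["pyrubberband", "json", "runpy"]):
        print(f"missing: {name}")
```

Note that a module name and the name of the package that provides it on PyPI can differ, so the printed name is only a hint for what to install.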

So I installed it:


(.venv) C:\project\test\ds\uvr>pip install pyrubberband
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Collecting pyrubberband
  Using cached https://mirrors.aliyun.com/pypi/packages/66/03/079d3adead19dc0af62a02c7850318055aa2dec9453c92886219b642b904/pyrubberband-0.3.0.tar.gz (4.1 kB)
  Preparing metadata (setup.py) ... done
Requirement already satisfied: six in c:\project\test\ds\uvr\.venv\lib\site-packages (from pyrubberband) (1.16.0)
Collecting pysoundfile>=0.8.0
  Using cached https://mirrors.aliyun.com/pypi/packages/9d/8e/30d9f80802e8ea2c5b96db2f320fdb147e25a84fd74caa251b57bedeeb33/PySoundFile-0.9.0.post1-py2.py3.cp26.cp27.cp32.cp33.cp34.cp35.cp36.pp27.pp32.pp33-none-win_amd64.whl (671 kB)
Requirement already satisfied: cffi>=0.6 in c:\project\test\ds\uvr\.venv\lib\site-packages (from pysoundfile>=0.8.0->pyrubberband) (1.16.0)
Requirement already satisfied: pycparser in c:\project\test\ds\uvr\.venv\lib\site-packages (from cffi>=0.6->pysoundfile>=0.8.0->pyrubberband) (2.21)
Installing collected packages: pysoundfile, pyrubberband
  DEPRECATION: pyrubberband is being installed using the legacy 'setup.py install' method, because it does not have a 'pyproject.toml' and the 'wheel' package is not installed. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at https://github.com/pypa/pip/issues/8559
  Running setup.py install for pyrubberband ... done
Successfully installed pyrubberband-0.3.0 pysoundfile-0.9.0.post1

[notice] A new release of pip is available: 23.0.1 -> 23.3.2
[notice] To update, run: python.exe -m pip install --upgrade pip

Then I tried again:


(.venv) C:\project\test\ds\uvr>audio-separator --list_models
2024-02-05 15:46:35,397 - INFO - separator - Separator version 0.14.0 instantiating with output_dir: None, output_format: WAV
2024-02-05 15:46:35,397 - DEBUG - separator - Normalization threshold set to 0.9, waveform will lowered to this max amplitude to avoid clipping.
2024-02-05 15:46:35,397 - DEBUG - separator - Denoising disabled, model will only be run once. This is twice as fast, but may result in noisier output audio.
2024-02-05 15:46:35,397 - INFO - separator - Operating System: Windows 10.0.22621
2024-02-05 15:46:35,397 - INFO - separator - System: Windows Node: DESKTOP-R5JDPUC Release: 10 Machine: AMD64 Proc: Intel64 Family 6 Model 154 Stepping 3, GenuineIntel
2024-02-05 15:46:35,397 - INFO - separator - Python Version: 3.10.11
2024-02-05 15:46:35,397 - DEBUG - separator - Python package: onnxruntime-silicon not installed
2024-02-05 15:46:35,397 - DEBUG - separator - Python package: onnxruntime not installed
2024-02-05 15:46:35,397 - INFO - separator - ONNX Runtime GPU package installed with version: 1.17.0
2024-02-05 15:46:35,397 - INFO - separator - No hardware acceleration could be configured, running in CPU mode
2024-02-05 15:46:35,397 - DEBUG - separator - Downloading file from https://raw.githubusercontent.com/TRvlvr/application_data/main/filelists/download_checks.json to /tmp/audio-separator-models/download_checks.json with timeout 300s
2024-02-05 15:46:36,734 - DEBUG - separator - Model download list loaded: {'current_version': 'UVR_Patch_10_6_23_4_27', 'current_version_ocl': 'UVR_Patch_10_6_23_4_27', 'current_version_mac': 'UVR_Patch_10_6_23_4_27', 'current_version_linux': 'UVR_Patch_10_6_23_4_27', 'vr_download_list': {'VR Arch Single Model v5: 1_HP-UVR': '1_HP-UVR.pth', 'VR Arch Single Model v5: 2_HP-UVR': '2_HP-UVR.pth', 'VR Arch Single Model v5: 3_HP-Vocal-UVR': '3_HP-Vocal-UVR.pth', 'VR Arch Single Model v5: 4_HP-Vocal-UVR': '4_HP-Vocal-UVR.pth', 'VR Arch Single Model v5: 5_HP-Karaoke-UVR': '5_HP-Karaoke-UVR.pth', 'VR Arch Single Model v5: 6_HP-Karaoke-UVR': '6_HP-Karaoke-UVR.pth', 'VR Arch Single Model v5: 7_HP2-UVR': '7_HP2-UVR.pth', 'VR Arch Single Model v5: 8_HP2-UVR': '8_HP2-UVR.pth', 'VR Arch Single Model v5: 9_HP2-UVR': '9_HP2-UVR.pth', 'VR Arch Single Model v5: 10_SP-UVR-2B-32000-1': '10_SP-UVR-2B-32000-1.pth', 'VR Arch Single Model v5: 11_SP-UVR-2B-32000-2': '11_SP-UVR-2B-32000-2.pth', 'VR Arch Single Model v5: 12_SP-UVR-3B-44100': '12_SP-UVR-3B-44100.pth', 'VR Arch Single Model v5: 13_SP-UVR-4B-44100-1': '13_SP-UVR-4B-44100-1.pth', 'VR Arch Single Model v5: 14_SP-UVR-4B-44100-2': '14_SP-UVR-4B-44100-2.pth', 'VR Arch Single Model v5: 15_SP-UVR-MID-44100-1': '15_SP-UVR-MID-44100-1.pth', 'VR Arch Single Model v5: 16_SP-UVR-MID-44100-2': '16_SP-UVR-MID-44100-2.pth', 'VR Arch Single Model v5: 17_HP-Wind_Inst-UVR': '17_HP-Wind_Inst-UVR.pth', 'VR Arch Single Model v5: UVR-De-Echo-Aggressive by FoxJoy': 'UVR-De-Echo-Aggressive.pth', 'VR Arch Single Model v5: UVR-De-Echo-Normal by FoxJoy': 'UVR-De-Echo-Normal.pth', 'VR Arch Single Model v5: UVR-DeEcho-DeReverb by FoxJoy': 'UVR-DeEcho-DeReverb.pth', 'VR Arch Single Model v5: UVR-DeNoise-Lite by FoxJoy': 'UVR-DeNoise-Lite.pth', 'VR Arch Single Model v5: UVR-DeNoise by FoxJoy': 'UVR-DeNoise.pth', 'VR Arch Single Model v5: UVR-BVE-4B_SN-44100-1': 'UVR-BVE-4B_SN-44100-1.pth', 'VR Arch Single Model v4: MGM_HIGHEND_v4': 'MGM_HIGHEND_v4.pth', 'VR 
Arch Single Model v4: MGM_LOWEND_A_v4': 'MGM_LOWEND_A_v4.pth', 'VR Arch Single Model v4: MGM_LOWEND_B_v4': 'MGM_LOWEND_B_v4.pth', 'VR Arch Single Model v4: MGM_MAIN_v4': 'MGM_MAIN_v4.pth'}, 'mdx_download_list': {'MDX-Net Model: UVR-MDX-NET Inst HQ 1': 'UVR-MDX-NET-Inst_HQ_1.onnx', 'MDX-Net Model: UVR-MDX-NET Inst HQ 2': 'UVR-MDX-NET-Inst_HQ_2.onnx', 'MDX-Net Model: UVR-MDX-NET Inst HQ 3': 'UVR-MDX-NET-Inst_HQ_3.onnx', 'MDX-Net Model: UVR-MDX-NET Main': 'UVR_MDXNET_Main.onnx', 'MDX-Net Model: UVR-MDX-NET Inst Main': 'UVR-MDX-NET-Inst_Main.onnx', 'MDX-Net Model: UVR-MDX-NET 1': 'UVR_MDXNET_1_9703.onnx', 'MDX-Net Model: UVR-MDX-NET 2': 'UVR_MDXNET_2_9682.onnx', 'MDX-Net Model: UVR-MDX-NET 3': 'UVR_MDXNET_3_9662.onnx', 'MDX-Net Model: UVR-MDX-NET Inst 1': 'UVR-MDX-NET-Inst_1.onnx', 'MDX-Net Model: UVR-MDX-NET Inst 2': 'UVR-MDX-NET-Inst_2.onnx', 'MDX-Net Model: UVR-MDX-NET Inst 3': 'UVR-MDX-NET-Inst_3.onnx', 'MDX-Net Model: UVR-MDX-NET Karaoke': 'UVR_MDXNET_KARA.onnx', 'MDX-Net Model: UVR-MDX-NET Karaoke 2': 'UVR_MDXNET_KARA_2.onnx', 'MDX-Net Model: UVR_MDXNET_9482': 'UVR_MDXNET_9482.onnx', 'MDX-Net Model: UVR-MDX-NET Voc FT': 'UVR-MDX-NET-Voc_FT.onnx', 'MDX-Net Model: Kim Vocal 1': 'Kim_Vocal_1.onnx', 'MDX-Net Model: Kim Vocal 2': 'Kim_Vocal_2.onnx', 'MDX-Net Model: Kim Inst': 'Kim_Inst.onnx', 'MDX-Net Model: Reverb HQ By FoxJoy': 'Reverb_HQ_By_FoxJoy.onnx', 'MDX-Net Model: kuielab_a_vocals': 'kuielab_a_vocals.onnx', 'MDX-Net Model: kuielab_a_other': 'kuielab_a_other.onnx', 'MDX-Net Model: kuielab_a_bass': 'kuielab_a_bass.onnx', 'MDX-Net Model: kuielab_a_drums': 'kuielab_a_drums.onnx', 'MDX-Net Model: kuielab_b_vocals': 'kuielab_b_vocals.onnx', 'MDX-Net Model: kuielab_b_other': 'kuielab_b_other.onnx', 'MDX-Net Model: kuielab_b_bass': 'kuielab_b_bass.onnx', 'MDX-Net Model: kuielab_b_drums': 'kuielab_b_drums.onnx'}, 'demucs_download_list': {'Demucs v4: htdemucs_ft': {'f7e0c4bc-ba3fe64a.th': 'https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/f7e0c4bc-ba3fe64a.th', 
'd12395a8-e57c48e6.th': 'https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/d12395a8-e57c48e6.th', '92cfc3b6-ef3bcb9c.th': 'https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/92cfc3b6-ef3bcb9c.th', '04573f0d-f3cf25b2.th': 'https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/04573f0d-f3cf25b2.th', 'htdemucs_ft.yaml': 'https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/htdemucs_ft.yaml'}, 'Demucs v4: htdemucs': {'955717e8-8726e21a.th': 'https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/955717e8-8726e21a.th', 'htdemucs.yaml': 'https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/htdemucs.yaml'}, 'Demucs v4: hdemucs_mmi': {'75fc33f5-1941ce65.th': 'https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/75fc33f5-1941ce65.th', 'hdemucs_mmi.yaml': 'https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/hdemucs_mmi.yaml'}, 'Demucs v4: htdemucs_6s': {'5c90dfd2-34c22ccb.th': 'https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/5c90dfd2-34c22ccb.th', 'htdemucs_6s.yaml': 'https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/htdemucs_6s.yaml'}, 'Demucs v3: mdx': {'0d19c1c6-0f06f20e.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/0d19c1c6-0f06f20e.th', '7ecf8ec1-70f50cc9.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/7ecf8ec1-70f50cc9.th', 'c511e2ab-fe698775.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/c511e2ab-fe698775.th', '7d865c68-3d5dd56b.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/7d865c68-3d5dd56b.th', 'mdx.yaml': 'https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx.yaml'}, 'Demucs v3: mdx_q': {'6b9c2ca1-3fd82607.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/6b9c2ca1-3fd82607.th', 'b72baf4e-8778635e.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/b72baf4e-8778635e.th', '42e558d4-196e0e1b.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/42e558d4-196e0e1b.th', 
'305bc58f-18378783.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/305bc58f-18378783.th', 'mdx_q.yaml': 'https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx_q.yaml'}, 'Demucs v3: mdx_extra': {'e51eebcc-c1b80bdd.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/e51eebcc-c1b80bdd.th', 'a1d90b5c-ae9d2452.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/a1d90b5c-ae9d2452.th', '5d2d6c55-db83574e.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/5d2d6c55-db83574e.th', 'cfa93e08-61801ae1.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/cfa93e08-61801ae1.th', 'mdx_extra.yaml': 'https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx_extra.yaml'}, 'Demucs v3: mdx_extra_q': {'83fc094f-4a16d450.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/83fc094f-4a16d450.th', '464b36d7-e5a9386e.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/464b36d7-e5a9386e.th', '14fc6a69-a89dd0ee.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/14fc6a69-a89dd0ee.th', '7fd6ef75-a905dd85.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/7fd6ef75-a905dd85.th', 'mdx_extra_q.yaml': 'https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx_extra_q.yaml'}, 'Demucs v3: UVR Model': {'ebf34a2db.th': 'https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/ebf34a2db.th', 'UVR_Demucs_Model_1.yaml': 'https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/UVR_Demucs_Model_1.yaml'}, 'Demucs v3: repro_mdx_a': {'9a6b4851-03af0aa6.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/9a6b4851-03af0aa6.th', '1ef250f1-592467ce.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/1ef250f1-592467ce.th', 'fa0cb7f9-100d8bf4.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/fa0cb7f9-100d8bf4.th', '902315c2-b39ce9c9.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/902315c2-b39ce9c9.th', 'repro_mdx_a.yaml': 
'https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/repro_mdx_a.yaml'}, 'Demucs v3: repro_mdx_a_time_only': {'9a6b4851-03af0aa6.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/9a6b4851-03af0aa6.th', '1ef250f1-592467ce.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/1ef250f1-592467ce.th', 'repro_mdx_a_time_only.yaml': 'https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/repro_mdx_a_time_only.yaml'}, 'Demucs v3: repro_mdx_a_hybrid_only': {'fa0cb7f9-100d8bf4.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/fa0cb7f9-100d8bf4.th', '902315c2-b39ce9c9.th': 'https://dl.fbaipublicfiles.com/demucs/mdx_final/902315c2-b39ce9c9.th', 'repro_mdx_a_hybrid_only.yaml': 'https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/repro_mdx_a_hybrid_only.yaml'}, 'Demucs v2: demucs': {'demucs-e07c671f.th': 'https://dl.fbaipublicfiles.com/demucs/v3.0/demucs-e07c671f.th'}, 'Demucs v2: demucs_extra': {'demucs_extra-3646af93.th': 'https://dl.fbaipublicfiles.com/demucs/v3.0/demucs_extra-3646af93.th'}, 'Demucs v2: demucs48_hq': {'demucs48_hq-28a1282c.th': 'https://dl.fbaipublicfiles.com/demucs/v3.0/demucs48_hq-28a1282c.th'}, 'Demucs v2: tasnet': {'tasnet-beb46fac.th': 'https://dl.fbaipublicfiles.com/demucs/v3.0/tasnet-beb46fac.th'}, 'Demucs v2: tasnet_extra': {'tasnet_extra-df3777b2.th': 'https://dl.fbaipublicfiles.com/demucs/v3.0/tasnet_extra-df3777b2.th'}, 'Demucs v2: demucs_unittest': {'demucs_unittest-09ebc15f.th': 'https://dl.fbaipublicfiles.com/demucs/v3.0/demucs_unittest-09ebc15f.th'}, 'Demucs v1: demucs': {'demucs.th': 'https://dl.fbaipublicfiles.com/demucs/v2.0/demucs.th'}, 'Demucs v1: demucs_extra': {'demucs_extra.th': 'https://dl.fbaipublicfiles.com/demucs/v2.0/demucs_extra.th'}, 'Demucs v1: light': {'light.th': 'https://dl.fbaipublicfiles.com/demucs/v2.0/light.th'}, 'Demucs v1: light_extra': {'light_extra.th': 'https://dl.fbaipublicfiles.com/demucs/v2.0/light_extra.th'}, 'Demucs v1: tasnet': 
{'tasnet.th': 'https://dl.fbaipublicfiles.com/demucs/v2.0/tasnet.th'}, 'Demucs v1: tasnet_extra': {'tasnet_extra.th': 'https://dl.fbaipublicfiles.com/demucs/v2.0/tasnet_extra.th'}}, 'mdx_download_vip_list': {'MDX-Net Model VIP: UVR-MDX-NET_Main_340': 'UVR-MDX-NET_Main_340.onnx', 'MDX-Net Model VIP: UVR-MDX-NET_Main_390': 'UVR-MDX-NET_Main_390.onnx', 'MDX-Net Model VIP: UVR-MDX-NET_Main_406': 'UVR-MDX-NET_Main_406.onnx', 'MDX-Net Model VIP: UVR-MDX-NET_Main_427': 'UVR-MDX-NET_Main_427.onnx', 'MDX-Net Model VIP: UVR-MDX-NET_Main_438': 'UVR-MDX-NET_Main_438.onnx', 'MDX-Net Model VIP: UVR-MDX-NET_Inst_82_beta': 'UVR-MDX-NET_Inst_82_beta.onnx', 'MDX-Net Model VIP: UVR-MDX-NET_Inst_90_beta': 'UVR-MDX-NET_Inst_90_beta.onnx', 'MDX-Net Model VIP: UVR-MDX-NET_Inst_187_beta': 'UVR-MDX-NET_Inst_187_beta.onnx', 'MDX-Net Model VIP: UVR-MDX-NET-Inst_full_292': 'UVR-MDX-NET-Inst_full_292.onnx'}, 'mdx23_download_list': {'MDX23C Model: MDX23C_D1581': {'MDX23C_D1581.ckpt': 'model_2_stem_061321.yaml'}}, 'mdx23c_download_list': {'MDX23C Model: MDX23C-InstVoc HQ': {'MDX23C-8KFFT-InstVoc_HQ.ckpt': 'model_2_stem_full_band_8k.yaml'}}, 'mdx23c_download_vip_list': {'MDX23C Model VIP: MDX23C_D1581': {'MDX23C_D1581.ckpt': 'model_2_stem_061321.yaml'}, 'MDX23C Model VIP: MDX23C-InstVoc HQ 2': {'MDX23C-8KFFT-InstVoc_HQ_2.ckpt': 'model_2_stem_full_band_8k.yaml'}}, 'vr_download_vip_list': [], 'demucs_download_vip_list': []}
{
    "MDX": {
        "MDX-Net Model: Kim Inst": "Kim_Inst.onnx",
        "MDX-Net Model: Kim Vocal 1": "Kim_Vocal_1.onnx",
        "MDX-Net Model: Kim Vocal 2": "Kim_Vocal_2.onnx",
        "MDX-Net Model: Reverb HQ By FoxJoy": "Reverb_HQ_By_FoxJoy.onnx",
        "MDX-Net Model: UVR-MDX-NET 1": "UVR_MDXNET_1_9703.onnx",
        "MDX-Net Model: UVR-MDX-NET 2": "UVR_MDXNET_2_9682.onnx",
        "MDX-Net Model: UVR-MDX-NET 3": "UVR_MDXNET_3_9662.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst 1": "UVR-MDX-NET-Inst_1.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst 2": "UVR-MDX-NET-Inst_2.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst 3": "UVR-MDX-NET-Inst_3.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst HQ 1": "UVR-MDX-NET-Inst_HQ_1.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst HQ 2": "UVR-MDX-NET-Inst_HQ_2.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst HQ 3": "UVR-MDX-NET-Inst_HQ_3.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst Main": "UVR-MDX-NET-Inst_Main.onnx",
        "MDX-Net Model: UVR-MDX-NET Karaoke": "UVR_MDXNET_KARA.onnx",
        "MDX-Net Model: UVR-MDX-NET Karaoke 2": "UVR_MDXNET_KARA_2.onnx",
        "MDX-Net Model: UVR-MDX-NET Main": "UVR_MDXNET_Main.onnx",
        "MDX-Net Model: UVR-MDX-NET Voc FT": "UVR-MDX-NET-Voc_FT.onnx",
        "MDX-Net Model: UVR_MDXNET_9482": "UVR_MDXNET_9482.onnx",
        "MDX-Net Model: kuielab_a_bass": "kuielab_a_bass.onnx",
        "MDX-Net Model: kuielab_a_drums": "kuielab_a_drums.onnx",
        "MDX-Net Model: kuielab_a_other": "kuielab_a_other.onnx",
        "MDX-Net Model: kuielab_a_vocals": "kuielab_a_vocals.onnx",
        "MDX-Net Model: kuielab_b_bass": "kuielab_b_bass.onnx",
        "MDX-Net Model: kuielab_b_drums": "kuielab_b_drums.onnx",
        "MDX-Net Model: kuielab_b_other": "kuielab_b_other.onnx",
        "MDX-Net Model: kuielab_b_vocals": "kuielab_b_vocals.onnx"
    },
    "VR": {
        "VR Arch Single Model v4: MGM_HIGHEND_v4": "MGM_HIGHEND_v4.pth",
        "VR Arch Single Model v4: MGM_LOWEND_A_v4": "MGM_LOWEND_A_v4.pth",
        "VR Arch Single Model v4: MGM_LOWEND_B_v4": "MGM_LOWEND_B_v4.pth",
        "VR Arch Single Model v4: MGM_MAIN_v4": "MGM_MAIN_v4.pth",
        "VR Arch Single Model v5: 10_SP-UVR-2B-32000-1": "10_SP-UVR-2B-32000-1.pth",
        "VR Arch Single Model v5: 11_SP-UVR-2B-32000-2": "11_SP-UVR-2B-32000-2.pth",
        "VR Arch Single Model v5: 12_SP-UVR-3B-44100": "12_SP-UVR-3B-44100.pth",
        "VR Arch Single Model v5: 13_SP-UVR-4B-44100-1": "13_SP-UVR-4B-44100-1.pth",
        "VR Arch Single Model v5: 14_SP-UVR-4B-44100-2": "14_SP-UVR-4B-44100-2.pth",
        "VR Arch Single Model v5: 15_SP-UVR-MID-44100-1": "15_SP-UVR-MID-44100-1.pth",
        "VR Arch Single Model v5: 16_SP-UVR-MID-44100-2": "16_SP-UVR-MID-44100-2.pth",
        "VR Arch Single Model v5: 17_HP-Wind_Inst-UVR": "17_HP-Wind_Inst-UVR.pth",
        "VR Arch Single Model v5: 1_HP-UVR": "1_HP-UVR.pth",
        "VR Arch Single Model v5: 2_HP-UVR": "2_HP-UVR.pth",
        "VR Arch Single Model v5: 3_HP-Vocal-UVR": "3_HP-Vocal-UVR.pth",
        "VR Arch Single Model v5: 4_HP-Vocal-UVR": "4_HP-Vocal-UVR.pth",
        "VR Arch Single Model v5: 5_HP-Karaoke-UVR": "5_HP-Karaoke-UVR.pth",
        "VR Arch Single Model v5: 6_HP-Karaoke-UVR": "6_HP-Karaoke-UVR.pth",
        "VR Arch Single Model v5: 7_HP2-UVR": "7_HP2-UVR.pth",
        "VR Arch Single Model v5: 8_HP2-UVR": "8_HP2-UVR.pth",
        "VR Arch Single Model v5: 9_HP2-UVR": "9_HP2-UVR.pth",
        "VR Arch Single Model v5: UVR-BVE-4B_SN-44100-1": "UVR-BVE-4B_SN-44100-1.pth",
        "VR Arch Single Model v5: UVR-De-Echo-Aggressive by FoxJoy": "UVR-De-Echo-Aggressive.pth",
        "VR Arch Single Model v5: UVR-De-Echo-Normal by FoxJoy": "UVR-De-Echo-Normal.pth",
        "VR Arch Single Model v5: UVR-DeEcho-DeReverb by FoxJoy": "UVR-DeEcho-DeReverb.pth",
        "VR Arch Single Model v5: UVR-DeNoise by FoxJoy": "UVR-DeNoise.pth",
        "VR Arch Single Model v5: UVR-DeNoise-Lite by FoxJoy": "UVR-DeNoise-Lite.pth"
    }
}

The model list works, so I started testing.


(.venv) C:\project\test\ds\uvr>audio-separator --help              
usage: audio-separator [-h] [-v] [--log_level LOG_LEVEL] [--list_models] [--model_filename MODEL_FILENAME] [--model_file_dir MODEL_FILE_DIR] [--output_dir OUTPUT_DIR] [--output_format OUTPUT_FORMAT]
                       [--denoise DENOISE] [--normalization_threshold NORMALIZATION_THRESHOLD] [--single_stem SINGLE_STEM] [--invert_spect INVERT_SPECT] [--sample_rate SAMPLE_RATE]
                       [--mdx_hop_length MDX_HOP_LENGTH] [--mdx_segment_size MDX_SEGMENT_SIZE] [--mdx_overlap MDX_OVERLAP] [--mdx_batch_size MDX_BATCH_SIZE] [--vr_batch_size VR_BATCH_SIZE]
                       [--vr_window_size VR_WINDOW_SIZE] [--vr_aggression VR_AGGRESSION] [--vr_enable_tta VR_ENABLE_TTA] [--vr_enable_post_process VR_ENABLE_POST_PROCESS]
                       [--vr_post_process_threshold VR_POST_PROCESS_THRESHOLD] [--vr_high_end_process VR_HIGH_END_PROCESS]
                       [audio_file]

Separate audio file into different stems.

positional arguments:
  audio_file                                 The audio file path to separate, in any common format.

options:
  -h, --help                                 show this help message and exit
  -v, --version                              show program's version number and exit
  --log_level LOG_LEVEL                      Optional: logging level, e.g. info, debug, warning (default: info). Example: --log_level=debug
  --list_models                              List all supported models and exit.
  --model_filename MODEL_FILENAME            Optional: model filename to be used for separation (default: 2_HP-UVR.pth). Example: --model_filename=UVR_MDXNET_KARA_2.onnx
  --model_file_dir MODEL_FILE_DIR            Optional: model files directory (default: /tmp/audio-separator-models/). Example: --model_file_dir=/app/models
  --output_dir OUTPUT_DIR                    Optional: directory to write output files (default: <current dir>). Example: --output_dir=/app/separated
  --output_format OUTPUT_FORMAT              Optional: output format for separated files, any common format (default: FLAC). Example: --output_format=MP3
  --denoise DENOISE                          Optional: enable or disable denoising during separation (default: False). Example: --denoise=True
  --normalization_threshold NORMALIZATION_THRESHOLD
                                             Optional: max peak amplitude to normalize input and output audio to (default: 0.9). Example: --normalization_threshold=0.7
  --single_stem SINGLE_STEM                  Optional: output only single stem, either instrumental or vocals. Example: --single_stem=instrumental
  --invert_spect INVERT_SPECT                Optional: invert secondary stem using spectogram (default: False). Example: --invert_spect=True
  --sample_rate SAMPLE_RATE                  Optional: sample_rate (default: 44100). Example: --sample_rate=44100
  --mdx_hop_length MDX_HOP_LENGTH            Optional: mdx_hop_length (default: 1024). Example: --mdx_hop_length=1024
  --mdx_segment_size MDX_SEGMENT_SIZE        Optional: mdx_segment_size (default: 256). Example: --mdx_segment_size=256
  --mdx_overlap MDX_OVERLAP                  Optional: mdx_overlap (default: 0.25). Example: --mdx_overlap=0.25
  --mdx_batch_size MDX_BATCH_SIZE            Optional: mdx_batch_size (default: 1). Example: --mdx_batch_size=4
  --vr_batch_size VR_BATCH_SIZE              Optional: vr_batch_size (default: 4). Example: --vr_batch_size=16
  --vr_window_size VR_WINDOW_SIZE            Optional: vr_window_size (default: 512). Example: --vr_window_size=256
  --vr_aggression VR_AGGRESSION              Optional: vr_aggression (default: 5). Example: --vr_aggression=2
  --vr_enable_tta VR_ENABLE_TTA              Optional: vr_enable_tta (default: False). Example: --vr_enable_tta=True
  --vr_enable_post_process VR_ENABLE_POST_PROCESS
                                             Optional: vr_enable_post_process (default: False). Example: --vr_enable_post_process=True
  --vr_post_process_threshold VR_POST_PROCESS_THRESHOLD
                                             Optional: vr_post_process_threshold (default: 0.2). Example: --vr_post_process_threshold=0.1
  --vr_high_end_process VR_HIGH_END_PROCESS  Optional: vr_high_end_process (default: False). Example: --vr_high_end_process=True

(.venv) C:\project\test\ds\uvr>audio-separator --log_level debug --model_filename UVR_MDXNET_Main.onnx --model_file_dir models/ 0822.wav
2024-02-05 15:48:25.514 - INFO - cli - Separator version 0.14.0 beginning with input file: 0822.wav
2024-02-05 15:48:29.682 - INFO - separator - Separator version 0.14.0 instantiating with output_dir: None, output_format: FLAC
2024-02-05 15:48:29.682 - DEBUG - separator - Normalization threshold set to 0.9, waveform will lowered to this max amplitude to avoid clipping.
2024-02-05 15:48:29.682 - DEBUG - separator - Denoising disabled, model will only be run once. This is twice as fast, but may result in noisier output audio.
2024-02-05 15:48:29.683 - INFO - separator - Operating System: Windows 10.0.22621
2024-02-05 15:48:29.683 - INFO - separator - System: Windows Node: DESKTOP-R5JDPUC Release: 10 Machine: AMD64 Proc: Intel64 Family 6 Model 154 Stepping 3, GenuineIntel
2024-02-05 15:48:29.683 - INFO - separator - Python Version: 3.10.11
2024-02-05 15:48:29.684 - DEBUG - separator - Python package: onnxruntime-silicon not installed
2024-02-05 15:48:29.684 - DEBUG - separator - Python package: onnxruntime not installed
2024-02-05 15:48:29.684 - INFO - separator - ONNX Runtime GPU package installed with version: 1.17.0
2024-02-05 15:48:29.684 - INFO - separator - No hardware acceleration could be configured, running in CPU mode
2024-02-05 15:48:29.684 - INFO - separator - Loading model UVR_MDXNET_Main.onnx...
2024-02-05 15:48:29.684 - DEBUG - separator - Model path set to models/UVR_MDXNET_Main.onnx
2024-02-05 15:48:29.684 - DEBUG - separator - Model not found at path models/UVR_MDXNET_Main.onnx, downloading...
2024-02-05 15:48:29.688 - DEBUG - separator - Downloading file from https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/UVR_MDXNET_Main.onnx to models/UVR_MDXNET_Main.onnx with timeout 300s
2024-02-05 15:48:55.241 - DEBUG - separator - Calculating MD5 hash for model file to identify model parameters from UVR data...
2024-02-05 15:48:55.241 - ERROR - separator - Attempting to calculate hash of model file models/UVR_MDXNET_Main.onnx
2024-02-05 15:48:55.297 - DEBUG - separator - Model models/UVR_MDXNET_Main.onnx has hash 53c4baf4d12c3e6c3831bb8f5b532b93
2024-02-05 15:48:55.298 - DEBUG - separator - VR model data path set to models/vr_model_data.json
2024-02-05 15:48:55.299 - DEBUG - separator - VR model data not found at path models/vr_model_data.json, downloading...
2024-02-05 15:48:55.299 - DEBUG - separator - Downloading file from https://raw.githubusercontent.com/TRvlvr/application_data/main/vr_model_data/model_data_new.json to models/vr_model_data.json with timeout 300s
2024-02-05 15:48:56.370 - DEBUG - separator - MDX model data path set to models/mdx_model_data.json
2024-02-05 15:48:56.373 - DEBUG - separator - MDX model data not found at path models/mdx_model_data.json, downloading...
2024-02-05 15:48:56.373 - DEBUG - separator - Downloading file from https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/model_data_new.json to models/mdx_model_data.json with timeout 300s
2024-02-05 15:48:57.362 - DEBUG - separator - Loading MDX and VR model parameters from UVR model data files...
2024-02-05 15:48:57.396 - DEBUG - separator - Model data loaded: {'compensate': 1.043, 'mdx_dim_f_set': 3072, 'mdx_dim_t_set': 8, 'mdx_n_fft_scale_set': 7680, 'primary_stem': 'Vocals'}
2024-02-05 15:48:57.396 - DEBUG - common_separator - Common params: model_name=UVR_MDXNET_Main, model_path=models/UVR_MDXNET_Main.onnx
2024-02-05 15:48:57.396 - DEBUG - common_separator - Common params: primary_stem_output_path=None, secondary_stem_output_path=None
2024-02-05 15:48:57.396 - DEBUG - common_separator - Common params: output_dir=None, output_format=FLAC
2024-02-05 15:48:57.396 - DEBUG - common_separator - Common params: normalization_threshold=0.9
2024-02-05 15:48:57.396 - DEBUG - common_separator - Common params: enable_denoise=False, output_single_stem=None
2024-02-05 15:48:57.396 - DEBUG - common_separator - Common params: invert_using_spec=False, sample_rate=44100
2024-02-05 15:48:57.396 - DEBUG - common_separator - Common params: primary_stem_name=Vocals, secondary_stem_name=Instrumental
2024-02-05 15:48:57.396 - DEBUG - common_separator - Common params: is_karaoke=False, is_bv_model=False, bv_model_rebalance=0
2024-02-05 15:48:57.396 - DEBUG - mdx_separator - Model params: primary_stem=Vocals, secondary_stem=Instrumental
2024-02-05 15:48:57.396 - DEBUG - mdx_separator - Model params: batch_size=1, compensate=1.043, segment_size=256, dim_f=3072, dim_t=256
2024-02-05 15:48:57.396 - DEBUG - mdx_separator - Model params: n_fft=7680, hop=1024
2024-02-05 15:48:57.396 - DEBUG - mdx_separator - Loading ONNX model for inference...
2024-02-05 15:48:57.656 - DEBUG - mdx_separator - Model loaded successfully using ONNXruntime inferencing session.
2024-02-05 15:48:57.656 - DEBUG - separator - Loading model completed.
2024-02-05 15:48:57.656 - INFO - separator - Load model duration: 00:00:27
2024-02-05 15:48:57.656 - INFO - separator - Starting separation process for audio_file_path: 0822.wav
2024-02-05 15:48:57.656 - DEBUG - mdx_separator - Preparing mix...
2024-02-05 15:48:57.656 - DEBUG - mdx_separator - Loading audio from file: 0822.wav
Traceback (most recent call last):
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\librosa\core\audio.py", line 175, in load
    y, sr_native = __soundfile_load(path, offset, duration, dtype)
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\librosa\core\audio.py", line 208, in __soundfile_load
    context = sf.SoundFile(path)
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\soundfile.py", line 740, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\soundfile.py", line 1264, in _open
    _error_check(_snd.sf_error(file_ptr),
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\soundfile.py", line 1455, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening '0822.wav': Error in WAV file. No 'data' chunk marker.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\project\test\ds\uvr\.venv\Scripts\audio-separator.exe\__main__.py", line 7, in <module>
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\audio_separator\utils\cli.py", line 116, in main
    output_files = separator.separate(args.audio_file)
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\audio_separator\separator\separator.py", line 540, in separate
    output_files = self.model_instance.separate(audio_file_path)
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\audio_separator\separator\architectures\mdx_separator.py", line 90, in separate
    mix = self.prepare_mix(self.audio_file_path)
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\audio_separator\separator\architectures\mdx_separator.py", line 401, in prepare_mix
    mix, sr = librosa.load(mix, mono=False, sr=self.sample_rate)
  File "C:\project\test\ds\uvr\.venv\lib\site-packages\librosa\core\audio.py", line 177, in load
    except sf.SoundFileRuntimeError as exc:
AttributeError: module 'soundfile' has no attribute 'SoundFileRuntimeError'

(.venv) C:\project\test\ds\uvr>
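The first traceback above reports "No 'data' chunk marker" for 0822.wav before the secondary AttributeError is raised. One way to look into that kind of message is to list the top-level RIFF chunks of the file. The following is a minimal stdlib-only sketch, not part of audio-separator; the function names list_riff_chunks and has_data_chunk are hypothetical, and it assumes a plain RIFF/WAVE container with little-endian chunk sizes.

```python
import io
import struct
from typing import BinaryIO, List, Tuple


def list_riff_chunks(stream: BinaryIO) -> List[Tuple[str, int]]:
    """Return (chunk_id, size) pairs for the top-level chunks of a RIFF/WAVE stream."""
    header = stream.read(12)
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")

    chunks = []
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            break  # end of stream, or a truncated chunk header
        chunk_id, size = struct.unpack("<4sI", chunk_header)
        chunks.append((chunk_id.decode("ascii", "replace"), size))
        # Chunk bodies are padded to an even number of bytes.
        stream.seek(size + (size & 1), io.SEEK_CUR)
    return chunks


def has_data_chunk(stream: BinaryIO) -> bool:
    """Report whether a top-level 'data' chunk is present."""
    return any(chunk_id == "data" for chunk_id, _ in list_riff_chunks(stream))
```

Opening the input with open("0822.wav", "rb") and passing the handle to list_riff_chunks would show which chunks are present. A ValueError here would suggest the file is not a RIFF/WAVE container at all, whatever its extension says.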

zhzhongshi commented 8 months ago

C:\project\test\ds\uvr>python uvr.py
2024-02-05 21:04:39,354 - INFO - separator - Separator version 0.14.0 instantiating with output_dir: output/, output_format: WAV
2024-02-05 21:04:39,355 - INFO - separator - Operating System: Windows 10.0.22621
2024-02-05 21:04:39,355 - INFO - separator - System: Windows Node: DESKTOP-R5JDPUC Release: 10 Machine: AMD64 Proc: Intel64 Family 6 Model 154 Stepping 3, GenuineIntel
2024-02-05 21:04:39,355 - INFO - separator - Python Version: 3.10.11
2024-02-05 21:04:39,387 - INFO - separator - ONNX Runtime GPU package installed with version: 1.16.3
2024-02-05 21:04:39,448 - INFO - separator - CUDA is available in Torch, setting Torch device to CUDA
2024-02-05 21:04:39,448 - INFO - separator - ONNXruntime has CUDAExecutionProvider available, enabling acceleration
2024-02-05 21:04:39,448 - INFO - separator - Loading model 5_HP-Karaoke-UVR.pth...
2024-02-05 21:04:39,448 - ERROR - separator - Attempting to calculate hash of model file models/5_HP-Karaoke-UVR.pth
2024-02-05 21:04:39,503 - INFO - vr_separator - VR Separator initialisation complete
2024-02-05 21:04:39,503 - INFO - separator - Load model duration: 00:00:00
2024-02-05 21:04:39,503 - INFO - separator - Starting separation process for audio_file_path: 0822.wav
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:04<00:00,  1.01s/it]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 116/116 [00:00<?, ?it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 116/116 [00:25<00:00,  4.53it/s]
2024-02-05 21:05:19,646 - INFO - vr_separator - Saving Instrumental stem...
Traceback (most recent call last):
  File "C:\project\test\ds\uvr\uvr.py", line 20, in <module>
    primary_stem_path, secondary_stem_path = separator.separate('0822.wav')
  File "C:\Python310\lib\site-packages\audio_separator\separator\separator.py", line 540, in separate
    output_files = self.model_instance.separate(audio_file_path)
  File "C:\Python310\lib\site-packages\audio_separator\separator\architectures\vr_separator.py", line 175, in separate
    self.primary_source = self.spec_to_wav(y_spec).T
  File "C:\Python310\lib\site-packages\audio_separator\separator\architectures\vr_separator.py", line 327, in spec_to_wav
    wav = spec_utils.cmb_spectrogram_to_wave(spec, self.model_params, is_v51_model=self.is_vr_51_model)
  File "C:\Python310\lib\site-packages\audio_separator\separator\uvr_lib_v5\spec_utils.py", line 383, in cmb_spectrogram_to_wave
    wave = librosa.resample(wave2, orig_sr=bp["sr"], target_sr=sr, res_type=wav_resolution)
  File "C:\Python310\lib\site-packages\librosa\core\audio.py", line 670, in resample
    samplerate.resample, axis=axis, arr=y, ratio=ratio, converter_type=res_type
  File "C:\Python310\lib\site-packages\lazy_loader\__init__.py", line 111, in __getattr__
    raise ModuleNotFoundError(
ModuleNotFoundError: No module named 'samplerate'

This error is lazily reported, having originally occured in
  File C:\Python310\lib\site-packages\librosa\core\audio.py, line 31, in <module>

----> samplerate = lazy.load("samplerate")

C:\project\test\ds\uvr>pip install samplerate
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Collecting samplerate
  Downloading https://mirrors.aliyun.com/pypi/packages/72/d7/5d54efb240a691b233a20f5960661e903e5993fddc8a553f804757c0c6ed/samplerate-0.2.1-cp310-cp310-win_amd64.whl (1.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 492.1 kB/s eta 0:00:00
Requirement already satisfied: numpy in c:\python310\lib\site-packages (from samplerate) (1.26.0)
Installing collected packages: samplerate
Successfully installed samplerate-0.2.1

zhzhongshi commented 8 months ago

It works after installing these:

pyrubberband
librosa==0.10.0
samplerate==0.2.1

Thank you.
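Note that the two tracebacks above come from different environments: the first from the .venv site-packages, the second from C:\Python310\lib\site-packages. Before pinning packages, it can help to check which modules the interpreter you are actually running can find. Here is a small sketch using only the standard library; the helper name missing_modules is hypothetical, and it only checks top-level import names, not versions.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For the errors in this thread, calling missing_modules(["librosa", "soundfile", "samplerate"]) with the same interpreter that runs audio-separator would show whether samplerate is visible to it. An empty list means all three can be located.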

beveradb commented 8 months ago

Thanks for the report - I hadn't tested on Windows, and unfortunately I still haven't managed to get cross-platform end-to-end tests working in CI (that's on my to-do list though!)

I've now published version 0.14.4, which removes the pyrubberband dependency and adds samplerate as a dependency (which is somehow required by librosa on Windows but not on macOS/Linux...)

All currently supported model types now work out of the box on Windows after pip install audio-separator[cpu]:

[screenshot]