Closed ybwai closed 3 months ago
How did you install it? You're using a version from a year ago with more bugs than the latest version 😅
Separator version 0.14.3 instantiating
oh bugger, I added this to my requirements txt:
audio-separator[cpu]; sys_platform == 'darwin'
audio-separator[gpu]; sys_platform != 'darwin'
and did a pip install. Let me figure out why it's getting an old version... sorry :)
On a Mac, pip install "audio-separator[cpu]" should be all you need since you don't need to worry about CUDA!
audio-separator[cpu]>=0.17.5; sys_platform == 'darwin'
audio-separator[gpu]>=0.17.5; sys_platform != 'darwin'
That seems to have fixed it. This lets me use my Mac locally and CUDA on Modal.
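A quick way to confirm which version actually ended up in the virtualenv after changing the requirements file is to ask the package metadata directly. This is only a minimal sketch using the standard library; the helper name `installed_version` is made up for illustration and is not part of audio-separator.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; not part of audio-separator.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string, or None if the package is missing.
    print("audio-separator:", installed_version("audio-separator"))
```

If this prints something older than the lower bound in the requirements file, the environment was probably not reinstalled or upgraded after the change.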
0.17.5 definitely working a lot better. MDX23C now works and results are good thanks!
Sadly still having issues with the VR Architecture models:
PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 ./.venv/bin/python -m sandbox.separator
2024-07-20 18:23:13,812 - INFO - separator - Separator version 0.17.5 instantiating with output_dir: None, output_format: WAV
2024-07-20 18:23:13,812 - INFO - separator - Output directory not specified. Using current working directory.
2024-07-20 18:23:13,812 - INFO - separator - Operating System: Darwin Darwin Kernel Version 23.5.0: Wed May 1 20:09:52 PDT 2024; root:xnu-10063.121.3~5/RELEASE_X86_64
2024-07-20 18:23:13,822 - INFO - separator - System: Darwin Node: xxxs-MacBook-Pro-2.local Release: 23.5.0 Machine: x86_64 Proc: i386
2024-07-20 18:23:13,822 - INFO - separator - Python Version: 3.10.14
2024-07-20 18:23:13,822 - INFO - separator - PyTorch Version: 2.2.2
2024-07-20 18:23:13,939 - INFO - separator - FFmpeg installed: ffmpeg version 7.0 Copyright (c) 2000-2024 the FFmpeg developers
2024-07-20 18:23:13,941 - INFO - separator - ONNX Runtime CPU package installed with version: 1.18.1
2024-07-20 18:23:13,958 - INFO - separator - Apple Silicon MPS/CoreML is available in Torch, setting Torch device to MPS
2024-07-20 18:23:13,958 - INFO - separator - ONNXruntime has CoreMLExecutionProvider available, enabling acceleration
2024-07-20 18:23:13,958 - INFO - separator - Loading model 3_HP-Vocal-UVR.pth...
2024-07-20 18:23:14,824 - INFO - vr_separator - VR Separator initialisation complete
2024-07-20 18:23:14,825 - INFO - separator - Load model duration: 00:00:00
2024-07-20 18:23:14,825 - INFO - separator - Starting separation process for audio_file_path: /sandbox/downloads/dl.mp3
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:04<00:00, 1.07s/it]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 84/84 [00:00<00:00, 635959.45it/s]
0%| | 0/6 [00:22<?, ?it/s]
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.10/3.10.14/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/Cellar/python@3.10/3.10.14/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/sandbox/separator.py", line 141, in <module>
output_files = separator.separate(input_path)
File "/.venv/lib/python3.10/site-packages/audio_separator/separator/separator.py", line 704, in separate
output_files = self.model_instance.separate(audio_file_path)
File "/.venv/lib/python3.10/site-packages/audio_separator/separator/architectures/vr_separator.py", line 150, in separate
y_spec, v_spec = self.inference_vr(self.loading_mix(), self.torch_device, self.aggressiveness)
File "/.venv/lib/python3.10/site-packages/audio_separator/separator/architectures/vr_separator.py", line 324, in inference_vr
mask = _execute(X_mag_pad, roi_size)
File "/.venv/lib/python3.10/site-packages/audio_separator/separator/architectures/vr_separator.py", line 291, in _execute
pred = self.model_run.predict_mask(X_batch)
File "/.venv/lib/python3.10/site-packages/audio_separator/separator/uvr_lib_v5/vr_network/nets.py", line 169, in predict_mask
mask = self.forward(input_tensor)
File "/.venv/lib/python3.10/site-packages/audio_separator/separator/uvr_lib_v5/vr_network/nets.py", line 148, in forward
hidden_state = self.stg3_full_band_net(self.stg3_bridge(hidden_state))
File "/.venv/lib/python3.10/site-packages/audio_separator/separator/uvr_lib_v5/vr_network/nets.py", line 62, in __call__
hidden_state = self.dec1(hidden_state, encoder_output1)
File "/.venv/lib/python3.10/site-packages/audio_separator/separator/uvr_lib_v5/vr_network/layers.py", line 179, in __call__
input_tensor = F.interpolate(input_tensor, scale_factor=2, mode="bilinear", align_corners=True)
File "/.venv/lib/python3.10/site-packages/torch/nn/functional.py", line 4038, in interpolate
return torch._C._nn.upsample_bilinear2d(input, output_size, align_corners, scale_factors)
RuntimeError: Invalid buffer size: 3.00 GB
Also with the roformer models I get this:
PYTORCH_ENABLE_MPS_FALLBACK=1 PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 ./.venv/bin/python -m sandbox.separator
2024-07-20 18:29:58,351 - INFO - separator - Separator version 0.17.5 instantiating with output_dir: None, output_format: WAV
2024-07-20 18:29:58,351 - INFO - separator - Output directory not specified. Using current working directory.
2024-07-20 18:29:58,351 - INFO - separator - Operating System: Darwin Darwin Kernel Version 23.5.0: Wed May 1 20:09:52 PDT 2024; root:xnu-10063.121.3~5/RELEASE_X86_64
2024-07-20 18:29:58,362 - INFO - separator - System: Darwin Node: xxxs-MacBook-Pro-2.local Release: 23.5.0 Machine: x86_64 Proc: i386
2024-07-20 18:29:58,362 - INFO - separator - Python Version: 3.10.14
2024-07-20 18:29:58,362 - INFO - separator - PyTorch Version: 2.2.2
2024-07-20 18:29:58,715 - INFO - separator - FFmpeg installed: ffmpeg version 7.0 Copyright (c) 2000-2024 the FFmpeg developers
2024-07-20 18:29:58,717 - INFO - separator - ONNX Runtime CPU package installed with version: 1.18.1
2024-07-20 18:29:58,738 - INFO - separator - Apple Silicon MPS/CoreML is available in Torch, setting Torch device to MPS
2024-07-20 18:29:58,738 - INFO - separator - ONNXruntime has CoreMLExecutionProvider available, enabling acceleration
2024-07-20 18:29:58,738 - INFO - separator - Loading model model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt...
2024-07-20 18:30:04,131 - INFO - mdxc_separator - MDXC Separator initialisation complete
2024-07-20 18:30:04,133 - INFO - separator - Load model duration: 00:00:05
2024-07-20 18:30:04,134 - INFO - separator - Starting separation process for audio_file_path: sandbox/downloads/dl.mp3
0%| | 0/32 [00:26<?, ?it/s]
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.10/3.10.14/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/Cellar/python@3.10/3.10.14/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "sandbox/separator.py", line 141, in <module>
output_files = separator.separate(input_path)
File ".venv/lib/python3.10/site-packages/audio_separator/separator/separator.py", line 704, in separate
output_files = self.model_instance.separate(audio_file_path)
File ".venv/lib/python3.10/site-packages/audio_separator/separator/architectures/mdxc_separator.py", line 134, in separate
source = self.demix(mix=mix)
File ".venv/lib/python3.10/site-packages/audio_separator/separator/architectures/mdxc_separator.py", line 248, in demix
x = self.model_run(part.unsqueeze(0))[0]
File ".venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File ".venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File ".venv/lib/python3.10/site-packages/audio_separator/separator/uvr_lib_v5/mel_band_roformer.py", line 462, in forward
masks.cpu() if x_is_mps else masks).to(device)
RuntimeError: Unsupported type byte size: ComplexFloat
I'm guessing an Intel MacBook is no good for this?
Hmm, I don't have an intel mac to test on, but this is a bit confusing to me 🤔
Apple Silicon MPS/CoreML is available in Torch, setting Torch device to MPS
That doesn't make sense to me, as I thought it was only Apple Silicon macs which supported the Apple MPS GPU acceleration: https://developer.apple.com/documentation/metalperformanceshaders
Here's the code where that is checked for in audio-separator:
https://github.com/nomadkaraoke/python-audio-separator/blob/main/audio_separator/separator/separator.py#L225
It then calls this function, which is where the torch device is set to use MPS: https://github.com/nomadkaraoke/python-audio-separator/blob/main/audio_separator/separator/separator.py#L251
I'd appreciate if you could test changing that to stop it from configuring MPS (forcing it to use your CPU only for inferencing) - e.g. this change: https://github.com/nomadkaraoke/python-audio-separator/pull/91/files
If that works and lets all models run inference without error for you, then we need to improve the way we detect support for Apple MPS in audio-separator to ensure it doesn't try to use it on Intel Macs!
That seems to have done it yes.
Maybe the issue is PyTorch not classifying my MacBook properly as having a low power GPU - it has a Radeon PRO 5300M (4GB) which would explain the buffer size issue.
Gotcha, thank you for confirming!
I've just released audio-separator version 0.17.6 with a fix for this - basically I'm just detecting the processor type and only enabling MPS if it's ARM.
I think PyTorch doesn't really support MPS properly on Intel Mac GPUs, so this is probably the best option for now so things at least work out of the box for folks like you, even if that means ignoring your GPU unfortunately.
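The idea of "only enable MPS if the processor is ARM" can be sketched roughly as follows. This is not the actual code from the 0.17.6 release. The function names are hypothetical, and using `platform.machine()` (which reports `x86_64` in the logs above and `arm64` for a native Python on Apple Silicon) is an assumption about how the processor type might be detected.

```python
import platform


def should_use_mps(system: str, machine: str, mps_available: bool) -> bool:
    """Enable MPS only on ARM-based (Apple Silicon) Macs where Torch reports MPS.

    Hypothetical helper; the real check in audio-separator may differ.
    """
    is_apple_silicon = system == "Darwin" and machine == "arm64"
    return is_apple_silicon and mps_available


def torch_mps_available() -> bool:
    """Ask PyTorch whether MPS is usable; treat a missing torch install as 'no'."""
    try:
        import torch
    except ImportError:
        return False
    return torch.backends.mps.is_available()


def pick_device() -> str:
    """Return "mps" on Apple Silicon with working MPS, otherwise "cpu"."""
    if should_use_mps(platform.system(), platform.machine(), torch_mps_available()):
        return "mps"
    return "cpu"
```

On the Intel MacBook from this thread, `should_use_mps("Darwin", "x86_64", True)` is `False`, so the device falls back to CPU even though Torch reports MPS as available. One caveat of this approach: a Python running under Rosetta on Apple Silicon also reports `x86_64` and would be treated as Intel.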
Lovely will upgrade to it.
Also, some UVR models (5_HP, 6_HP & UVR-BVE) are just giving me empty sound files and Demucs models complain about a missing _tkinter module. Tried a brew install python-tk but no luck.
I think those VR models are a known issue I'm afraid, I'd love for the BVE one in particular to work in audio-separator but it doesn't and I haven't prioritized trying to figure out why yet:
https://github.com/nomadkaraoke/python-audio-separator/issues/45
Contributions very much welcome 🙏
Not sure about the demucs issue, does it actually fail? If so, please share debug logs as I've used the htdemucs_6s.yaml model a bunch without issues!
Demucs doesn't seem to start:
2024-07-20 23:42:45,855 - INFO - separator - Separator version 0.17.5 instantiating with output_dir: None, output_format: WAV
2024-07-20 23:42:45,855 - INFO - separator - Output directory not specified. Using current working directory.
2024-07-20 23:42:45,855 - INFO - separator - Operating System: Darwin Darwin Kernel Version 23.5.0: Wed May 1 20:09:52 PDT 2024; root:xnu-10063.121.3~5/RELEASE_X86_64
2024-07-20 23:42:45,864 - INFO - separator - System: Darwin Node: xxx-MacBook-Pro-2.local Release: 23.5.0 Machine: x86_64 Proc: i386
2024-07-20 23:42:45,865 - INFO - separator - Python Version: 3.10.14
2024-07-20 23:42:45,865 - INFO - separator - PyTorch Version: 2.2.2
2024-07-20 23:42:51,950 - INFO - separator - FFmpeg installed: ffmpeg version 7.0.1 Copyright (c) 2000-2024 the FFmpeg developers
2024-07-20 23:42:51,951 - INFO - separator - ONNX Runtime CPU package installed with version: 1.18.1
2024-07-20 23:42:51,951 - INFO - separator - No hardware acceleration could be configured, running in CPU mode
2024-07-20 23:42:51,952 - INFO - separator - Loading model htdemucs_6s.yaml...
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.10/3.10.14_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/Cellar/python@3.10/3.10.14_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "sandbox/separator.py", line 126, in <module>
separator.load_model(
File ".venv/lib/python3.10/site-packages/audio_separator/separator/separator.py", line 673, in load_model
module = importlib.import_module(f"audio_separator.separator.architectures.{module_name}")
File "/usr/local/Cellar/python@3.10/3.10.14_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File ".venv/lib/python3.10/site-packages/audio_separator/separator/architectures/demucs_separator.py", line 7, in <module>
from audio_separator.separator.uvr_lib_v5.demucs.apply import apply_model, demucs_segments
File ".venv/lib/python3.10/site-packages/audio_separator/separator/uvr_lib_v5/demucs/apply.py", line 19, in <module>
import tkinter as tk
File "/usr/local/Cellar/python@3.10/3.10.14_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/tkinter/__init__.py", line 37, in <module>
import _tkinter # If this fails your Python may not be configured for Tk
ModuleNotFoundError: No module named '_tkinter'
Huh, that's very strange, for two reasons:
python -m tkinter btw.
Tkinter is the standard Python interface to the Tk GUI toolkit, which should only be relevant for GUI applications... which audio-separator strictly is not :) It shouldn't be relevant for this project in any way. Apparently the demucs inferencing code I copied from UVR had completely unused Tkinter imports in it 🙃 So anyway, I released another new version, audio-separator version 0.18.3, which removes those references and should fix demucs for you 🙏
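For anyone who wants to check whether their own interpreter was built with Tk support before upgrading, here is a small sketch using only the standard library. The helper name `has_tkinter` is made up for illustration. It looks for the `_tkinter` extension module that the traceback above failed to import, without actually importing it.

```python
import importlib.util


def has_tkinter() -> bool:
    """True if this interpreter can locate the _tkinter C extension.

    Hypothetical helper for illustration; it only looks the module up,
    it does not import it or open a window.
    """
    return importlib.util.find_spec("_tkinter") is not None


if __name__ == "__main__":
    print("Tk support available:", has_tkinter())
```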
Demucs working on new version 0.18.3!
The MDX23C...ckpt type models seem to give me an error. And all the HP_...pth UVR models seem to give me a RuntimeError: Invalid buffer size: 3.94 GB
The .onnx models are working fine. I am running on an Intel i9 2.4 GHz MacBook Pro with 32 GB RAM.
I can use other cli tools to run the uvr HP_ models just fine, so not sure why this runner can't.
Any suggestions?
Here are some full logs if they help:
MDX23:
UVR HP: