v4 demucs model not running in mps (macos)

sukualam commented 1 year ago

When I check "GPU Conversion", it still runs on the CPU.

However, I can run Demucs with MPS acceleration from this repo: https://github.com/facebookresearch/demucs

python3 -m demucs -d mps PATH_TO_AUDIO_FILE_1

and it works well.

So, what needs to change in UVR to make it work like the original repo?

My machine: macOS 13.4, Intel, RX 560
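
For readers comparing the two tools: whether a PyTorch program uses the GPU usually comes down to a few availability checks at startup. The sketch below is not UVR's or Demucs's actual code. `pick_device` is a hypothetical helper, and it assumes a PyTorch build recent enough to expose `torch.backends.mps`.

```python
try:
    import torch
except ImportError:  # lets the sketch load on machines without PyTorch
    torch = None


def pick_device(prefer_gpu: bool = True) -> str:
    """Return "cuda", "mps", or "cpu" depending on what this machine offers."""
    if not prefer_gpu or torch is None:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps does not exist on older PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

If this prints `cpu` on a Mac, the model will run on the CPU no matter which GPU checkbox is ticked.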

Anjok07 commented 12 months ago

Be on the lookout for a new patch later this evening. I've extended MPS compatibility to the Demucs v4 and MDX-Net models!

Anjok07 commented 12 months ago

This has been resolved. Please update to UVR_Patch_10_6_23_4_27.

sukualam commented 12 months ago

ok thank you

sukualam commented 12 months ago

Yeah, I can finally use MDX-Net with MPS acceleration on my AMD RX 560 (macOS 13.4). It is far better than using the CPU only; it's very fast for me now.

But v4 Demucs seems to fail. A popup shows "An Error Occurred: NotImplementedError", and this is the log:

Last Error Received:

Process: Demucs

If this error persists, please contact the developers with the error details.

Raw Error Details:

NotImplementedError: "The operator 'aten::_fft_r2c' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS."
Traceback Error: "
  File "UVR.py", line 6565, in process_start
  File "separate.py", line 828, in seperate
  File "separate.py", line 973, in demix_demucs
  File "demucs/apply.py", line 185, in apply_model
  File "demucs/apply.py", line 211, in apply_model
  File "demucs/apply.py", line 245, in apply_model
  File "demucs/utils.py", line 490, in result
  File "demucs/apply.py", line 260, in apply_model
  File "torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "demucs/htdemucs.py", line 538, in forward
  File "demucs/htdemucs.py", line 437, in _spec
  File "demucs/spec.py", line 14, in spectro
  File "torch/functional.py", line 650, in stft
    return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
"

Error Time Stamp [2023-10-07 10:31:11]

Full Application Settings:

vr_model: Choose Model
aggression_setting: 5
window_size: 512
mdx_segment_size: 256
batch_size: Default
crop_size: 256
is_tta: False
is_output_image: False
is_post_process: False
is_high_end_process: False
post_process_threshold: 0.2
vr_voc_inst_secondary_model: No Model Selected
vr_other_secondary_model: No Model Selected
vr_bass_secondary_model: No Model Selected
vr_drums_secondary_model: No Model Selected
vr_is_secondary_model_activate: False
vr_voc_inst_secondary_model_scale: 0.9
vr_other_secondary_model_scale: 0.7
vr_bass_secondary_model_scale: 0.5
vr_drums_secondary_model_scale: 0.5
demucs_model: v4 | htdemucs
segment: Default
overlap: 0.25
overlap_mdx: Default
overlap_mdx23: 8
shifts: 2
chunks_demucs: Auto
margin_demucs: 44100
is_chunk_demucs: False
is_chunk_mdxnet: False
is_primary_stem_only_Demucs: False
is_secondary_stem_only_Demucs: False
is_split_mode: True
is_demucs_combine_stems: True
is_mdx23_combine_stems: True
demucs_voc_inst_secondary_model: No Model Selected
demucs_other_secondary_model: No Model Selected
demucs_bass_secondary_model: No Model Selected
demucs_drums_secondary_model: No Model Selected
demucs_is_secondary_model_activate: False
demucs_voc_inst_secondary_model_scale: 0.9
demucs_other_secondary_model_scale: 0.7
demucs_bass_secondary_model_scale: 0.5
demucs_drums_secondary_model_scale: 0.5
demucs_pre_proc_model: No Model Selected
is_demucs_pre_proc_model_activate: False
is_demucs_pre_proc_model_inst_mix: False
mdx_net_model: UVR-MDX-NET Inst HQ 3
chunks: Auto
margin: 44100
compensate: Auto
denoise_option: None
is_match_frequency_pitch: True
phase_option: Automatic
phase_shifts: None
is_save_align: False
is_match_silence: True
is_spec_match: False
is_mdx_c_seg_def: False
is_invert_spec: False
is_deverb_vocals: False
deverb_vocal_opt: Main Vocals Only
voc_split_save_opt: Lead Only
is_mixer_mode: False
mdx_batch_size: Default
mdx_voc_inst_secondary_model: No Model Selected
mdx_other_secondary_model: No Model Selected
mdx_bass_secondary_model: No Model Selected
mdx_drums_secondary_model: No Model Selected
mdx_is_secondary_model_activate: False
mdx_voc_inst_secondary_model_scale: 0.9
mdx_other_secondary_model_scale: 0.7
mdx_bass_secondary_model_scale: 0.5
mdx_drums_secondary_model_scale: 0.5
is_save_all_outputs_ensemble: True
is_append_ensemble_name: False
chosen_audio_tool: Manual Ensemble
choose_algorithm: Min Spec
time_stretch_rate: 2.0
pitch_rate: 2.0
is_time_correction: True
is_gpu_conversion: True
is_primary_stem_only: False
is_secondary_stem_only: False
is_testing_audio: False
is_auto_update_model_params: True
is_add_model_name: False
is_accept_any_input: False
is_task_complete: False
is_normalization: False
is_wav_ensemble: False
is_create_model_folder: False
mp3_bit_set: 320k
semitone_shift: 0
save_format: MP3
wav_type_set: PCM_16
help_hints_var: True
set_vocal_splitter: No Model Selected
is_set_vocal_splitter: False
is_save_inst_set_vocal_splitter: False
model_sample_mode: False
model_sample_mode_duration: 30
demucs_stems: All Stems
mdx_stems: All Stems
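
The error message in the log above names `PYTORCH_ENABLE_MPS_FALLBACK=1` as a temporary workaround. In a standalone Python script (not the packaged UVR app), one way to apply it is to set the variable before PyTorch is imported. This is a minimal sketch; `enable_mps_fallback` is a hypothetical helper name, and setting the variable before the import is a precaution rather than a documented requirement.

```python
import os
import sys


def enable_mps_fallback(env=os.environ) -> bool:
    """Set PYTORCH_ENABLE_MPS_FALLBACK=1.

    Returns False if torch was already imported, in which case the
    setting may be too late to take effect (assumption, see lead-in).
    """
    env["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
    return "torch" not in sys.modules


if __name__ == "__main__":
    if not enable_mps_fallback():
        print("warning: torch was imported before the fallback was enabled")
    # import torch  # import PyTorch only after the variable is set
```

As the error message warns, ops that fall back to the CPU run slower than they would natively on MPS.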

sukualam commented 12 months ago

But I think MDX-Net can give better results, so maybe I will play with that for now.

Anjok07 commented 12 months ago

Thank you for reporting this! I forgot to fix the script for the Intel version. I'll upload a new patch in an hour.

Anjok07 commented 12 months ago

I fixed the Demucs v4 mps issue. Please download and install it again - https://github.com/Anjok07/ultimatevocalremovergui/releases/download/v5.6/Ultimate_Vocal_Remover_v5_6_MacOS_x86_64.dmg
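
The thread does not say what the fix changed. For anyone hitting the same `aten::_fft_r2c` error in their own scripts, one common pattern is to run only the unsupported op on the CPU and move the result back to the original device. The sketch below illustrates that pattern under that assumption; `run_on_cpu_if_mps` is a hypothetical helper, not part of UVR, Demucs, or PyTorch.

```python
def run_on_cpu_if_mps(op, x, *args, **kwargs):
    """Call op(x, ...) on the CPU when x lives on MPS, then move the result back.

    x is expected to behave like a torch.Tensor: it needs .device.type,
    .cpu(), and the result needs .to(device).
    """
    device = x.device
    if getattr(device, "type", None) != "mps":
        # Other devices are left alone.
        return op(x, *args, **kwargs)
    out = op(x.cpu(), *args, **kwargs)
    return out.to(device)
```

With PyTorch this could wrap a call such as `run_on_cpu_if_mps(torch.stft, x, n_fft, return_complex=True)`. Any tensor arguments passed alongside `x`, such as a `window`, would need to be on the CPU as well.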

sukualam commented 12 months ago

OK, I will try again tonight.

sukualam commented 12 months ago

OK, thanks. MDX-Net and v4 Demucs now work normally in UVR with MPS.

Maybe I will benchmark v4 Demucs in UVR against the original Demucs, because I think I noticed a speed difference but don't remember exactly.

sukualam commented 12 months ago

File: akad.m4a (8.4 MB / 04:18 minutes)

On UVR (v4 htdemucs) with MPS (all defaults, only "GPU Conversion" checked):

Process complete
Time Elapsed: 00:02:45

On the Demucs terminal version (python3 -m demucs -d mps akad.m4a):

100%|██████████████████████████████████████████████████████████████████████| 263.25/263.25 [01:20<00:00, 3.27seconds/s]

It looks like UVR is still splitting the work between CPU and GPU, while the terminal version does more on the GPU and less on the CPU, which may be what makes it faster.

sukualam commented 12 months ago

MDX-Net is faster than Demucs, by the way (same file, akad.m4a):

File 1/1 Loading model (UVR-MDX-NET-Inst_HQ_3)... Done!
File 1/1 Running inference... Done!
File 1/1 Saving Vocals stem... Done!
File 1/1 Saving Instrumental stem... Done!

Process complete
Time Elapsed: 00:01:34
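
To put the three timings above on one scale, they can be converted to a real-time factor (seconds of audio processed per second of wall time). This is a small sketch using only the numbers posted in this thread. The 263.25 s audio length is taken from the Demucs progress bar, and the run labels are just descriptive names.

```python
def to_seconds(clock: str) -> int:
    """Convert "HH:MM:SS" or "MM:SS" to a number of seconds."""
    total = 0
    for part in clock.split(":"):
        total = total * 60 + int(part)
    return total


def realtime_factor(audio_seconds: float, elapsed: str) -> float:
    """Seconds of audio processed per second of wall-clock time."""
    return audio_seconds / to_seconds(elapsed)


AUDIO_SECONDS = 263.25  # length reported by the Demucs progress bar

RUNS = {
    "UVR v4 htdemucs (MPS)": "00:02:45",
    "Demucs terminal (MPS)": "01:20",
    "UVR MDX-Net Inst HQ 3 (MPS)": "00:01:34",
}

if __name__ == "__main__":
    for name, elapsed in RUNS.items():
        factor = realtime_factor(AUDIO_SECONDS, elapsed)
        print(f"{name}: {to_seconds(elapsed)} s, {factor:.2f}x real time")
```

On these numbers the terminal run is roughly twice as fast as the UVR Demucs run, with MDX-Net in between. Each figure comes from a single run.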

sukualam commented 12 months ago

Anyway, it's all good for me now.

Anjok07 commented 12 months ago

This might be because UVR sets shifts to 2 by default and the terminal version has it set to 0. Try setting the shifts to 0 in the "Demucs Advanced Menu" and you should see it speed up.
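
To make the comparison like for like, the terminal version can be run with the same shifts value through the `--shifts` option of the Demucs command line. The sketch below only builds the command. `demucs_command` and `estimated_passes` are hypothetical helpers, and the cost model (one pass with no shifts, otherwise one pass per shift) is an assumption about how shifts affect run time, not something measured here.

```python
import shlex


def demucs_command(audio_path, device="mps", shifts=None):
    """Build the argument list for the Demucs CLI used earlier in this thread."""
    cmd = ["python3", "-m", "demucs", "-d", device]
    if shifts is not None:
        cmd += ["--shifts", str(shifts)]
    cmd.append(audio_path)
    return cmd


def estimated_passes(shifts: int) -> int:
    """Assumed number of model passes for a given shifts value."""
    return max(1, shifts)


if __name__ == "__main__":
    print(shlex.join(demucs_command("akad.m4a", shifts=2)))
    print("estimated passes:", estimated_passes(2))
```

Under that assumption, shifts set to 2 would roughly double the work, which is in line with the 2:45 versus 1:20 timings reported above.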

Anjok07 / ultimatevocalremovergui

v4 demucs model not running in mps (macos) #863