sigsep / open-unmix-pytorch

Open-Unmix - Music Source Separation for PyTorch
https://sigsep.github.io/open-unmix/
MIT License
1.27k stars, 191 forks

Shape broadcasting error in norbert filter __init__.py while referring to v1.0.0 #111

Closed yugaljain1999 closed 2 years ago

yugaljain1999 commented 2 years ago

@faroit I am using v1.0.0 of this repo because it uses the NumPy version of the Wiener filter instead of the torch one, but I am hitting a broadcasting shape error with softmask=False at 'filter*x[...,None]' in the Wiener filter function. Could you please help me with this? It is very important for me to solve this error ASAP. What should I change when preprocessing the audio before feeding it into the separator function, or is it something else?
Please tell me how I can solve this error in v1.0.0, because our whole setup has to use that version of open-unmix-pytorch, not the latest release of this repo. I would really appreciate it if you could take a look at this issue soon. Thanks.

faroit commented 2 years ago

Not sure if I understand. Can you post a minimal working example that shows the error?

yugaljain1999 commented 2 years ago

@faroit I tried to run v1.0.0 by checking out commit 3f6a42110c4aa77cedd66770e20aecdcecbb348c

Version: PyTorch 1.10
Python: 3.7.4
OS: Windows 10
Dataset: MUSDB18-HQ, downloaded for evaluation from https://sigsep.github.io/datasets/musdb.html#musdb18-hq-uncompressed-wav

These are the commands I used to run the eval.py script on the uncompressed MUSDB18-HQ dataset with the pretrained 'umxhq' model:

    git clone https://github.com/sigsep/open-unmix-pytorch.git
    git checkout --progress --force 3f6a42110c4aa77cedd66770e20aecdcecbb348c
    python eval.py --outdir museval/results --evaldir museval/estimates --model umxhq --is-wav True --no-cuda True

I set softmask=False.

After running the above command, once the first iteration completed, I got the error below:

      0%|          | 0/50 [00:00<?, ?it/s]AUDIO SHAPE (9265664, 2)
    D:\Deep_Learning\Anaconda3\lib\site-packages\torch\functional.py:1069: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ..\aten\src\ATen\native\TensorShape.cpp:2157.)
      return _VF.cartesian_prod(tensors)  # type: ignore[attr-defined]
    100%|████████████████████████████████████████████████████████████████████████| 4/4 [18:12<00:00, 273.22s/it]
    ███████████████████████████| 4/4 [18:12<00:00, 277.32s/it]

    Traceback (most recent call last):
      File "eval.py", line 157, in <module>
        device=device
      File "eval.py", line 30, in separate_and_evaluate
        device=device
      File "D:\audio-sep2\open-unmix-pytorch\test.py", line 165, in separate
        use_softmask=softmask)
      File "D:\Deep_Learning\Anaconda3\lib\site-packages\norbert\__init__.py", line 286, in wiener
        y[...,0,:] = v*(np.cos(angle)[...,None])
    ValueError: operands could not be broadcast together with shapes (1,9265664,2,4) (9049,2049,1,1)

Hoping to hear from you soon. Thanks

faroit commented 2 years ago

@yugaljain1999 you would need to run that version with the respective torch environment, that version had

yugaljain1999 commented 2 years ago

@faroit Thanks for your reply, but I have tried that as well and encountered the same error :( Is there any other solution? Could you please try it yourself? You will clearly see it.

faroit commented 2 years ago

In your first comment I see that you were using torch 1.10. That's not the version that was used for open-unmix 1.0.0

yugaljain1999 commented 2 years ago

That's right @faroit, for v1.0.0 you specified PyTorch 1.2 and torchaudio 0.3 on Linux. But even with those versions I am still getting the same error, so please suggest what I should do now. If you can install that environment on your PC and test, you will be able to observe the error.

Hoping to hear back from you soon :)

aliutkus commented 2 years ago

It looks to me like you are trying to multiply a waveform with something that lives in the frequency domain: (1,9265664,2,4) looks like a waveform. Is v really a spectrogram here?
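The mismatch can be reproduced in isolation with plain NumPy. The snippet below is only a sketch with scaled-down, hypothetical sizes; it does not call norbert and makes no claim about how `v` was built in the reporter's run. It shows that an array laid out as (1, samples, channels, sources) cannot be broadcast against one laid out as (frames, bins, 1, 1), while a spectrogram-shaped (frames, bins, channels, sources) array can.

```python
import numpy as np

# Scaled-down stand-ins for the sizes in the traceback (hypothetical values).
n_samples, n_frames, n_bins, n_channels, n_sources = 64, 9, 5, 2, 4

# Shape pattern of the first operand in the error: (1, samples, channels, sources).
v_waveform_like = np.ones((1, n_samples, n_channels, n_sources))
# Shape pattern of the second operand: (frames, bins, 1, 1).
phase_term = np.ones((n_frames, n_bins, 1, 1))

try:
    v_waveform_like * phase_term
    mismatch = False
except ValueError as exc:
    # Axis 1 has sizes 64 and 5; neither is 1, so broadcasting fails.
    mismatch = True
    print("broadcast failed:", exc)

# A spectrogram-shaped v: (frames, bins, channels, sources).
v_spectrogram = np.ones((n_frames, n_bins, n_channels, n_sources))
result = v_spectrogram * phase_term
print(result.shape)  # (9, 5, 2, 4)
```

Printing `v.shape` and the mixture STFT's shape right before the Wiener filter call would be the equivalent check in the real pipeline.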

yugaljain1999 commented 2 years ago

Yes, v is the targets' spectrogram here.
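As a sanity check on the shapes in the traceback: the second operand, (9049, 2049, 1, 1), is what one would expect for an STFT of the 9265664-sample track, whereas the first operand still carries the raw sample count on its second axis. The sketch below assumes a centered, one-sided STFT with `n_fft=4096` and a hop of 1024; these settings are assumptions for illustration and are not stated in the thread, and `stft_shape` is a hypothetical helper.

```python
def stft_shape(n_samples, n_fft=4096, hop=1024):
    """Frame and bin counts for a centered, one-sided STFT (assumed settings)."""
    n_frames = 1 + n_samples // hop  # centered framing: one frame per hop, plus one
    n_bins = n_fft // 2 + 1          # one-sided spectrum of a real-valued signal
    return n_frames, n_bins

# Sample count taken from "AUDIO SHAPE (9265664, 2)" in the log above.
print(stft_shape(9265664))  # (9049, 2049)
```

Under these assumptions a spectrogram-shaped `v` would have 9049 frames and 2049 bins on its first two axes, not 1 and 9265664.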

faroit commented 2 years ago

@yugaljain1999 I tested v1.0.0 on my machine. It works fine, but you have to make sure that you use a local model.pth file or modify the torch hub loader so that it uses v1.0.0 from GitHub:

diff --git a/test.py b/test.py
index 00873a8..74902b8 100644
--- a/test.py
+++ b/test.py
@@ -28,7 +28,7 @@ def load_model(target, model_name='umxhq', device='cpu'):
             err = io.StringIO()
             with redirect_stderr(err):
                 return torch.hub.load(
-                    'sigsep/open-unmix-pytorch',
+                    'sigsep/open-unmix-pytorch:v1.0.0',
                     model_name,
                     target=target,
                     device=device,
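
For reference, the same pinning can be done outside of test.py. This is a minimal sketch, not part of the repository: `pinned_hub_repo` and `load_v1_model` are hypothetical helper names, the `target` and `device` keyword arguments are taken from the diff above, and the call needs network access plus a `v1.0.0` tag on GitHub, as the diff implies. It relies on `torch.hub.load` accepting a `owner/name:ref` string to select a tag or branch.

```python
def pinned_hub_repo(repo, ref):
    """Build an 'owner/name:ref' string of the form torch.hub.load accepts."""
    return f"{repo}:{ref}"


def load_v1_model(target, model_name="umxhq", device="cpu"):
    """Load a pretrained model from the v1.0.0 tag (hypothetical helper)."""
    # Imported lazily so pinned_hub_repo can be used without torch installed.
    import torch

    return torch.hub.load(
        pinned_hub_repo("sigsep/open-unmix-pytorch", "v1.0.0"),
        model_name,
        target=target,
        device=device,
    )


print(pinned_hub_repo("sigsep/open-unmix-pytorch", "v1.0.0"))
```

Calling `load_v1_model("vocals")` would then fetch the hub entry point from the tagged code instead of the default branch.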