etzinis / sudo_rm_rf

Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features which enables a more efficient way of separating sources from mixtures.
MIT License

Obtained results of SI-SNRi=18.9? #2

Closed JusperLee closed 4 years ago

JusperLee commented 4 years ago

What kind of script can be used to make SI-SNRi reach 18.9?

etzinis commented 4 years ago

python run_improved_sudormrf.py --train WHAM --val WHAM --test WHAM --train_val WHAM --separation_task sep_clean --n_train 20000 --n_test 3000 --n_val 3000 --n_train_val 3000 --out_channels 512 --num_blocks 34 -cad 0 1 -bs 4 --divide_lr_by 3. --upsampling_depth 5 --patience 49 -fs 8000 -tags sudo_rm_rf_34 --project_name sudormrf_wham --zero_pad --clip_grad_norm 5.0 --model_type relu

JusperLee commented 4 years ago

thanks a lot

JusperLee commented 4 years ago

I can't reach 18.9 with these parameters. Do you have a log file I could use for reference? @etzinis

etzinis commented 4 years ago

What result are you getting? Are you using my code 100%?

etzinis commented 4 years ago

I double-checked the result with my implementation here and with an Asteroid implementation as well.

JusperLee commented 4 years ago

val_sisnri: 18.205505715329497. I used your code 100%.
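For readers comparing numbers such as val_sisnri, here is a minimal pure-Python sketch of the commonly used SI-SNR definition and its improvement over the unprocessed mixture. It is an illustration only, not the repository's evaluation code; the function names and the toy sine signals are hypothetical.

```python
import math


def _zero_mean(x):
    """Remove the mean so the measure ignores DC offsets."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between an estimate and a reference."""
    est = _zero_mean(estimate)
    ref = _zero_mean(reference)
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    # Project the estimate onto the reference to get the target component.
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise) + eps
    return 10.0 * math.log10(target_energy / noise_energy + eps)


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: gain of the estimate over using the raw mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)


# Toy signals at 8 kHz (hypothetical, for illustration only).
n = 800
reference = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(n)]
interferer = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(n)]
mixture = [r + i for r, i in zip(reference, interferer)]
# Pretend a separator removed 90% of the interferer's amplitude.
estimate = [r + 0.1 * i for r, i in zip(reference, interferer)]

print(f"SI-SNRi: {si_snr_improvement(estimate, mixture, reference):.2f} dB")
```

The value is in dB and is unaffected by rescaling the estimate, so any gap between two runs comes from separation quality rather than from output gain.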

etzinis commented 4 years ago

Also, you have to let it run for 200 epochs.

JusperLee commented 4 years ago

But my batch size is not 4; I set it to 16.

etzinis commented 4 years ago

So it's not the same...

JusperLee commented 4 years ago

OK, I will set the batch size to 4 and test again.

etzinis commented 4 years ago

This is the validation on the test set: [image]

etzinis commented 4 years ago

I used patience = 30 here, but this will not make a difference.

JusperLee commented 4 years ago

I will try again to see if I can achieve this result.

etzinis commented 4 years ago

Btw, how did you manage to fit a batch size of 16 on 2 GPUs, or did you use more?

JusperLee commented 4 years ago

With 8 GPUs; each GPU has a batch size of 2.
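To make the difference between the two setups concrete, here is a rough sketch of the effective batch size and the number of optimizer steps per epoch. It assumes that -bs 4 in the command above is the total batch size, that every batch triggers one optimizer step, and that all 20000 training examples (--n_train 20000) are seen each epoch. None of this was checked against the repository, and the helper names are hypothetical.

```python
import math


def global_batch_size(num_gpus, per_gpu_batch):
    """Total examples per optimizer step under data parallelism."""
    return num_gpus * per_gpu_batch


def steps_per_epoch(num_examples, batch_size):
    """Optimizer steps per epoch, assuming one step per batch."""
    return math.ceil(num_examples / batch_size)


N_TRAIN = 20000  # from --n_train 20000 in the command above

reference_bs = 4                        # assumed total batch size of the reference run
reported_bs = global_batch_size(8, 2)   # 8 GPUs x 2 examples each

reference_steps = steps_per_epoch(N_TRAIN, reference_bs)
reported_steps = steps_per_epoch(N_TRAIN, reported_bs)

print(f"reference: batch {reference_bs}, {reference_steps} steps/epoch")
print(f"reported:  batch {reported_bs}, {reported_steps} steps/epoch")
```

Under these assumptions the 8-GPU run takes four times fewer optimizer steps per epoch with the same learning rate, which is one plausible reason the two runs are not directly comparable.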

etzinis commented 4 years ago

Wow glad that you have the resources.

etzinis commented 4 years ago

Also make sure you have the latest version of the code: git pull

JusperLee commented 4 years ago

[image] Is this result normal?

etzinis commented 4 years ago

Yes, totally normal. You have not even reduced the learning rate for a second time and you are already at 18.2, so you will probably score better than the paper :P

JusperLee commented 4 years ago

That's great, thanks a lot.

JusperLee commented 4 years ago

Hello, I cannot reproduce the result in the paper (SI-SNRi = 18.9) by any method. Is there a more detailed trick that I am missing? @etzinis [image]

etzinis commented 4 years ago

How much are you getting?

JusperLee commented 4 years ago

SI-SNRi = 18.66

etzinis commented 4 years ago

First of all, if I remember correctly, you were running with a different batch size or something on multiple GPUs, so that's one difference.

Despite that, round(18.66, 1) = 18.7, so 18.9 - 18.7 = 0.2 dB (about 1% relative difference), which is neither statistically nor acoustically different by any means.
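The arithmetic above can be checked with a few lines of Python. This is only a sketch of the comparison as stated in this thread; the variable names are illustrative.

```python
paper_db = 18.9        # SI-SNRi reported in the paper
reproduced_db = 18.66  # SI-SNRi reported in this thread

# Round the reproduced value to one decimal, as the paper's number is given.
reproduced_rounded = round(reproduced_db, 1)

# Gap after rounding; round again to hide floating-point noise.
gap_db = round(paper_db - reproduced_rounded, 1)

# Gap relative to the paper's value, in percent.
relative_gap_pct = 100.0 * gap_db / paper_db

# Gap without rounding the reproduced value first, for comparison.
raw_gap_db = round(paper_db - reproduced_db, 2)

print(f"rounded: {reproduced_rounded} dB, gap: {gap_db} dB "
      f"({relative_gap_pct:.2f}% relative), raw gap: {raw_gap_db} dB")
```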

etzinis commented 4 years ago

There is no other trick besides running the code as specified in the paper.

JusperLee commented 4 years ago

Okay, then there is no problem.

JusperLee commented 4 years ago

Thank you very much for your answers.