Open avis-ma opened 5 years ago
I am getting the same result - all metrics are 0. @avis-ma Did you solve this?
EDIT: This bug (if it is indeed a bug; it would be great if the authors could confirm) seems to come from line 129 in `dataset/base.py`:
```python
audio_raw *= (2.0**-31)
```
According to the authors' comment, this line is supposed to normalize the output of `torchaudio.load()` to the range [-1, 1]. However, `torchaudio.load()` already performs this normalization itself (see https://pytorch.org/audio/#torchaudio.load). As a result, line 129 effectively turns everything in `audio_raw` into zeros.

This all-zero `audio_raw` makes all the metrics zero as well: later, in `calc_metrics()` in `main.py`, line 150 checks whether the ground-truth audio is all-zero, and if so, skips the metric calculation entirely. Since the audio loaded from the dataset is always all-zero, this check always fires and the metrics are never computed.
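To make the double-scaling effect concrete, here is a minimal sketch (using NumPy in place of `torchaudio` to keep it self-contained; the waveform is random data standing in for a real recording) of why applying the `2.0**-31` factor to an already-normalized signal zeroes it out:

```python
import numpy as np

# torchaudio.load() already returns float samples normalized to [-1, 1].
# Simulate such a waveform with one second of random audio at 16 kHz:
audio_raw = np.random.uniform(-1.0, 1.0, size=16000).astype(np.float32)

# The extra scaling in dataset/base.py (line 129) assumes the samples are
# still raw 32-bit integers, so it scales by 2**-31 a second time:
scaled = audio_raw * (2.0 ** -31)

# The result is numerically indistinguishable from silence: every sample
# now has magnitude at most 2**-31 (about 4.7e-10).
print(np.abs(scaled).max())
print(np.allclose(scaled, 0.0, atol=1e-8))  # True
```

Any downstream all-zero check with a reasonable tolerance (like the one in `calc_metrics()`) will then treat the signal as silence.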
The fix is simple: comment out line 129 in `dataset/base.py`. After doing so, I got something like this:
```
[Eval Summary] Epoch: 0, Loss: 0.2974, SDR_mixture: 1.4887, SDR: 3.9951, SIR: 9.2085, SAR: 10.6352
Plotting html for visualization...
```
Thanks @ngmq, the data scale really matters. However, the training process still cannot converge: after 2 training epochs, the loss hovers around some value, say 0.20, and the two predicted masks are also similar to each other. Did you encounter this problem before?
@zjsong That did not happen to me; my training went fine. Maybe checking the input data would help?
@ngmq Thanks for your reply. I just found that if the training process runs for enough steps (e.g., >25 epochs), it eventually shows promising results.
Hello, I am a Chinese student. I have pre-processed the dataset and used `train_MUSIC.sh` to train the default model, but the result is not what I expected: the metrics are all 0. Even when I directly run `eval_MUSIC.sh` (with the trained model I downloaded), I still get all-zero metrics (SDR, SIR, etc.). I haven't changed the code you published on GitHub. How can I find out what the problem is?