fgnt / pb_bss

Collection of EM algorithms for blind source separation of audio signals
MIT License

Evaluation metric name alias #28

Closed by xixihahaggg 3 years ago

xixihahaggg commented 3 years ago

Hi, I'm a little confused about the SDR in the output metrics. The paper https://arxiv.org/pdf/1811.02508.pdf refers to the original SNR. Does the SDR in this toolkit also refer to that SNR, or does it refer to SD-SDR?

boeddeker commented 3 years ago

Hi, I am not sure what exactly you mean, because we have nothing called just SDR. I guess you mean either mir_eval_sdr or invasive_sdr.

Our mir_eval_sdr uses mir_eval.separation.bss_eval_sources from the package mir_eval. So mir_eval_sdr uses a Python port of Vincent's SDR implementation [3] (the original is MATLAB). If I remember correctly, its long-form name is BSSEval sources SDR v3 (I think it was version 3).

If you want to check the equations, see our paper [2], Eq. 6 (in my opinion, we used a much simpler description than the original work [3] or [1]).

We have no implementation of SD-SDR, essentially because we don't need it (I haven't seen a second paper that uses it). In [1] the authors worked on single-channel non-reverberant data, and from Table 1 you can conclude that BSSEval SDR and SI-SDR measure the same thing on non-reverberant data, so both can be used. We often work on multichannel reverberant data. On such data, the SI-SDR has no unique definition (i.e. it is not clear what the target is), and a comparison between different systems is difficult because they need different target definitions.
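To make the point about the target concrete, here is a minimal sketch of SI-SDR following the definition in [1], written in plain Python. It is not code from pb_bss; the toy signals, the one-tap "reverberant image", and all names are assumptions made up for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def si_sdr(reference, estimate):
    """SI-SDR in dB: project the estimate onto the reference, as in [1]."""
    scale = dot(estimate, reference) / dot(reference, reference)
    target = [scale * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]
    return 10 * math.log10(dot(target, target) / dot(residual, residual))


# Toy signals (hypothetical, for illustration only).
num_samples = 1000
dry = [math.sin(0.1 * n) for n in range(num_samples)]
# Crude "reverberant image": the dry source plus one delayed, attenuated copy.
delay = 5
image = [dry[n] + (0.5 * dry[n - delay] if n >= delay else 0.0)
         for n in range(num_samples)]
disturbance = [0.1 * math.cos(0.37 * n) for n in range(num_samples)]
estimate = [x + d for x, d in zip(image, disturbance)]

# Rescaling the estimate does not change the SI-SDR.
half_estimate = [0.5 * e for e in estimate]
print(si_sdr(image, estimate), si_sdr(image, half_estimate))

# The value depends on which signal is declared to be the target.
print(si_sdr(dry, estimate), si_sdr(image, estimate))
```

The same estimate gets a noticeably different score depending on whether the dry source or the reverberant image is passed as the reference, which is why scores from systems that assume different targets are hard to compare.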

invasive_sdr is an SDR that is motivated by LTI systems. You can see the definition in [2], Eq. 7. For a long time it was not published, so maybe no one except us uses it.

[1] SDR – HALF-BAKED OR WELL DONE? https://arxiv.org/pdf/1811.02508.pdf
[2] SMS-WSJ https://arxiv.org/pdf/1910.13934.pdf
[3] E. Vincent, R. Gribonval, and C. Fevotte, "Performance measurement in blind audio source separation"

xixihahaggg commented 3 years ago

Great! Thank you for your explanation! To my understanding, the SDR is equivalent to the usual SNR (signal-to-noise ratio), and the SNR (sources-to-noise ratio) as defined in [3], which adds the interference error term to the numerator, is seldom used.

boeddeker commented 3 years ago

SDR means signal-to-distortion ratio, and for source separation the distortion is the interference (i.e. the cross-talker) plus noise. In many scenarios (e.g. noise suppression), it is equivalent to SNR.
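A toy illustration of that statement, assuming the target, cross-talker, and noise components are known exactly (in practice they have to be estimated, which is what the different SDR variants do). The signals and names are hypothetical and not part of pb_bss.

```python
import math


def power(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)


def ratio_db(signal, distortion):
    """Power ratio of signal to distortion in dB."""
    return 10 * math.log10(power(signal) / power(distortion))


# Hypothetical oracle components of one separated output.
num_samples = 1000
target = [math.sin(0.1 * n) for n in range(num_samples)]
cross_talker = [0.3 * math.sin(0.23 * n) for n in range(num_samples)]
noise = [0.05 * math.cos(0.71 * n) for n in range(num_samples)]

# Source separation: the distortion is interference plus noise.
distortion = [c + v for c, v in zip(cross_talker, noise)]
sdr_separation = ratio_db(target, distortion)
snr = ratio_db(target, noise)

# Noise suppression: no cross-talker, so the distortion is the noise alone.
silence = [0.0] * num_samples
distortion_no_cross_talker = [c + v for c, v in zip(silence, noise)]
sdr_noise_suppression = ratio_db(target, distortion_no_cross_talker)

print(sdr_separation, snr, sdr_noise_suppression)
```

With a cross-talker present the SDR is lower than the SNR; once the cross-talker is removed the two numbers coincide.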

Calling it just SDR or SNR is imprecise, especially when different SDR or SNR values are reported. For example, SI-SDR and BSSEval SDR are both ways to estimate an SDR.

The SNR in [3] is defined for source separation. I don't remember a publication in which someone reported this value. People usually report the SDR from [3], and when they have a problem without a cross-talker they may report that SDR and rename it to SNR.
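For reference, a sketch of the two definitions from [3], written from memory and worth checking against the paper. The estimate is decomposed into a target part and three error terms (interference, noise, artifacts); the symbol names below follow that decomposition.

```latex
\hat{s} = s_{\text{target}} + e_{\text{interf}} + e_{\text{noise}} + e_{\text{artif}}

\mathrm{SDR} = 10 \log_{10}
  \frac{\lVert s_{\text{target}} \rVert^2}
       {\lVert e_{\text{interf}} + e_{\text{noise}} + e_{\text{artif}} \rVert^2}
\qquad
\mathrm{SNR} = 10 \log_{10}
  \frac{\lVert s_{\text{target}} + e_{\text{interf}} \rVert^2}
       {\lVert e_{\text{noise}} \rVert^2}
```

The SNR of [3] keeps the interference term in the numerator, so it differs from the everyday signal-to-noise ratio whenever a cross-talker is present.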