Open bmcfee opened 2 weeks ago
@bmcfee thanks for the initiative. Before we merge this, I would guess we should get some feedback from the community. museval might be a good fit as its an implementation of the bsseval images
(unlike _sources
) algorithm but it might needs some work to make it future proof as well: wrt to the precision issue, I also had to lower the regression tests tolerance https://github.com/sigsep/sigsep-mus-eval/pull/93/files#diff-6edc05e4134fe68983d08319a22f62d5e4efdd829656595282cf6ef97011c13aL90 myself (testing against older version of museval with older numpy version).
Here are some people that I would def value their opinion on how to proceed with the sdr metrics:
@rabitt @fakufaku @iver56 @adefossez
Attention: Patch coverage is 83.33333%
with 2 lines
in your changes missing coverage. Please review.
Project coverage is 86.66%. Comparing base (
485a425
) to head (da5bf81
).
Files | Patch % | Lines |
---|---|---|
mir_eval/util.py | 71.42% | 2 Missing :warning: |
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Thanks. Yes, let's disable the tests too.
Done.
museval might be a good fit as its an implementation of the bsseval
images
(unlike_sources
) algorithm but it might needs some work to make it future proof as well: wrt to the precision issue, I also had to lower the regression tests tolerance https://github.com/sigsep/sigsep-mus-eval/pull/93/files#diff-6edc05e4134fe68983d08319a22f62d5e4efdd829656595282cf6ef97011c13aL90 myself
I think the precision issue will always be there as long as we're relying on unregularized least squares; even moreso if we're dynamically switching between solve() and lstsq(), as mir_eval and museval both do. This dynamic behavior, IIRC, was intended to mimic what matlab's \ operator does under the hood, and it's always seemed troublesome to me. Museval's implementation at least puts a ridge on the solve(), so it really shouldn't ever hit the lstsq path, but as I noted in the other PR thread #381 , the exception handling there isn't quite correct anymore.
My not-so-humble opinion is that all of these implementations should probably be scrapped and rewritten :sweat_smile: But if we're going to do that (as a community), it should only happen once, and in a dedicated package.
Side note: failures on 3.12 builds are due to matplotlib removeing BrokenBarHCollection. Will fix that in a separate PR.
Is there a way to tell codecov not to worry about the separation file's lines so it hates us less?
Jumping in the discussion here. I have written a fast bss_eval_sources
package (fast_bss_eval). At the time, the fast routine was quite faster than the sigsep/bsseval implementation. It also supports numpy/torch, fp32/fp64, cpu/gpu, and solve
or conjugate GD.
I think the implementation ideas are applicable to the _images
routine for which the savings are potentially even larger.
edit: paper link
I'm okay with this. I don't calculate SIR, SAR and ISR anymore when evaluating source separation models/algorithms. I still calculate SDR (with a stripped fork of sigsep/bsseval), but I mostly don't use it. I mainly rely on logWMSE in combination with a few others.
Is there a way to tell codecov not to worry about the separation file's lines so it hates us less?
I've added separation to the omit section, but it might take a merge before codecov stops comparing to old results that did include it.
Side note: failures on 3.12 builds are due to matplotlib removing BrokenBarHCollection. Will fix that in a separate PR.
Actually just rolled it into this one - all that needed to happen was to remove an import tweak a docstring, no actual code changes.
This PR deprecates the separation module.
Question for @faroit - is sigsep-museval the best replacement to recommend here?
Question for @craffel - should we disable tests on the separation module? They've always been finnicky and the vast majority of build time. If we're deprecating the module anyway, I could see some benefit to disabling these tests.