craffel / mir_eval

Evaluation functions for music/audio information retrieval/signal processing algorithms.
MIT License
588 stars 109 forks source link

tempo: why is tol=0.0 not allowed #300

Closed hendriks73 closed 5 years ago

hendriks73 commented 5 years ago

The current code does not allow zero tolerance, i.e. tol=0.0. I don't see any reason why. Is there one?

But there is a practical use-case for allowing 0.0:

When creating a graph that shows p-score over tolerance, it makes a lot of sense to start with tol=0.0 and not tol=1e-12. I'd like to change this to make working with mir_eval as a library a little easier.

bmcfee commented 5 years ago

I think the reasoning here is that floating point comparisons are almost never that precise, and allowing tol=0 might suggest to a naive user that exact comparison is reliable. I don't see much point in having tolerance below machine epsilon, but I think I also don't completely understand your proposed use-case. Can you elaborate on that?

hendriks73 commented 5 years ago

I'm interested in graphing how tolerance affects system performance. Therefore I create graphs like this (axis labels have wrong unit):

screen shot 2019-01-08 at 16 35 27

And for that purpose it's convenient (not necessary) to have a value for tol=0.0.

I think the reasoning here is that floating point comparisons are almost never that precise, and allowing tol=0 might suggest to a naive user that exact comparison is reliable.

Do naive users really use mir_eval?

I don't see much point in having tolerance below machine epsilon

One could just as well argue that there is not point in having machine epsilon tolerance. Either seems arbitrary.

bmcfee commented 5 years ago

Do naive users really use mir_eval?

I count myself as naive most of the time, so yes. :grin: What I really mean here is that we should adhere to the principle of least surprise, and floating point can be full of surprises when we start asking for strict equivalence.

One could just as well argue that there is not point in having machine epsilon tolerance. Either seems arbitrary.

Epsilon as a tolerance doesn't seem arbitrary: it's explicitly acknowledging the machine's limitations in determining whether two numbers are the same given the representation.

At any rate, I don't feel too strongly about this point, but I prefer to err on the side of caution when dealing with floating point comparisons.

hendriks73 commented 5 years ago

Well, I don't think any harm would be done, if we allowed zero tolerance.

From the top of my head I cannot imagine any surprises this might produce. But I can tell you for sure that I was surprised when I realized that mir_eval does not allow zero tolerance for tempo.

It would allow me to (eventually, after release) use mir_eval code, which I'd love to do.

stefan-balke commented 5 years ago

It would allow me to (eventually, after release) use mir_eval code, which I'd love to do.

Well, you could still use it and maintain a separate fork. :)

Why not putting this as an extra flag? I can see a use case where you compare 2 reference annotations made with the same tool with each other.

Maybe this is the easiest fix: Instead of throwing a ValueError you could make it a Warning such as: "Wait a sec, you are using 0.0 tolerance...are you sure you want to do this?". Or add this as a separate warning and allow tolerance to be within [0,1].

bmcfee commented 5 years ago

Instead of throwing a ValueError you could make it a Warning such as: "Wait a sec, you are using 0.0 tolerance...are you sure you want to do this?". Or add this as a separate warning and allow tolerance to be within [0,1].

I like the idea of allowing exact 0, but warning that it's usually not what you want.

craffel commented 5 years ago

I don't actually see a compelling reason now to not allow 0. The argument "zero doesn't always mean zero" would apply to a lot more places than just the tolerance in tempo, unless I'm missing something.

bmcfee commented 5 years ago

I don't actually see a compelling reason now to not allow 0. The argument "zero doesn't always mean zero" would apply to a lot more places than just the tolerance in tempo, unless I'm missing something.

That's true, and I think it should be a general rule that floating point tolerances not be strictly 0.

Here's a simple example, adapted from this page:

In [12]: a = 0.15 + 0.15                                                                                                

In [13]: b = 0.1 + 0.2                                                                                                  

In [14]: abs(a - b) <= 0.0                                                                                              
Out[14]: False
craffel commented 5 years ago

Of course, but my point is that it's sort of infeasible to enforce this rule in general, so it seems inconsistent to enforce it here. Like, np.equal allows you to use floating point numbers (and doesn't warn you when you do), which is potentially dangerous.

bmcfee commented 5 years ago

Like, np.equal allows you to use floating point numbers (and doesn't warn you when you do), which is potentially dangerous.

Agreed; but we're not generally testing for equality here, we're testing for a value being below a threshold. In the case of the threshold = 0, those should be the same thing, but that's not always going to be the case because of precision errors. I think it's safer to be aware of that and expose it (even as a warning) than to blindly let it pass through and produce misleading results.

craffel commented 5 years ago

By why is zero special? Floating point issues can happen regardless of the specific value of tol.

lostanlen commented 5 years ago

@hendriks73:

Do naive users really use mir_eval?

🙋🏻‍♂️

@bmcfee just jinxed me at the example of 0.15 + 0.15 vs 0.1 + 0.2. I think that this issue is best resolved by a note in the docs, rather than by the addition of a new feature.

hendriks73 commented 5 years ago

By why is zero special? Floating point issues can happen regardless of the specific value of tol.

Exactly. Allowing machine epsilon but not 0 is entirely arbitrary.

The best argument so far against it IMHO is Brian's original argument:

allowing tol=0 might suggest to a naive user that exact comparison is reliable.

But I still believe that allowing zero causes no harm.

If we want to protect every user from misunderstanding floating point arithmetic, we probably need to disallow the float type entirely.

craffel commented 5 years ago

Wouldn't the "correct" solution here be to replace a <= b with np.logical_or(a < b, np.allclose(a, b))?

stefan-balke commented 5 years ago

Well, I think we are discussing things which we can't easily solve (or not at all?). IMHO, allowing "0" is fine plus making the user aware of it. Similar to numpy's "divide by zero encountered in divide"...

hendriks73 commented 5 years ago

I've created a simple PR (#301) with @stefan-balke 's suggestion.