LatticeQCD / SIMULATeQCD

SIMULATeQCD is a multi-GPU Lattice QCD framework that makes it easy for physicists to implement lattice QCD formulas while still providing competitive performance.
https://latticeqcd.github.io/SIMULATeQCD/

Best way to compare floats? #29

Open clarkedavida opened 2 years ago

clarkedavida commented 2 years ago

In gcomplex.h there is a TODO about float comparisons in the following lines of code:

    /// Note: You should not use this operator to compare with zero, because cmp_rel breaks down in that case.
    __host__ __device__ bool operator==( const GPUcomplex& op )
    {
        ////TODO:: THAT PRECISION HAS TO BE CHANGED!!
        return (cmp_rel(this->c.x, op.c.x, 1.e-6, 1.e-6) && cmp_rel(this->c.y, op.c.y, 1.e-6, 1.e-6));
        //  return (isApproximatelyEqual(this->c.x, op.c.x, 1.e-14) && isApproximatelyEqual(this->c.y, op.c.y, 1.e-14));
    }

This is not a trivial issue, see e.g. https://stackoverflow.com/questions/17333/what-is-the-most-effective-way-for-float-and-double-comparison

We should agree on one way to compare floats, maybe using std::numeric_limits or something.
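For concreteness, here is a minimal sketch of what a std::numeric_limits-based comparison could look like; the name isApproxEqual, the tolerance defaults, and the max-magnitude scaling are placeholders for illustration, not existing SIMULATeQCD code:

    #include <algorithm>
    #include <cmath>
    #include <limits>

    // Illustrative only: relative test scaled by the larger magnitude, with an
    // absolute floor so that comparisons against (near) zero do not break down.
    template <typename floatT>
    bool isApproxEqual(floatT a, floatT b,
                       floatT relTol = floatT(4) * std::numeric_limits<floatT>::epsilon(),
                       floatT absTol = std::numeric_limits<floatT>::min()) {
        const floatT diff = std::fabs(a - b);
        if (diff <= absTol) return true;  // covers a == b == 0
        return diff <= relTol * std::max(std::fabs(a), std::fabs(b));
    }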

goracle commented 2 years ago

A few comments, to carry over the discussion on #84:

I think having some overflow guard like auto norm = std::min((std::abs(a) + std::abs(b)), std::numeric_limits<floatT>::max()); (from the linked stackoverflow answer) wouldn't be a bad check, unless there's some reason to think that the runtime environment will catch overflows.
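As a rough sketch of how that guard could be folded in (cmp_rel_guarded is a made-up name, and the comparison form is only modelled on the discussion in this thread, not copied from the repository):

    #include <algorithm>
    #include <cmath>
    #include <limits>

    // Illustrative only: clamp the combined magnitude so that abs(a)+abs(b)
    // cannot overflow to inf before it enters the relative test.
    template <typename floatT>
    bool cmp_rel_guarded(floatT a, floatT b, floatT rel, floatT prec) {
        const floatT norm = std::min(std::fabs(a) + std::fabs(b),
                                     std::numeric_limits<floatT>::max());
        return floatT(2) * std::fabs(a - b) < rel * norm
               && std::fabs(a - b) < prec;
    }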

Given that we will have something like FLT_MIN as the precision floor in cmp_rel (adapted to floatT), why give the user of cmp_rel a prec option at all? I think this is basically what @clarkedavida is suggesting.

clarkedavida commented 2 years ago

There are a couple of places where I think we need the precision option on cmp_rel.

Sometimes we compare some result of SIMULATeQCD against test values, but SIMULATeQCD is running at higher precision than the test values. I remember this happened when I was porting over the SU(3) heatbath and overrelaxation methods: I wanted to implement a link-by-link test, but the old code only gave output in single precision.

I think there are also cases where the achievable precision depends on the algorithm being used. For example, parts of an algorithm may be hard-coded in single precision (maybe for some performance reason), which in general may prevent agreement up to the double floor even when the rest of the algorithm runs in double.

goracle commented 2 years ago

cmp_rel as written now only allows comparisons between two variables of the same type T. I think there could be a version of cmp_rel like you are describing that compares two different types, but then where should we set the precision floor? If type A has higher precision than type B, then setting the floor at A's precision (converting B to A, which pads B with fake significant figures of zero) may report a larger relative difference than would actually be there had the B-type variable been type A the whole time. If we set the floor at B's precision, we do discard some of A's significant figures, but the result is a comparison we know is good up to B's precision. The latter seems the safer of the two options, and it also follows the usual rules for significant figures.

I think this could be implemented via a comparison between FLT_MIN of A and FLT_MIN of B: if FLT_MIN of B is greater than FLT_MIN of A, we know B is less precise, and we do an internal conversion of A to B before running the usual comparison. This could be implemented as an optional template parameter T2 with default value T.
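A hedged sketch of that idea, assuming a simple same-type comparison to delegate to; the names cmp_rel_sketch/cmp_rel_mixed and the use of epsilon as the precision measure are placeholders, not the actual cmp_rel:

    #include <cmath>
    #include <limits>
    #include <type_traits>

    // Placeholder same-type comparison standing in for cmp_rel.
    template <typename T>
    bool cmp_rel_sketch(T a, T b, T rel, T prec) {
        return T(2) * std::fabs(a - b) < rel * (std::fabs(a) + std::fabs(b))
               && std::fabs(a - b) < prec;
    }

    // Optional second template parameter T2 with default value T, as suggested
    // above: both operands are converted to the *less* precise of the two types
    // (the one with the larger epsilon) before the usual comparison runs.
    template <typename T, typename T2 = T>
    bool cmp_rel_mixed(T a, T2 b, double rel, double prec) {
        using lowT = typename std::conditional<
            (std::numeric_limits<T>::epsilon() > std::numeric_limits<T2>::epsilon()),
            T, T2>::type;
        return cmp_rel_sketch(static_cast<lowT>(a), static_cast<lowT>(b),
                              static_cast<lowT>(rel), static_cast<lowT>(prec));
    }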

On the other hand, perhaps the user should not be allowed to discard A's sigfigs blindly (since the return value is just a bool). Perhaps it would be better to force the user to convert A to B before the comparison, so the user is aware at what precision/significance the comparison is being performed.

Either way, it seems like rel can/should be set via the FLT_MIN of the input types, as opposed to a user-specified parameter.

clarkedavida commented 2 years ago

That solution makes sense, and it would address the first situation I gave.

I'm not yet sure whether it would address the second situation. If I compute a bunch of stuff in float, then later use that float to compute a double, I won't get the same answer every time I do that computation, at least not up to the double floor. (At least that's my understanding.) I didn't implement our RHMC, but taking a not-very-careful glance at it, I remember seeing some things calculated in different precisions in the integrator...

A more trivial example the proposed solution wouldn't address is this one: some of our tests take control values that were printed to screen by some older code. (One example is the Polyakov loop correlator code, which I ported from code written by Olaf for his master's thesis.) If the old code was run in double, but the printed output is taken as the control, then you are comparing two doubles, yet what the test computes will not agree with the control at the double floor. This could be fixed by recompiling and rerunning the old code, taking the exact values, etc., but this was done enough times by enough people that I'm not sure we could get everyone to coordinate on that...

If we keep cmp_rel, but with its prec set to some default value that is appropriate based on the types being compared, we prevent users from running into the problem you ran into, while still allowing the option to have a less stringent comparison in some cases.
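A sketch of that compromise (the default values below are placeholders chosen for illustration, not the repository's actual defaults): keep rel and prec as parameters but give prec a type-based default, so callers only loosen it deliberately:

    #include <cmath>
    #include <limits>

    // Illustrative only: prec defaults to a type-appropriate floor (a few
    // epsilons as a placeholder), but can still be loosened explicitly, e.g.
    // when comparing against single-precision control values in a test.
    template <typename floatT>
    bool cmp_rel_defaulted(floatT a, floatT b,
                           floatT rel  = floatT(1.e-6),
                           floatT prec = floatT(10) * std::numeric_limits<floatT>::epsilon()) {
        return floatT(2) * std::fabs(a - b) < rel * (std::fabs(a) + std::fabs(b))
               && std::fabs(a - b) < prec;
    }

    // For double inputs x and y:
    // cmp_rel_defaulted(x, y);               // stringent, type-based floor
    // cmp_rel_defaulted(x, y, 1.e-6, 1.e-6); // explicitly less stringent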

What do you all think?

goracle commented 2 years ago

Never mind, I understand now why that option exists; your second example is clear.

On the first example, it seems like you start with a float's worth of sigfigs, so propagating that to the end you should only use cmp_rel with T=float. The digits near the double floor should not be significant, and so might vary depending on the implementation. Of course, if it is somehow losing significant digits beyond that point (all the way into the float) during the computation, that would indeed be another reason to have a user specified prec.

I mistakenly said rel should be FLT_MIN in my above comment, but I meant prec.

goracle commented 2 years ago

I want to comment on one of Rasmus' proposed solutions and then I'll stop talking for a while:

    return (2.0*abs(a-b) < rel*(abs(a)+abs(b)) + 1e-37 && abs(a-b) < prec)    (1)

If 1e-37 is instead set to the max of FLT_MIN and prec, I think it captures what we want:

    return (2.0*abs(a-b) < rel*(abs(a)+abs(b)) + max(FLT_MIN, prec))    (2)

In eq. 1 above, it seems like we could have a diff that is relatively small (compared to a+b) but much larger than prec (which looks at the absolute value), in which case we'd get false even though we asked for a relative comparison in the first place. The way it is set up now, prec is the maximum absolute difference allowed between a and b, regardless of the size of a and b. This puts somewhat of a ceiling on the size of the numbers we can compare with cmp_rel. While prec=1e-6 does likely make for a stringent precision ceiling, we should probably not use a default near 1e-6. Instead, we should look towards FLT_MAX, perhaps adding on to eq. 2

    || abs(a)+abs(b) > FLT_MAX

to get

    return (2.0*abs(a-b) < rel*(abs(a)+abs(b)) + max(FLT_MIN, prec)) || abs(a)+abs(b) > FLT_MAX    (3)

in keeping with the idea of returning true when we start to face precision loss.
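Spelled out as code, eq. (3) would read roughly as below, with FLT_MIN/FLT_MAX replaced by their std::numeric_limits equivalents for a generic floatT; this is only a transcription of the proposal, not tested against the existing cmp_rel:

    #include <algorithm>
    #include <cmath>
    #include <limits>

    // Transcription of eq. (3): relative test with a precision floor of
    // max(FLT_MIN, prec), returning true outright once the combined magnitude
    // exceeds the representable ceiling (i.e. has overflowed to inf).
    template <typename floatT>
    bool cmp_rel_eq3(floatT a, floatT b, floatT rel, floatT prec) {
        const floatT floorP = std::max(std::numeric_limits<floatT>::min(), prec);
        return (floatT(2) * std::fabs(a - b) < rel * (std::fabs(a) + std::fabs(b)) + floorP)
               || (std::fabs(a) + std::fabs(b) > std::numeric_limits<floatT>::max());
    }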

Of course, again, if the user has reason (which I'd be interested to know) to choose a different precision ceiling, we could have parameters for both the precision floor and the precision ceiling. However, the point of doing a relative comparison in the first place is to allow us to ignore the size of big numbers, so I don't know when this would be useful.

edit: grammar