Floating-point comparisons in innfeed

Julien-Elie commented 2 years ago

The following unsafe floating-point comparisons are shown by gcc with -Wfloat-equal in innfeed/host.c:

  577 |     if (h->params->dynBacklogFilter != oldBacklogFilter)
 3233 |         if (totalsize == 0)
 3819 |     } else if (v != NULL && low != -DBL_MAX && v->v.real_val < low) {
 3824 |     } else if (v != NULL && high != DBL_MAX && v->v.real_val > high) {

What's the best way to fix them?

Should a function like the below one be used? Any other better approach?

--- a/innfeed/host.c
+++ b/innfeed/host.c
@@ -254,6 +254,16 @@ typedef struct host_holder_s {
     struct host_holder_s *next;
 } * HostHolder;

+/* Returns true if the two double arguments are equal, false otherwise.
+ * Use precision 10e-5.
+ */
+static bool
+same_double(double d1, double d2)
+{
+    double precision = 0.00001;
+    return ((d1 - precision) < d2) && ((d1 + precision) > d2);
+}
+

 /* These numbers are as above, but for all hosts over
    the life of the process. */
@@ -574,7 +584,7 @@ hostReconfigure(Host h, HostParams params)
     /* If the backlog filter value has changed, reset the
      * filter as the value therein will be screwy
      */
-    if (h->params->dynBacklogFilter != oldBacklogFilter)
+    if (!same_double(h->params->dynBacklogFilter, oldBacklogFilter))
         h->backlogFilter = ((h->params->dynBacklogLowWaterMark
                              + h->params->dynBacklogHighWaterMark)
                             / 200.0 / (1.0 - h->params->dynBacklogFilter));
@@ -3230,7 +3240,7 @@ hostLogStatus(void)
             sec = 1;
         offered = procArtsOffered ? procArtsOffered : 1;
         totalsize = procArtsSizeAccepted + procArtsSizeRejected;
-        if (totalsize == 0)
+        if (same_double(totalsize, 0.))
             totalsize = 1.;

         fprintf(fp, "   offered: %-5ld\t%6.2f art/s\n", procArtsOffered,
@@ -3816,12 +3826,14 @@ validateReal(FILE *fp, const char *name, double low, double high, int required,
         logOrPrint(LOG_ERR, fp,
                    "ME config: value of %s is not a floating point number",
                    name);
-    } else if (v != NULL && low != -DBL_MAX && v->v.real_val < low) {
+    } else if (v != NULL && !same_double(low, -DBL_MAX)
+               && v->v.real_val < low) {
         logOrPrint(LOG_ERR, fp,
                    "ME config: value of %s (%f) is lower than minimum of %f",
                    name, v->v.real_val, low);
         v->v.real_val = setval;
-    } else if (v != NULL && high != DBL_MAX && v->v.real_val > high) {
+    } else if (v != NULL && !same_double(high, DBL_MAX)
+               && v->v.real_val > high) {
         logOrPrint(LOG_ERR, fp,
                    "ME config: value of %s (%f) is higher than maximum of %f",
                    name, v->v.real_val, high);

rra commented 2 years ago

This is an explicit comparison against DBL_MAX, so I don't think the goal here is to do a range comparison. It looks like they're used as special sentinel values for some configuration parameters (although I think only as a high value, not -DBL_MAX as a low value). Since it's a sentinel, I think it has to be an equality comparison. I'm a little surprised that the warning doesn't special-case a comparison with DBL_MAX.

Julien-Elie commented 2 years ago

So the range comparison should be done for the first 2 checks h->params->dynBacklogFilter != oldBacklogFilter and totalsize == 0. whereas the other 2 should be changed to inequality comparisons: low > -DBL_MAX and high < DBL_MAX if I understand well. Yes, you're right that these last 2 checks are "sentinel" ones.

rra commented 2 years ago

I think the current logic is correct in all cases, although it's going to be hard for the compiler to know that. The first test is explicitly a test for whether the value has changed (such as by loading a new configuration). If it's changed at all, it's correct to execute that logic, and floating-point precision should be irrelevant there.

The second check is against 0, and if it's anything other than 0 the check shouldn't be true.

The inequality comparisons will work for the other two, I guess, although since those are sentinel values, the equality test is equally correct and will produce the same results, and seems a bit clearer to me.

(I'm not entirely sure why doubles are used for these values instead of long long, although I haven't done the math to see if these values really may be larger than long long can hold.)

Julien-Elie commented 2 years ago

totalsize is a sum of article sizes (accepted and rejected), and is an integer. The artSize() function returns an int, but converted to double for the sum. I really doubt totalsize will exceed the value of long long in bytes (more than 1000 Po)... So it may be converted to unsigned long long (or unsigned long if that type does not exist). All manipulated sizes could be switched to that.

However, dynBacklogFilter is a floating-point value between 0 and 1, so unfortunately cannot be long long. As well as we have other floating-point values to deal with for other parameters (percentages between 0 and 100 essentially). Do you think we should get rid of them all, and use long long by for instance multiplying them by 10^6 and taking their integer portion (so basically rounding them to 6 decimals)? It may seem odd in the code to manipulate them this way. Unless there is a better solution? (of course, just not enabling -Wfloat-equal is fine too!)

Julien-Elie commented 2 years ago

Let's just silent the warning for this file (with 4 correct and working uses of floating-point comparisons). Enable the warning elsewhere in the code.

rra commented 2 years ago

Yeah, agreed with that approach. There needs to be some way to skip that warning with sentinel values; seems like a bit of a gcc bug that there isn't.

InterNetNews / inn

Floating-point comparisons in innfeed #212