rxi / fe

A tiny, embeddable language implemented in ANSI C
MIT License
1.3k stars 81 forks source link

`is` is sometimes incorrect because `==` does not work for floating-point comparison in C #29

Open noncombatant opened 1 year ago

noncombatant commented 1 year ago

Thanks for publishing fe! It's very cool. Turning on more compiler warnings (I enjoy strange hobbies) I found this:

// src/fe.c:203:49: warning: comparing floating point with == or != is unsafe [-Wfloat-equal]
//  if (type(a) == FE_TNUMBER) { return number(a) == number(b); }
//                                      ~~~~~~~~~ ^  ~~~~~~~~~

I found that it is indeed a practical problem in the REPL: I have 2 expressions that print equal but for which is returns nil (correctly).

> (= x (/ 10.2 3))
nil
> x
3.4
> (= y (/ 30.6 9))
nil
> y
3.4
> (is x y)
nil

I propose 2 possible fixes in the C file below:

// src/fe.c:203:49: warning: comparing floating point with == or != is unsafe [-Wfloat-equal]
//  if (type(a) == FE_TNUMBER) { return number(a) == number(b); }
//                                      ~~~~~~~~~ ^  ~~~~~~~~~
//
// You can see with Python 3 that there is a tiny bit of floating point error:
//
// >>> 10.2 / 3
// 3.4
// >>> 30.6 / 9
// 3.4000000000000004
// >>> 3 * 3.4
// 10.2
// >>> 9 * 3.4
// 30.599999999999998
// >>> (10.2 / 3) == (30.6 / 9)
// False
//
// In Python, we can deal with this using `math.isclose`.
//
// fe prints them the same (format string `%.7g`), but does equality on their
// bitwise representations. So you end up with an error you can't see in the
// REPL, nor fix using library functions.
//
// % ./fe
// > (= x (/ 10.2 3))
// nil
// > x
// 3.4
// > (= y (/ 30.6 9))
// nil
// > y
// 3.4
// > (is x y)
// nil
//
// Changing fe to use `%.7f` at least makes the error visible in the REPL. I see
// 2 paths to solving this: (1) provide `isclose` (`double_nearly_equal` in the
// C below) in the standard library; or (2) make `is` use `double_nearly_equal`
// for number objects. With either option, using `%.7f` might also help people
// see the issue.

#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

// Translated from [the original
// Java](https://floating-point-gui.de/errors/comparison/). If you want to use
// `float`, use `fabsf` and `FLT_*`. If you want to use `long double`, use
// `fabsl` and `LDBL_*`.
//
// See also
// https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/.
static bool double_nearly_equal(double a, double b, double epsilon) {
  const double absA = fabs(a);
  const double absB = fabs(b);
  const double diff = fabs(a - b);
// It's OK to turn this warning off for this limited section of code, because
// it's in the context of handling tiny errors.
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wfloat-equal"
  if (a == b) {
    // Special case; handles infinities.
    return true;
  } else if (a == 0 || b == 0 || (absA + absB < DBL_MIN)) {
#pragma clang diagnostic pop
    // Either a or b is zero, or both are extremely close to it. Relative error
    // is less meaningful here.
    return diff < (epsilon * DBL_MIN);
  }
  // Use relative error.
  return diff / MIN((absA + absB), DBL_MAX) < epsilon;
}

int main(int count, char* arguments[]) {
  if (count != 5) {
    fprintf(stderr, "Usage: ./float_nearly_equal 10.2 3 30.6 9\n");
    return -1;
  }

  const double a = strtod(arguments[1], NULL);
  const double b = strtod(arguments[2], NULL);
  const double c = strtod(arguments[3], NULL);
  const double d = strtod(arguments[4], NULL);
  const double r1 = a / b;
  const double r2 = c / d;
  printf("%.20f / %.20f (%.20f) == %.20f / %.20f (%.20f) : %d\n", a, b, r1, c, d, r2, r1 == r2);
  printf("%.20f / %.20f double_nearly_equal %.20f / %.20f : %d\n", a, b, c, d,
         double_nearly_equal(a / b, c / d, DBL_EPSILON));
}
ooichu commented 1 year ago

The comparison of numbers is usually between integers, otherwise you have to account for float-point errors, which is expected behavior. I think it's better to just implement a isclose like function and register it with fe_cfunc() and fe_set() if you really need it.

noncombatant commented 1 year ago

Yes, I know that floating-point errors are expected. The trouble is that there is no way to detect or deal with them in terms of fe source code and in the REPL. I am proposing to resolve that in either of 2 ways.

(Regarding integers, the use of single-precision float in fe means that the usable range of integers is only 23 bits, which is fairly small. double provides 53 bits of integers.)