liberfa / erfa

Essential Routines for Fundamental Astronomy. Maintainers: @eteq @mhvk @sergiopasra

Problem with gcc 6 and i386 #33

Closed sergiopasra closed 7 years ago

sergiopasra commented 8 years ago

I'm trying to compile erfa for the new Fedora release, which ships with gcc 6. This causes a problem in one of the tests. In particular, I see the failure on i386 (on x86_64 the test passes):

eraFk52h failed: rv want -7.6000000940000251859 got -7.6000000939851357629 (1/5.1e+11) t_erfa_c validation failed!

The radial velocity is computed here:

https://github.com/liberfa/erfa/blob/dc8292fd26a9cdfc504e210d912e52c3f73a9185/src/pvstar.c#L148

astrofrog commented 8 years ago

@sergiopasra - just to check, does this depend on the optimization level you compile with?

sergiopasra commented 8 years ago

@astrofrog Yes: with -O0 and -O1 it works, with -O2 and -O3 it doesn't (only on i386; x86_64 works with both).

astrofrog commented 8 years ago

@sergiopasra - do you see the same issue when compiling SOFA? If so, we'll need to report it upstream.

sergiopasra commented 8 years ago

@astrofrog I get the same failure if I change the compilation flags of SOFA from "-pedantic -Wall -W -O" to "-pedantic -Wall -W -O2". I'm not sure they are going to consider this a bug.

timj commented 8 years ago

How do we know this isn't a GCC6 compiler bug? How much does the value change between GCC5 and GCC6? Does reordering some expressions fix it? How much does the tolerance have to change to make the test pass (not very much)? What is the actual required precision of this calculation?

olebole commented 8 years ago

This also affects Debian: Bug 835105.

olebole commented 8 years ago

I played around a bit with the different optimization options: the source file in question is src/starpv.c, and the relevant optimization flag is -fcaller-saves. Applying -O2 -fno-caller-saves to gcc -c starpv.c works well, while -O1 -fcaller-saves shows the failure.

timj commented 8 years ago

So this looks like a GCC compiler bug then.

olebole commented 8 years ago

Agreed. I could extract the following code from starpv.c:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* the cancellation-prone expression from starpv.c */
double f(double x) {
  return sqrt(1.0 - x) - 1.0;
}

/* same computation, but the intermediate result is printed before returning */
double g(double x) {
  double res = f(x);
  printf("r =%.20g\n", res);
  return res;
}

int main(void) {
  double x = 3.1740879016271733482e-09;  /* value taken from the erfa test case */
  double r1 = f(x);
  double r2 = g(x);
  printf("r1=%.20g r2=%.20g diff=%20g\n", r1, r2, r2 - r1);
  exit(0);
}

This code should print the same number for r1 and r2, and a diff of 0. When compiled with gcc -O2 -o m m.c -lm, I get the result

r =-1.5870439520936432953e-09
r1=-1.5870439407095204842e-09 r2=-1.5870439520936432953e-09 diff=        -1.13841e-17

When I add -fno-caller-saves, everything is fine. The numbers are taken from the erfa test case.

olebole commented 8 years ago

FYI: I filed a gcc bug for this issue.

juliantaylor commented 8 years ago

Not really a gcc bug: i386 does not have reliable floating-point math due to the 80-bit x87 FPU. If you need it, try -fexcess-precision=standard, -ffloat-store, or -mfpmath=sse; see https://gcc.gnu.org/wiki/x87note (or the very old gcc issue 323).
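
To illustrate what "excess precision" means here, a minimal sketch (not from erfa or this thread) whose output depends on whether the compiler keeps the intermediate sum in an 80-bit x87 register:

#include <stdio.h>

int main(void) {
  volatile double a = 1e16;  /* volatile keeps the compiler from folding the expression */
  volatile double b = 1.0;
  double r = (a + b) - a;
  /* With strict 64-bit doubles, a + b rounds back to 1e16, so r is 0.
     With x87 excess precision, a + b is held exactly in an 80-bit register, so r is 1.
     -fexcess-precision=standard, -ffloat-store or -mfpmath=sse force the 64-bit result. */
  printf("r = %g\n", r);
  return 0;
}

Built with default flags for an i386 target this will typically print 1, while an SSE or strict-precision build prints 0.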

olebole commented 7 years ago

@juliantaylor Thinking about the numbers here, I don't believe this is the issue: with double precision (64 bit) we have a 52(+1)-bit mantissa and a precision of ~16 digits. The two numbers in the test case, however, already differ in the 8th digit. Having 80 bits here, or converting between 80 and 64 bits, cannot explain this.

juliantaylor commented 7 years ago

I don't know what you are computing, but the fact that the results differ in the 8th digit does not necessarily imply that excess precision is not to blame: rounding differences can easily be amplified. That -fno-caller-saves "fixes" it also speaks for this; that flag basically causes a limited form of -ffloat-store (but is possibly worse for performance, as it also affects non-float register allocations). Does -fexcess-precision=standard fix the issue?

olebole commented 7 years ago

The code is above, and I see nothing there that could amplify the error. From rounding 80 -> 64 bit, I see no reason why an error in the 8th digit should be legitimate; the error should be on the order of the precision. -fexcess-precision=standard does fix the problem, however.

juliantaylor commented 7 years ago

I don't think it's obvious to say that from that code: there are several things in it that could amplify it (subtractions, divisions, it has it all). An actual compiler mis-compilation that produces an 8th-digit accuracy difference, on the other hand, seems less likely to me, in particular one that disabling caller saves fixes. In the end, only a proper numerical analysis or a look at the disassembly can prove what is to blame, but to me the evidence that this is the well-known x87 issue is pretty overwhelming, while a mis-compilation is still just speculation. It is not really worthwhile to dig deeper to see what it is; you should never do numerics on x87 FPUs without proper compiler flags anyway (-mfpmath=sse is the best option for determinism and performance).
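
As a back-of-envelope check of the amplification argument, here is a minimal sketch that perturbs sqrt(1 - x) by a single double ulp near 1.0 and shows how large the relative change of sqrt(1 - x) - 1.0 becomes for the value from the extracted test case:

#include <math.h>
#include <stdio.h>

int main(void) {
  double x  = 3.1740879016271733482e-09;  /* from the extracted test case above */
  double s  = sqrt(1.0 - x);              /* ~ 1 - 1.6e-9 */
  double s2 = nextafter(s, 2.0);          /* s shifted by one ulp (~1.1e-16) */
  double r  = s  - 1.0;
  double r2 = s2 - 1.0;
  /* The absolute shift of ~1e-16 is tiny, but relative to the ~1.6e-9 result
     it is ~7e-8, i.e. a change in the 8th significant digit. */
  printf("r  = %.20g\nr' = %.20g\nrelative change = %g\n", r, r2, (r2 - r) / r);
  return 0;
}

The absolute difference of ~1.1e-17 between r1 and r2 observed above is well within this one-ulp budget, so an 8th-digit discrepancy is consistent with a single extra rounding of the intermediate.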

olebole commented 7 years ago

OK, agreed (and I am going to close the gcc bug as well). However, I am not happy with just enabling these flags; IMO they don't solve the underlying problem, but hide it: the function eraStarpv() is quite inaccurate because of sqrt(1.0 - betr*betr - bett*bett) - 1.0 when betr and bett are very small. In the test case of astropy they are ~10^-4, which translates into an accuracy of only ~8 digits. Astropy, however, seems to need a higher accuracy in this loop when handling ephemerides. IMO, there is one thing needed/possible here:

The point here is, IMO, that the x87 FPU does not give wrong results, just different ones. Both results are as expected within the accuracy; the 64-bit results are just more predictable. This gives us the chance to find exactly such problems, and therefore I would vote against just setting -mfpmath=sse or the like. Instead, i386 CI tests would be quite useful.

juliantaylor commented 7 years ago

Indeed, that expression is poor for small values. Herbie recommends this replacement: http://herbie.uwplse.org/demo/6c625a8ebdbd46954b959c975040214b/graph.html
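
For reference, the standard way to make that expression robust for small arguments (whether or not this is exactly what the linked Herbie page proposes) is to multiply by the conjugate: sqrt(1 - x) - 1 = -x / (sqrt(1 - x) + 1). A minimal sketch with hypothetical helper names, using the value from the extracted test case:

#include <math.h>
#include <stdio.h>

/* Naive form: subtracting two nearly equal numbers causes cancellation. */
static double delta_naive(double x) {
  return sqrt(1.0 - x) - 1.0;
}

/* Conjugate rewrite: sqrt(1-x) - 1 == -x / (sqrt(1-x) + 1), no cancellation. */
static double delta_stable(double x) {
  return -x / (sqrt(1.0 - x) + 1.0);
}

int main(void) {
  double x = 3.1740879016271733482e-09;  /* value from the extracted test case */
  printf("naive : %.20g\n", delta_naive(x));
  printf("stable: %.20g\n", delta_stable(x));
  return 0;
}

On a strict 64-bit build the naive form keeps only about 8 significant digits here, while the conjugate form is accurate to full double precision.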

olebole commented 7 years ago

Funnily (but as expected), the i386 result seems to be the better one ;-)

juliantaylor commented 7 years ago

It's not actually very surprising: the x87 FPU does have 80-bit precision for intermediate results, so its result being better is not unusual. But its indeterminism often outweighs this. If you have something that needs 80 bits, you can use long doubles, at some performance cost on newer machines (and with less portability, as some architectures don't have 80-bit long doubles).
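
Applied to the extracted test case, a minimal sketch of that long double variant (assuming an architecture where long double is the 80-bit extended type) could look like this:

#include <math.h>
#include <stdio.h>

int main(void) {
  double x = 3.1740879016271733482e-09;  /* value from the extracted test case */
  /* Keep the intermediate in long double and round to double only once at the end;
     on x86 this mimics what the x87 register path otherwise does implicitly. */
  double r = (double)(sqrtl(1.0L - (long double)x) - 1.0L);
  printf("r = %.20g\n", r);
  return 0;
}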

olebole commented 7 years ago

Well, deterministically inaccurate results are not really what one wants...

juliantaylor commented 7 years ago

You can easily construct cases where 80 bits are not enough either; it just shifts the problem a tiny bit. Determinism is far more valuable.

olebole commented 7 years ago

An algorithm should give predictable results within the expected accuracy. If it does not, it is the algorithm that is to blame, not the "unpredictability" of the FPU. If we had just enabled -mfpmath=sse, we would not have detected the cause of the problem. Therefore, unpredictable results within the specified accuracy help to keep algorithms good, while -mfpmath=sse just makes them look good and hides the real problem. Just make sure that the algorithm is robust; then one doesn't need to care about additional, unexpected precision.