bitwiseworks / libc

LIBC Next (kLIBC fork)
9 stars 4 forks source link

nearbyint returns `nan` for any argument #87

Closed dmik closed 4 years ago

dmik commented 4 years ago

nearbyint is a modern rounding function in POSIX C.

This simple program:

#include <stdio.h>
#include <math.h>
#include <fenv.h>

int main()
{
  double d1 = 127.00000;
  printf("%f %f \n", d1, nearbyint(d1));
  return 0;
}

outputs 127.000000 nan instead of 127 127 for some reason.

Note that any other double instead of 127 results in nan as well.

Spotted here: https://github.com/bitwiseworks/qtwebengine-chromium-os2/issues/21#issuecomment-691239455.

dmik commented 4 years ago

Note that only nearbyint is broken, nearbyintl (which is written in assembly) and rint work as expected. The strange thing is that rint is what nearbyint internally uses. Needs deeper debugging.

dmik commented 4 years ago

This problem is rather strange. Looks like the complier/optimizer/alignment issue since if I remove fesetenv that is done after rint (or add an additional printf there for debugging) all works. fesetenv is an inlined function that also includes a fldenv instruction and even the SSE instruction. Need to debug it more, this might have other consequences.

dmik commented 4 years ago

Well, I know what it is. The fesetenv code does a FLDENV and this basically ruins the result of rint because it is stored in ST(0) (top of the FPU stack) but when calling fgetenv (and FSTENV within) ST(0) contained NaN and this value gets restored by FLDENV, overwriting the result of rint. It should be easily fixed by adding all FPU registers to the clobber list of inline assembly (see here]. But constant f representing all FPU registers according to this for some reason causes the compiler to barf on an invalid register name. I'm puzzled as everything else works...

dmik commented 4 years ago

OK, what works is naming all FPU registers like this: "st", "st(1)", "st(2)", "st(3)", "st(4)", "st(5)", "st(6)", "st(7)". Then the problem goes away and I get the correct result.

Interesting enough, FreeBSD (where this code originates from) came up with just the same solution, check: https://github.com/freebsd/freebsd/blob/master/lib/msun/x86/fenv.h (__fldenvx definition and its usage in fesetenv).

I guess I will go the FreeBSD way as it must have been well-tested over the years. I will pick up some relevant changes from there.