Closed dmik closed 4 years ago
Note that only nearbyint
is broken, nearbyintl
(which is written in assembly) and rint
work as expected. The strange thing is that rint
is what nearbyint
internally uses. Needs deeper debugging.
This problem is rather strange. Looks like the complier/optimizer/alignment issue since if I remove fesetenv
that is done after rint
(or add an additional printf there for debugging) all works. fesetenv
is an inlined function that also includes a fldenv
instruction and even the SSE instruction. Need to debug it more, this might have other consequences.
Well, I know what it is. The fesetenv
code does a FLDENV and this basically ruins the result of rint
because it is stored in ST(0) (top of the FPU stack) but when calling fgetenv
(and FSTENV within) ST(0) contained NaN and this value gets restored by FLDENV, overwriting the result of rint
. It should be easily fixed by adding all FPU registers to the clobber list of inline assembly (see here]. But constant f
representing all FPU registers according to this for some reason causes the compiler to barf on an invalid register name. I'm puzzled as everything else works...
OK, what works is naming all FPU registers like this: "st", "st(1)", "st(2)", "st(3)", "st(4)", "st(5)", "st(6)", "st(7)"
. Then the problem goes away and I get the correct result.
Interesting enough, FreeBSD (where this code originates from) came up with just the same solution, check: https://github.com/freebsd/freebsd/blob/master/lib/msun/x86/fenv.h (__fldenvx
definition and its usage in fesetenv
).
I guess I will go the FreeBSD way as it must have been well-tested over the years. I will pick up some relevant changes from there.
nearbyint
is a modern rounding function in POSIX C.This simple program:
outputs
127.000000 nan
instead of127 127
for some reason.Note that any other double instead of 127 results in
nan
as well.Spotted here: https://github.com/bitwiseworks/qtwebengine-chromium-os2/issues/21#issuecomment-691239455.