Closed RyanGlScott closed 6 years ago
On second thought, don't merge this as-is. @Mistuke points out in https://ghc.haskell.org/trac/ghc/ticket/15094#comment:7 that this impairs GHC's ability to decide between static and dynamic linking. He recommends simply disabling -O3
on the offending function. (I'm hoping he can chime in on how to do this.)
You can do it using an attribute. Either set the function to -O1
or disable the pass that does this transformation:
__attribute__((optimize("-fno-expensive-optimizations")))
void hs_clock_win32_gettime_monotonic(long long* t)
to disable the pass
__attribute__((optimize("-O1")))
void hs_clock_win32_gettime_monotonic(long long* t)
to set it to -O1
instead. Either should work and code-gen differences should be minimal. It's a simple function.
Good to know. I suppose we should also only set this attribute on 32-bit architectures, but what is a reliable way to test this. Unfortunately, it doesn't appear that GHC passes its i386_HOST_ARCH
macro to compiled C files, so we can't simply #ifdef
on that.
Ghc does pass on its macros. Otherwise base and win32 and others wouldn't work.
On Wed, May 2, 2018, 23:14 Ryan Scott notifications@github.com wrote:
Good to know. I suppose we should also only set this attribute on 32-bit architectures, but what is a reliable way to test this. Unfortunately, it doesn't appear that GHC passes its i386_HOST_ARCH macro to compiled C files, so we can't simply #ifdef on that.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/corsis/clock/pull/51#issuecomment-386138228, or mute the thread https://github.com/notifications/unsubscribe-auth/ABH3KWffxlve2dbgbTlkEmZaea9gmkl9ks5tui_RgaJpZM4Tvl0f .
It does? For instance, this:
#ifdef x86_64_HOST_ARCH
fdsfdsa
#endif
Compiles on my 64-bit machine:
$ ghc -c bug.c -o bug.o
$ echo $?
0
But it shouldn't.
It should, as the RTS uses this all over the place. in any case I don't have the time to look into this right now, the standard platform macros can be used then defined(_WIN32) && !defined(_WIN64)
.
I'll close this and open a revised PR later.
Thanks!
On Thu, May 3, 2018, 00:36 Ryan Scott notifications@github.com wrote:
Closed #51 https://github.com/corsis/clock/pull/51.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/corsis/clock/pull/51#event-1606663958, or mute the thread https://github.com/notifications/unsubscribe-auth/ABH3Kfw1_S5xBKPufQi7XUAUI6P4GXx_ks5tukLrgaJpZM4Tvl0f .
Btw if you were looking for a generic x86 test you can find them here https://sourceforge.net/p/predef/wiki/Home/
On Thu, May 3, 2018, 00:38 Phyx lonetiger@gmail.com wrote:
Thanks!
On Thu, May 3, 2018, 00:36 Ryan Scott notifications@github.com wrote:
Closed #51 https://github.com/corsis/clock/pull/51.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/corsis/clock/pull/51#event-1606663958, or mute the thread https://github.com/notifications/unsubscribe-auth/ABH3Kfw1_S5xBKPufQi7XUAUI6P4GXx_ks5tukLrgaJpZM4Tvl0f .
Wow. It turns out that this issue isn't present on the master
branch of clock
, after commit ecfe4e04b40dd2c0fdbe6c512939b0b99e7df27e...
The issue is just latent in master
. The truth is division by zero
is undefined
in the C standard
. [1]:
C99 6.5.5p5 - The result of the / operator is the quotient from the division of the first operand
by the second; the result of the % operator is the remainder. In both operations, if the value
of the second operand is zero, the behavior is undefined.
By doing an unchecked division in hs_clock_win32_gettime_monotonic
the compiler is free to
choose the behavior it wants. In this case, the function costing probably determined that it's faster
to calculate the result using complex division
.
The additional code is probably swaying it to the other direction, the complex division
is probably no
longer worth it. But there's little point reasoning about UB. It's all implementation dependent and can change at any time.
[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
Take two here: https://github.com/corsis/clock/pull/52
Since
clock
compiles its bundled C code with-O3
,gcc
will optimize a division followed by a modulo calculation to a singledivmod
. On 32-bit Windows, this requires a library call found inlibgcc
, so it is necessary to link against it. (Not doing so caused #50.)For more information, see the discussion at https://ghc.haskell.org/trac/ghc/ticket/15094.