corsis / clock

High-resolution clock functions: monotonic, realtime, cputime.
Other
58 stars 25 forks source link

Fix linking issues on 32-bit Windows #51

Closed RyanGlScott closed 6 years ago

RyanGlScott commented 6 years ago

Since clock compiles its bundled C code with -O3, gcc will optimize a division followed by a modulo calculation to a single divmod. On 32-bit Windows, this requires a library call found in libgcc, so it is necessary to link against it. (Not doing so caused #50.)

For more information, see the discussion at https://ghc.haskell.org/trac/ghc/ticket/15094.

RyanGlScott commented 6 years ago

On second thought, don't merge this as-is. @Mistuke points out in https://ghc.haskell.org/trac/ghc/ticket/15094#comment:7 that this impairs GHC's ability to decide between static and dynamic linking. He recommends simply disabling -O3 on the offending function. (I'm hoping he can chime in on how to do this.)

Mistuke commented 6 years ago

You can do it using an attribute. Either set the function to -O1 or disable the pass that does this transformation:

__attribute__((optimize("-fno-expensive-optimizations")))
void hs_clock_win32_gettime_monotonic(long long* t)

to disable the pass

__attribute__((optimize("-O1")))
void hs_clock_win32_gettime_monotonic(long long* t)

to set it to -O1 instead. Either should work and code-gen differences should be minimal. It's a simple function.

RyanGlScott commented 6 years ago

Good to know. I suppose we should also only set this attribute on 32-bit architectures, but what is a reliable way to test this. Unfortunately, it doesn't appear that GHC passes its i386_HOST_ARCH macro to compiled C files, so we can't simply #ifdef on that.

Mistuke commented 6 years ago

Ghc does pass on its macros. Otherwise base and win32 and others wouldn't work.

On Wed, May 2, 2018, 23:14 Ryan Scott notifications@github.com wrote:

Good to know. I suppose we should also only set this attribute on 32-bit architectures, but what is a reliable way to test this. Unfortunately, it doesn't appear that GHC passes its i386_HOST_ARCH macro to compiled C files, so we can't simply #ifdef on that.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/corsis/clock/pull/51#issuecomment-386138228, or mute the thread https://github.com/notifications/unsubscribe-auth/ABH3KWffxlve2dbgbTlkEmZaea9gmkl9ks5tui_RgaJpZM4Tvl0f .

RyanGlScott commented 6 years ago

It does? For instance, this:

#ifdef x86_64_HOST_ARCH
fdsfdsa
#endif

Compiles on my 64-bit machine:

$ ghc -c bug.c -o bug.o
$ echo $?
0

But it shouldn't.

Mistuke commented 6 years ago

It should, as the RTS uses this all over the place. in any case I don't have the time to look into this right now, the standard platform macros can be used then defined(_WIN32) && !defined(_WIN64).

RyanGlScott commented 6 years ago

I'll close this and open a revised PR later.

Mistuke commented 6 years ago

Thanks!

On Thu, May 3, 2018, 00:36 Ryan Scott notifications@github.com wrote:

Closed #51 https://github.com/corsis/clock/pull/51.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/corsis/clock/pull/51#event-1606663958, or mute the thread https://github.com/notifications/unsubscribe-auth/ABH3Kfw1_S5xBKPufQi7XUAUI6P4GXx_ks5tukLrgaJpZM4Tvl0f .

Mistuke commented 6 years ago

Btw if you were looking for a generic x86 test you can find them here https://sourceforge.net/p/predef/wiki/Home/

On Thu, May 3, 2018, 00:38 Phyx lonetiger@gmail.com wrote:

Thanks!

On Thu, May 3, 2018, 00:36 Ryan Scott notifications@github.com wrote:

Closed #51 https://github.com/corsis/clock/pull/51.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/corsis/clock/pull/51#event-1606663958, or mute the thread https://github.com/notifications/unsubscribe-auth/ABH3Kfw1_S5xBKPufQi7XUAUI6P4GXx_ks5tukLrgaJpZM4Tvl0f .

RyanGlScott commented 6 years ago

Wow. It turns out that this issue isn't present on the master branch of clock, after commit ecfe4e04b40dd2c0fdbe6c512939b0b99e7df27e...

Mistuke commented 6 years ago

The issue is just latent in master. The truth is division by zero is undefined in the C standard. [1]:

C99 6.5.5p5 - The result of the / operator is the quotient from the division of the first operand
by the second; the result of the % operator is the remainder. In both operations, if the value
of the second operand is zero, the behavior is undefined.

By doing an unchecked division in hs_clock_win32_gettime_monotonic the compiler is free to choose the behavior it wants. In this case, the function costing probably determined that it's faster to calculate the result using complex division.

The additional code is probably swaying it to the other direction, the complex division is probably no longer worth it. But there's little point reasoning about UB. It's all implementation dependent and can change at any time.

[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

RyanGlScott commented 6 years ago

Take two here: https://github.com/corsis/clock/pull/52