corsis / clock

High-resolution clock functions: monotonic, realtime, cputime.

Add GHC.Clock.getMonotonicTimeNSec to benchmarks #56

Closed sjakobi closed 5 years ago

sjakobi commented 5 years ago

Sample results from my machine:

benchmarking getTime/Monotonic
time                 108.2 ns   (108.0 ns .. 108.5 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 108.1 ns   (108.0 ns .. 108.5 ns)
std dev              727.8 ps   (371.1 ps .. 1.348 ns)

benchmarking getTime/Realtime
time                 113.4 ns   (113.2 ns .. 113.7 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 113.4 ns   (113.3 ns .. 113.7 ns)
std dev              635.2 ps   (471.5 ps .. 917.1 ps)

benchmarking getTime/ProcessCPUTime
time                 378.6 ns   (377.8 ns .. 379.4 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 378.2 ns   (377.7 ns .. 378.8 ns)
std dev              1.944 ns   (1.546 ns .. 2.417 ns)

benchmarking getTime/ThreadCPUTime
time                 380.1 ns   (378.9 ns .. 381.8 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 379.8 ns   (379.2 ns .. 380.9 ns)
std dev              2.612 ns   (1.622 ns .. 4.126 ns)

benchmarking getTime/MonotonicRaw
time                 340.3 ns   (338.8 ns .. 342.2 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 339.5 ns   (338.8 ns .. 340.8 ns)
std dev              3.000 ns   (1.839 ns .. 4.316 ns)

benchmarking getTime/Boottime
time                 339.8 ns   (338.6 ns .. 341.2 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 339.5 ns   (338.7 ns .. 340.8 ns)
std dev              3.567 ns   (2.472 ns .. 5.149 ns)

benchmarking getTime/MonotonicCoarse
time                 105.3 ns   (105.0 ns .. 105.6 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 105.2 ns   (105.0 ns .. 105.6 ns)
std dev              883.6 ps   (462.1 ps .. 1.618 ns)

benchmarking getTime/RealtimeCoarse
time                 106.2 ns   (105.6 ns .. 107.0 ns)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 105.9 ns   (105.7 ns .. 106.6 ns)
std dev              1.291 ns   (654.6 ps .. 2.410 ns)
variance introduced by outliers: 12% (moderately inflated)

benchmarking GHC.Clock.getMonotonicTimeNSec
time                 16.57 ns   (16.54 ns .. 16.60 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 16.56 ns   (16.54 ns .. 16.60 ns)
std dev              86.29 ps   (50.25 ps .. 134.3 ps)

harendra-kumar commented 5 years ago

It seems that both clock and GHC use exactly the same system call to get the time: clock_gettime with the CLOCK_MONOTONIC clock type.

The difference may come from the setup code around the call. The clock package allocates memory for the result using alloca:

getTime clk = allocaAndPeek $! clock_gettime $! clockToConst clk

GHC, on the other hand, returns the result directly as a StgWord64:

StgWord64 getMonotonicNSec(void)
{
#if defined(HAVE_CLOCK_GETTIME)
    struct timespec ts;
    int res;

    res = clock_gettime(CLOCK_ID, &ts);
    if (res != 0) {
        sysErrorBelch("clock_gettime");
        stg_exit(EXIT_FAILURE);
    }
    return (StgWord64)ts.tv_sec * 1000000000 +
           (StgWord64)ts.tv_nsec;

Returning a StgWord64 directly may be more efficient than going through alloca. It should only make a practical difference, though, if getTime is called very frequently.