time-rs / time

The most used Rust library for date and time handling.
https://time-rs.github.io
Apache License 2.0
1.13k stars 281 forks source link

Change PartialEq, PartialOrd, Eq, Ord and Hash on Time for much more efficient versions #598

Closed krtab closed 1 year ago

krtab commented 1 year ago

This PR is split in three commits:

  1. Add a benchmarking function that actually measure execution time of the comparison, as others are drowned in noise and/or optimized away IMO.
  2. Adds the implementations themselves. :warning: This adds unsafe code.
  3. Adds a more controversial optimization, see the message commit for pros and cons. This one can be removed without changing anything to the benchmarks and assembly below but may prevent code to be as efficient in more complex workloads.

Benchmark

Time: vec_sort          time:   [96.691 µs 96.966 µs 97.262 µs]
                        change: [-54.795% -54.630% -54.470%] (p = 0.00 < 0.00)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  7 (7.00%) high mild
  1 (1.00%) high severe

Generated assembly

pub fn gt(x: &Time, y: &Time) -> bool {
    x > y
}

Before:

    movzx eax, byte ptr [rsi + 4]
    cmp byte ptr [rdi + 4], al
    jae .LBB357_2
    xor eax, eax
    ret

.LBB357_2:
    mov al, 1
    jne .LBB357_10
    movzx ecx, byte ptr [rsi + 5]
    cmp byte ptr [rdi + 5], cl
    jae .LBB357_5
    xor eax, eax
    ret

.LBB357_5:
    jne .LBB357_10
    movzx ecx, byte ptr [rsi + 6]
    cmp byte ptr [rdi + 6], cl
    jae .LBB357_8
    xor eax, eax
    ret

.LBB357_8:
    jne .LBB357_10
    mov eax, dword ptr [rdi]
    cmp eax, dword ptr [rsi]
    seta al

.LBB357_10:
    ret

After:

        mov rax, qword ptr [rdi]
        cmp rax, qword ptr [rsi]
        seta al
        ret
codecov[bot] commented 1 year ago

Codecov Report

Merging #598 (95a36b8) into main (cbf1349) will decrease coverage by 0.0%. The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main    #598     +/-   ##
=======================================
- Coverage   95.8%   95.8%   -0.0%     
=======================================
  Files         79      79             
  Lines       8927    8899     -28     
=======================================
- Hits        8552    8524     -28     
  Misses       375     375             
Impacted Files Coverage Δ
time/src/time.rs 100.0% <100.0%> (ø)

... and 8 files with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

krtab commented 1 year ago

Note that I only tested this on little-endian x86_64, so I didn't test the big endian code.

krtab commented 1 year ago

I've separated the PR in two.

In this one I don't force align to 8. On x64 it seems to be useless anyway

Time: sort              time:   [102.54 µs 103.97 µs 105.53 µs]
                        change: [-53.341% -52.613% -51.968%] (p = 0.00 < 0.00)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

Time: sort_unaligned    time:   [103.88 µs 105.43 µs 107.15 µs]
                        change: [-52.691% -52.015% -51.322%] (p = 0.00 < 0.00)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

I've put the force align in another PR: #599