tailhook / humantime

A parser and formatter for std::time::{SystemTime, Duration}
Apache License 2.0
283 stars 34 forks source link

two_digits: use `char::is_digit` and `char::to_digit` instead of direct byte arithmetic #22

Closed vrmiguel closed 2 years ago

tailhook commented 2 years ago

What is the motivation of the PR?

I think this can reduce the performance. Judging from the assembly, it's possible to use to_digit and get similar performance: https://play.rust-lang.org/?version=nightly&mode=release&edition=2018&gist=2c6bb5a4bf39e350bd6b3c0e2267db15 (if you compile to ASM, note two jump ja instructions in your version which usually are slower than pure math). Although I didn't do actual benchmarks.

vrmiguel commented 2 years ago

Hey there! I just made this PR because I felt it'd more idiomatic to use std in this case

I got curious as to how they'd perform so I ran a benchmark using your code and Criterion (available here)

Run 1

two_digits              time:   [9.3652 ns 9.3946 ns 9.4262 ns]                        
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

two_digits2             time:   [9.3178 ns 9.3346 ns 9.3542 ns]                         
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild

two_digits_old          time:   [9.3487 ns 9.3764 ns 9.4046 ns]                            
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

Run 2

two_digits              time:   [8.6810 ns 8.7017 ns 8.7244 ns]                        
                        change: [-8.0064% -7.4658% -6.7391%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

two_digits2             time:   [8.6402 ns 8.7257 ns 8.8351 ns]                         
                        change: [-7.9803% -7.0310% -5.8307%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) high mild
  7 (7.00%) high severe

two_digits_old          time:   [9.3039 ns 9.3380 ns 9.3688 ns]                            
                        change: [-3.3890% -2.6784% -1.9324%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

I find your implementation of two_digits2 much preferable than what I've done, to be honest, or perhaps

fn two_digits3(b1: u8, b2: u8) -> Result<u64, Error> {
    let chars_to_digit = |a: char, b: char| -> Option<u64> {
        let a = a.to_digit(10)?;
        let b = b.to_digit(10)?;

        Some((a*10 + b) as u64)
    };

    chars_to_digit(b1 as char, b2 as char).ok_or(Error)
}

Either way, I'd understand it if you chose to just keep the current code :-D

tailhook commented 2 years ago

Hm, perhaps branch prediction does the magic. And I've just realized that this branch prediction will basically always hit the cache if your dates are expected to be valid.

Anyway I like your latest solution, exception that inner function doesn't have to be lambda. just make fn two_digits_inner or fn _two_digits that is called by two_digits. It's a common pattern to transform error messages this way.

vrmiguel commented 2 years ago

Done!

tailhook commented 2 years ago

Merged. Thanks!