xd009642 / llvm-profparser

Mostly complete pure-Rust implementation of parsing LLVM instrumentation profile data
Apache License 2.0

Some mapping perf improvements #40

Closed xd009642 closed 5 months ago

xd009642 commented 5 months ago

This builds on #31, but it started with me trying faster hash implementations and then digging into the performance side.
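
For context, swapping the hash implementation of a `HashMap` in Rust only needs a different `BuildHasher`. The sketch below uses FNV-1a purely as a stand-in, since the description does not say which implementations were tried; `Fnv1a`, `FastMap` and `count_names` are hypothetical names, not part of this crate.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// 64-bit FNV-1a. An assumption for illustration only: the PR does not
/// name the hash implementations that were benchmarked.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        // FNV-1a 64-bit offset basis.
        Fnv1a(0xcbf2_9ce4_8422_2325)
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            // FNV 64-bit prime.
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A `HashMap` that uses the hasher above instead of the default SipHash.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Hypothetical helper: counts how often each function name occurs.
fn count_names<'a>(names: &[&'a str]) -> FastMap<&'a str, u64> {
    let mut counts = FastMap::default();
    for &name in names {
        *counts.entry(name).or_insert(0) += 1;
    }
    counts
}
```

The default hasher is designed to resist hash-flooding attacks, so a swap like this trades that property for speed and is only worth it if the benchmarks show a gain.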

Before looking at the benchmarks, note that the merging benchmark parses 3 files and then merges them. So a change that regresses parsing but improves merging should, in theory, give bigger wins when merging more files.
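
To make the trade-off concrete, here is a minimal sketch that assumes the "mapping" is a name-to-index lookup built while parsing. `Record`, `Profile`, `push` and `merge` are hypothetical and do not mirror this crate's actual types; the point is only that the index costs extra work per parsed record and saves a linear scan per merged record.

```rust
use std::collections::HashMap;

// Hypothetical types for illustration; not llvm-profparser's real API.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    name: String,
    counters: Vec<u64>,
}

#[derive(Debug, Default)]
struct Profile {
    records: Vec<Record>,
    // Built during parsing: extra hashing and allocation per record.
    by_name: HashMap<String, usize>,
}

impl Profile {
    /// Called once per parsed record; this is where parsing gets slower.
    fn push(&mut self, record: Record) {
        self.by_name.insert(record.name.clone(), self.records.len());
        self.records.push(record);
    }

    /// Merging finds each incoming record by name through the index
    /// instead of scanning `records`; this is where merging gets faster.
    fn merge(&mut self, other: &Profile) {
        for rec in &other.records {
            match self.by_name.get(&rec.name).copied() {
                Some(idx) => {
                    let dst = &mut self.records[idx].counters;
                    if dst.len() < rec.counters.len() {
                        dst.resize(rec.counters.len(), 0);
                    }
                    for (d, s) in dst.iter_mut().zip(&rec.counters) {
                        *d = d.saturating_add(*s);
                    }
                }
                None => self.push(rec.clone()),
            }
        }
    }
}
```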

Mapping everything, as in #31, leads to the following overall on my machine:

Benchmarking profdata_parse_cargo: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.4s, or reduce sample count to 90.
profdata_parse_cargo    time:   [56.461 ms 59.235 ms 62.159 ms]                                 
                        change: [+89.300% +98.456% +108.23%] (p = 0.00 < 0.05)
                        Performance has regressed.

     Running benches/profraw_parsing.rs (target/release/deps/profraw_parsing-d2ba2874943c5b80)
Benchmarking profraw_parse_tokio: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 12.6s, or reduce sample count to 30.
profraw_parse_tokio     time:   [122.50 ms 123.64 ms 125.00 ms]                                
                        change: [+61.104% +62.905% +64.905%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

Benchmarking profraw_parse_cargo: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 20.6s, or reduce sample count to 20.
profraw_parse_cargo     time:   [204.42 ms 205.40 ms 206.42 ms]                                
                        change: [+71.475% +72.466% +73.454%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

     Running benches/report_merging.rs (target/release/deps/report_merging-a98d9cc0c54a27cc)
Benchmarking merge: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 59.0s, or reduce sample count to 10.
merge                   time:   [587.89 ms 596.82 ms 607.83 ms]                  
                        change: [-7.9902% -6.3601% -4.8047%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  4 (4.00%) high mild
  9 (9.00%) high severe

However, mapping only the names (this PR) leads to:

profdata_parse_cargo    time:   [38.257 ms 38.602 ms 38.979 ms]                                 
                        change: [+27.767% +29.437% +31.189%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild

     Running benches/profraw_parsing.rs (target/release/deps/profraw_parsing-7b87cea106e2e42d)
Benchmarking profraw_parse_tokio: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 10.7s, or reduce sample count to 40.
profraw_parse_tokio     time:   [102.81 ms 103.61 ms 104.45 ms]                                
                        change: [+39.154% +40.425% +41.779%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

Benchmarking profraw_parse_cargo: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 17.6s, or reduce sample count to 20.
profraw_parse_cargo     time:   [177.35 ms 177.90 ms 178.49 ms]                                
                        change: [+49.936% +50.516% +51.114%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 29 outliers among 100 measurements (29.00%)
  17 (17.00%) low mild
  7 (7.00%) high mild
  5 (5.00%) high severe

     Running benches/report_merging.rs (target/release/deps/report_merging-551ba769a8696bda)
Benchmarking merge: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 50.7s, or reduce sample count to 10.
merge                   time:   [506.64 ms 508.77 ms 510.99 ms]                  
                        change: [-21.465% -21.135% -20.771%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild