sharksforarms / deku

Declarative binary reading and writing: bit-level, symmetric, serialization/deserialization
Apache License 2.0

Recommend LTO (Link-Time Optimization) #280

Closed wcampbell0x2a closed 1 year ago

wcampbell0x2a commented 2 years ago

This may be more of a bitvec recommendation, but given the benchmarks below we should really recommend enabling LTO, in the README or somewhere similar. The results compare current master with and without the following patch. (Note: I run all my tests on rustc nightly.)

diff --git a/Cargo.toml b/Cargo.toml
index fbefde8..ed00a70 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -42,3 +42,6 @@ env_logger = "0.9.1"
 [[bench]]
 name = "deku"
 harness = false
+
+[profile.release]
+lto = true

Gnuplot not found, using plotters backend
deku_read_byte          time:   [1.4496 ns 1.4571 ns 1.4652 ns]
                        change: [-78.813% -78.219% -77.635%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

deku_write_byte         time:   [41.514 ns 41.964 ns 42.414 ns]
                        change: [-22.263% -19.044% -15.503%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe

deku_read_bits          time:   [336.28 ns 338.97 ns 341.88 ns]
                        change: [-8.2648% -5.2847% -1.6368%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

deku_write_bits         time:   [60.781 ns 61.478 ns 62.208 ns]
                        change: [-27.084% -25.352% -23.645%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

deku_read_enum          time:   [1.8785 ns 1.8869 ns 1.8960 ns]
                        change: [-84.189% -83.834% -83.484%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe

deku_write_enum         time:   [66.102 ns 66.853 ns 67.643 ns]
                        change: [-22.483% -21.024% -19.367%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

deku_read_vec           time:   [146.30 ns 147.58 ns 148.89 ns]
                        change: [-76.352% -75.668% -74.914%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high severe

deku_write_vec          time:   [2.7450 µs 2.7648 µs 2.7900 µs]
                        change: [-25.393% -24.450% -23.505%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

deku_read_vec_perf      time:   [459.91 ns 465.09 ns 470.44 ns]
                        change: [-16.819% -13.968% -11.321%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

deku_write_vec_perf     time:   [2.7465 µs 2.7667 µs 2.7865 µs]
                        change: [-32.089% -31.224% -30.365%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe
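
For anyone reading the `change` lines above: a negative percentage is the relative drop in time per iteration, so -78% means the run takes about 22% of the original time, roughly 4.6x faster. Below is a minimal sketch of that conversion, using a few of the point estimates from the output above. The helper name `speedup_factor` is made up for illustration and is not part of deku or Criterion.

```rust
/// Convert a Criterion "change" percentage (negative means faster)
/// into a speedup factor: old_time / new_time.
/// Hypothetical helper for illustration only.
fn speedup_factor(change_percent: f64) -> f64 {
    // new_time = old_time * (1 + change / 100)
    1.0 / (1.0 + change_percent / 100.0)
}

fn main() {
    // Middle (point) estimates copied from the benchmark output above.
    let results = [
        ("deku_read_byte", -78.219),
        ("deku_read_enum", -83.834),
        ("deku_read_vec", -75.668),
        ("deku_read_bits", -5.2847),
        ("deku_write_bits", -25.352),
    ];

    for (name, change) in results {
        println!(
            "{name:<20} {change:>9.3}% -> {:.2}x faster",
            speedup_factor(change)
        );
    }
}
```

Seen this way, the byte, enum, and vec read paths gain the most from LTO, while `deku_read_bits` barely moves.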

wcampbell0x2a commented 2 years ago

LTO might make bitvec 1.0 worse, according to my benchmarks.