sharksforarms / deku

Declarative binary reading and writing: bit-level, symmetric, serialization/deserialization
Apache License 2.0

Improve performance after `impl-writer` and `impl-reader` #393

Closed wcampbell0x2a closed 6 months ago

wcampbell0x2a commented 11 months ago

This fixes most of the performance issues caused by rustc, in different versions and with different parameters, not inlining several functions, both generated by and implemented in deku.
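For readers unfamiliar with the kind of change described here, the following is a minimal sketch of marking small reader functions with `#[inline]` so that rustc can inline them into callers in other crates. `ByteReader`, `read_u8`, and `read_u16_be` are hypothetical names invented for illustration; they are not deku's actual types or functions, and the real changes in this PR may look different.

```rust
/// Hypothetical stand-in for a small byte reader; NOT deku's actual API.
pub struct ByteReader<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> ByteReader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, pos: 0 }
    }

    /// `#[inline]` makes the function body available for inlining in
    /// downstream crates. It is a hint to the compiler, not a guarantee.
    #[inline]
    pub fn read_u8(&mut self) -> Option<u8> {
        let byte = *self.data.get(self.pos)?;
        self.pos += 1;
        Some(byte)
    }

    /// Built on top of `read_u8`. If the inner calls are not inlined,
    /// every byte read pays for a function call.
    #[inline]
    pub fn read_u16_be(&mut self) -> Option<u16> {
        // Check up front so a short buffer does not consume a byte.
        if self.data.len() - self.pos < 2 {
            return None;
        }
        let hi = self.read_u8()?;
        let lo = self.read_u8()?;
        Some(u16::from_be_bytes([hi, lo]))
    }
}
```

With a reader built over `[0x12, 0x34, 0x56]`, `read_u16_be()` yields `Some(0x1234)` and the following `read_u8()` yields `Some(0x56)`. Whether the compiler actually inlines these calls still depends on the rustc version and optimization settings, which is the variability described above.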

See https://github.com/wcampbell0x2a/deku-bench/pull/6 for benchmarks. Speaking of which, I will probably add these benchmarks directly to this project at some point.

I actually opened a ticket with rust-lang, since the nightly changes regressed the performance of this crate until this commit was added: https://github.com/rust-lang/rust/issues/118674

This might remove the need for the testing in https://github.com/sharksforarms/deku/issues/358, and was discussed in https://github.com/sharksforarms/deku/issues/308#issuecomment-1532593054.

See https://github.com/sharksforarms/deku/issues/25

github-actions[bot] commented 11 months ago

Benchmark for f3b1791

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1273.3±14.54ns | **1150.0±38.49ns** | **-9.68%** |
| deku_read_byte | 21.9±0.80ns | **3.6±0.03ns** | **-83.56%** |
| deku_read_enum | 9.4±0.15ns | **2.9±0.04ns** | **-69.15%** |
| deku_read_vec | 57.6±0.81ns | **35.0±0.63ns** | **-39.24%** |
| deku_write_bits | 192.9±3.70ns | 190.4±4.67ns | -1.30% |
| deku_write_byte | **20.6±0.40ns** | 21.1±0.19ns | **+2.43%** |
| deku_write_enum | 19.6±0.19ns | 19.5±0.19ns | -0.51% |
| deku_write_vec | **293.8±1.98ns** | 298.5±2.68ns | **+1.60%** |

github-actions[bot] commented 11 months ago

Benchmark for 86622b5

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1246.1±9.95ns | **1230.8±15.15ns** | **-1.23%** |
| deku_read_byte | 21.0±0.37ns | **3.7±0.07ns** | **-82.38%** |
| deku_read_enum | 9.5±0.19ns | **2.9±0.08ns** | **-69.47%** |
| deku_read_vec | 59.1±1.26ns | **34.7±0.33ns** | **-41.29%** |
| deku_write_bits | 194.8±3.47ns | 194.1±3.45ns | -0.36% |
| deku_write_byte | 21.9±0.27ns | 21.7±0.59ns | -0.91% |
| deku_write_enum | 20.9±0.54ns | **20.2±0.26ns** | **-3.35%** |
| deku_write_vec | 296.0±4.59ns | 297.3±5.48ns | +0.44% |

github-actions[bot] commented 11 months ago

Benchmark for 0025238

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | **1224.6±33.99ns** | 1389.9±12.67ns | **+13.50%** |
| deku_read_byte | 21.5±1.02ns | **3.6±0.04ns** | **-83.26%** |
| deku_read_enum | 9.5±0.11ns | **3.0±0.02ns** | **-68.42%** |
| deku_read_vec | 59.2±0.83ns | **36.2±1.25ns** | **-38.85%** |
| deku_write_bits | **195.5±3.21ns** | 218.7±6.83ns | **+11.87%** |
| deku_write_byte | 21.7±0.44ns | **21.2±0.37ns** | **-2.30%** |
| deku_write_enum | 21.8±4.09ns | **19.9±0.62ns** | **-8.72%** |
| deku_write_vec | **292.3±2.84ns** | 417.8±7.28ns | **+42.94%** |

github-actions[bot] commented 11 months ago

Benchmark for c5051ae

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1178.5±19.30ns | **1155.4±10.99ns** | **-1.96%** |
| deku_read_byte | 20.1±0.31ns | **5.4±0.13ns** | **-73.13%** |
| deku_read_enum | 9.5±0.13ns | **2.6±0.07ns** | **-72.63%** |
| deku_read_vec | 54.3±0.40ns | **35.9±0.33ns** | **-33.89%** |
| deku_write_bits | **178.8±3.34ns** | 184.6±6.63ns | **+3.24%** |
| deku_write_byte | 20.9±0.38ns | 21.0±0.40ns | +0.48% |
| deku_write_enum | 20.0±0.26ns | 19.8±0.27ns | -1.00% |
| deku_write_vec | 299.5±2.60ns | **296.9±3.48ns** | **-0.87%** |

github-actions[bot] commented 11 months ago

Benchmark for 2210022

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | **1139.0±33.04ns** | 1246.3±13.63ns | **+9.42%** |
| deku_read_byte | 22.2±0.89ns | **5.3±0.29ns** | **-76.13%** |
| deku_read_enum | 9.4±0.13ns | **2.5±0.05ns** | **-73.40%** |
| deku_read_vec | 53.6±0.67ns | **35.5±0.94ns** | **-33.77%** |
| deku_write_bits | 202.5±6.48ns | **183.1±3.48ns** | **-9.58%** |
| deku_write_byte | 20.9±0.35ns | 21.2±0.43ns | +1.44% |
| deku_write_enum | 20.2±0.44ns | 20.6±0.80ns | +1.98% |
| deku_write_vec | **274.7±4.57ns** | 297.5±4.08ns | **+8.30%** |

github-actions[bot] commented 7 months ago

Benchmark for 9aa4243

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1268.2±11.88ns | **1210.1±13.32ns** | **-4.58%** |
| deku_read_byte | 20.9±0.59ns | **5.3±0.08ns** | **-74.64%** |
| deku_read_enum | 9.4±0.31ns | **2.6±0.05ns** | **-72.34%** |
| deku_read_vec | 58.3±0.92ns | **36.8±0.52ns** | **-36.88%** |
| deku_write_bits | 186.0±4.65ns | **180.6±7.88ns** | **-2.90%** |
| deku_write_byte | 21.1±0.23ns | **20.8±0.18ns** | **-1.42%** |
| deku_write_enum | 20.5±0.30ns | 20.5±0.25ns | 0.00% |
| deku_write_vec | 284.8±4.23ns | 283.4±4.64ns | -0.49% |

github-actions[bot] commented 7 months ago

Benchmark for f704391

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1274.8±12.17ns | **1113.5±19.65ns** | **-12.65%** |
| deku_read_byte | 20.5±0.49ns | **5.6±0.17ns** | **-72.68%** |
| deku_read_enum | 9.2±0.14ns | **2.6±0.06ns** | **-71.74%** |
| deku_read_vec | 58.2±0.85ns | **35.8±0.67ns** | **-38.49%** |
| deku_write_bits | 184.2±3.21ns | **171.5±2.90ns** | **-6.89%** |
| deku_write_byte | **21.2±0.53ns** | 21.7±0.33ns | **+2.36%** |
| deku_write_enum | 20.8±0.39ns | **20.2±0.42ns** | **-2.88%** |
| deku_write_vec | **283.5±2.60ns** | 291.6±13.47ns | **+2.86%** |

github-actions[bot] commented 7 months ago

Benchmark for b328835

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1225.6±13.55ns | **1209.0±38.00ns** | **-1.35%** |
| deku_read_byte | 20.2±0.39ns | **5.2±0.14ns** | **-74.26%** |
| deku_read_enum | 9.5±0.22ns | **2.5±0.07ns** | **-73.68%** |
| deku_read_vec | 58.7±1.20ns | **36.1±0.34ns** | **-38.50%** |
| deku_write_bits | 192.9±5.44ns | **187.2±4.70ns** | **-2.95%** |
| deku_write_byte | **20.8±0.53ns** | 22.4±0.66ns | **+7.69%** |
| deku_write_enum | **20.2±0.47ns** | 21.1±0.29ns | **+4.46%** |
| deku_write_vec | **296.0±4.31ns** | 303.3±5.46ns | **+2.47%** |

github-actions[bot] commented 6 months ago

Benchmark for b6b2884

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1234.2±11.57ns | **1155.3±13.71ns** | **-6.39%** |
| deku_read_byte | 20.5±0.34ns | **5.2±0.15ns** | **-74.63%** |
| deku_read_enum | 9.3±0.15ns | **2.6±0.06ns** | **-72.04%** |
| deku_read_vec | 59.3±0.53ns | **35.3±0.66ns** | **-40.47%** |
| deku_write_bits | **183.4±4.56ns** | 199.0±4.46ns | **+8.51%** |
| deku_write_byte | **21.4±0.36ns** | 22.1±0.52ns | **+3.27%** |
| deku_write_enum | **20.5±0.20ns** | 23.5±0.33ns | **+14.63%** |
| deku_write_vec | 388.0±5.45ns | **291.0±3.19ns** | **-25.00%** |

wcampbell0x2a commented 6 months ago

@sharksforarms rebased, should be good now