sharksforarms / deku

Declarative binary reading and writing: bit-level, symmetric, serialization/deserialization
Apache License 2.0

Improve performance after `impl-writer` and `impl-reader` #393

Closed wcampbell0x2a closed 2 months ago

wcampbell0x2a commented 6 months ago

This fixes most of the performance issues where rustc, depending on its version and parameters, didn't inline several functions, both generated and implemented in deku.
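For readers unfamiliar with the technique, here is a minimal sketch of what adding inline hints looks like in Rust. `ByteReader` and its methods are hypothetical stand-ins made up for illustration and are not deku's actual API; the real changes are in the PR diff.

```rust
/// Hypothetical stand-in for a small reader helper (not deku's real type).
pub struct ByteReader<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> ByteReader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, pos: 0 }
    }

    /// `#[inline]` is a hint: it makes the body available for inlining in
    /// other crates, but rustc may still decide not to inline it.
    #[inline]
    pub fn read_u8(&mut self) -> Option<u8> {
        let byte = *self.data.get(self.pos)?;
        self.pos += 1;
        Some(byte)
    }

    /// `#[inline(always)]` is a much stronger request, so the outcome depends
    /// less on the compiler version or optimization flags in use.
    #[inline(always)]
    pub fn read_u16_be(&mut self) -> Option<u16> {
        // Check both bytes up front so a short read does not move `pos`.
        let bytes = self.data.get(self.pos..self.pos + 2)?;
        let value = u16::from_be_bytes([bytes[0], bytes[1]]);
        self.pos += 2;
        Some(value)
    }
}
```

Tiny helpers like these are called once per field from derive-generated code, so a missed inline can cost more than the work the helper itself does.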

See https://github.com/wcampbell0x2a/deku-bench/pull/6 for benchmarks. Speaking of which, I will probably add these benchmarks directly into this project at some point.

I actually opened a ticket with rust-lang, since changes in nightly regressed the performance of this crate until this commit was added: https://github.com/rust-lang/rust/issues/118674

This might remove the need for testing: https://github.com/sharksforarms/deku/issues/358, as discussed in https://github.com/sharksforarms/deku/issues/308#issuecomment-1532593054.

See https://github.com/sharksforarms/deku/issues/25

github-actions[bot] commented 6 months ago

Benchmark for f3b1791

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1273.3±14.54ns | **1150.0±38.49ns** | **-9.68%** |
| deku_read_byte | 21.9±0.80ns | **3.6±0.03ns** | **-83.56%** |
| deku_read_enum | 9.4±0.15ns | **2.9±0.04ns** | **-69.15%** |
| deku_read_vec | 57.6±0.81ns | **35.0±0.63ns** | **-39.24%** |
| deku_write_bits | 192.9±3.70ns | 190.4±4.67ns | -1.30% |
| deku_write_byte | **20.6±0.40ns** | 21.1±0.19ns | **+2.43%** |
| deku_write_enum | 19.6±0.19ns | 19.5±0.19ns | -0.51% |
| deku_write_vec | **293.8±1.98ns** | 298.5±2.68ns | **+1.60%** |

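
A note on reading the `%` column: the figures are consistent with the relative change of the PR time against the base time, so negative means the PR is faster. That reading is inferred from the reported numbers, not from the benchmark bot's source. A minimal sketch, with `percent_change` as a hypothetical helper name:

```rust
/// Relative change of the PR timing versus the base timing, in percent.
/// Negative values mean the PR is faster than the base.
/// Assumption: this mirrors how the `%` column is derived; it matches the
/// reported rows but is not taken from the benchmark tooling itself.
fn percent_change(base_ns: f64, pr_ns: f64) -> f64 {
    (pr_ns - base_ns) / base_ns * 100.0
}
```

For example, `percent_change(21.9, 3.6)` is roughly -83.56, which agrees with the `deku_read_byte` row for f3b1791.
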
github-actions[bot] commented 6 months ago

Benchmark for 86622b5

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1246.1±9.95ns | **1230.8±15.15ns** | **-1.23%** |
| deku_read_byte | 21.0±0.37ns | **3.7±0.07ns** | **-82.38%** |
| deku_read_enum | 9.5±0.19ns | **2.9±0.08ns** | **-69.47%** |
| deku_read_vec | 59.1±1.26ns | **34.7±0.33ns** | **-41.29%** |
| deku_write_bits | 194.8±3.47ns | 194.1±3.45ns | -0.36% |
| deku_write_byte | 21.9±0.27ns | 21.7±0.59ns | -0.91% |
| deku_write_enum | 20.9±0.54ns | **20.2±0.26ns** | **-3.35%** |
| deku_write_vec | 296.0±4.59ns | 297.3±5.48ns | +0.44% |

github-actions[bot] commented 6 months ago

Benchmark for 0025238

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | **1224.6±33.99ns** | 1389.9±12.67ns | **+13.50%** |
| deku_read_byte | 21.5±1.02ns | **3.6±0.04ns** | **-83.26%** |
| deku_read_enum | 9.5±0.11ns | **3.0±0.02ns** | **-68.42%** |
| deku_read_vec | 59.2±0.83ns | **36.2±1.25ns** | **-38.85%** |
| deku_write_bits | **195.5±3.21ns** | 218.7±6.83ns | **+11.87%** |
| deku_write_byte | 21.7±0.44ns | **21.2±0.37ns** | **-2.30%** |
| deku_write_enum | 21.8±4.09ns | **19.9±0.62ns** | **-8.72%** |
| deku_write_vec | **292.3±2.84ns** | 417.8±7.28ns | **+42.94%** |

github-actions[bot] commented 6 months ago

Benchmark for c5051ae

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1178.5±19.30ns | **1155.4±10.99ns** | **-1.96%** |
| deku_read_byte | 20.1±0.31ns | **5.4±0.13ns** | **-73.13%** |
| deku_read_enum | 9.5±0.13ns | **2.6±0.07ns** | **-72.63%** |
| deku_read_vec | 54.3±0.40ns | **35.9±0.33ns** | **-33.89%** |
| deku_write_bits | **178.8±3.34ns** | 184.6±6.63ns | **+3.24%** |
| deku_write_byte | 20.9±0.38ns | 21.0±0.40ns | +0.48% |
| deku_write_enum | 20.0±0.26ns | 19.8±0.27ns | -1.00% |
| deku_write_vec | 299.5±2.60ns | **296.9±3.48ns** | **-0.87%** |

github-actions[bot] commented 6 months ago

Benchmark for 2210022

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | **1139.0±33.04ns** | 1246.3±13.63ns | **+9.42%** |
| deku_read_byte | 22.2±0.89ns | **5.3±0.29ns** | **-76.13%** |
| deku_read_enum | 9.4±0.13ns | **2.5±0.05ns** | **-73.40%** |
| deku_read_vec | 53.6±0.67ns | **35.5±0.94ns** | **-33.77%** |
| deku_write_bits | 202.5±6.48ns | **183.1±3.48ns** | **-9.58%** |
| deku_write_byte | 20.9±0.35ns | 21.2±0.43ns | +1.44% |
| deku_write_enum | 20.2±0.44ns | 20.6±0.80ns | +1.98% |
| deku_write_vec | **274.7±4.57ns** | 297.5±4.08ns | **+8.30%** |

github-actions[bot] commented 2 months ago

Benchmark for 9aa4243

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1268.2±11.88ns | **1210.1±13.32ns** | **-4.58%** |
| deku_read_byte | 20.9±0.59ns | **5.3±0.08ns** | **-74.64%** |
| deku_read_enum | 9.4±0.31ns | **2.6±0.05ns** | **-72.34%** |
| deku_read_vec | 58.3±0.92ns | **36.8±0.52ns** | **-36.88%** |
| deku_write_bits | 186.0±4.65ns | **180.6±7.88ns** | **-2.90%** |
| deku_write_byte | 21.1±0.23ns | **20.8±0.18ns** | **-1.42%** |
| deku_write_enum | 20.5±0.30ns | 20.5±0.25ns | 0.00% |
| deku_write_vec | 284.8±4.23ns | 283.4±4.64ns | -0.49% |

github-actions[bot] commented 2 months ago

Benchmark for f704391

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1274.8±12.17ns | **1113.5±19.65ns** | **-12.65%** |
| deku_read_byte | 20.5±0.49ns | **5.6±0.17ns** | **-72.68%** |
| deku_read_enum | 9.2±0.14ns | **2.6±0.06ns** | **-71.74%** |
| deku_read_vec | 58.2±0.85ns | **35.8±0.67ns** | **-38.49%** |
| deku_write_bits | 184.2±3.21ns | **171.5±2.90ns** | **-6.89%** |
| deku_write_byte | **21.2±0.53ns** | 21.7±0.33ns | **+2.36%** |
| deku_write_enum | 20.8±0.39ns | **20.2±0.42ns** | **-2.88%** |
| deku_write_vec | **283.5±2.60ns** | 291.6±13.47ns | **+2.86%** |

github-actions[bot] commented 2 months ago

Benchmark for b328835

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1225.6±13.55ns | **1209.0±38.00ns** | **-1.35%** |
| deku_read_byte | 20.2±0.39ns | **5.2±0.14ns** | **-74.26%** |
| deku_read_enum | 9.5±0.22ns | **2.5±0.07ns** | **-73.68%** |
| deku_read_vec | 58.7±1.20ns | **36.1±0.34ns** | **-38.50%** |
| deku_write_bits | 192.9±5.44ns | **187.2±4.70ns** | **-2.95%** |
| deku_write_byte | **20.8±0.53ns** | 22.4±0.66ns | **+7.69%** |
| deku_write_enum | **20.2±0.47ns** | 21.1±0.29ns | **+4.46%** |
| deku_write_vec | **296.0±4.31ns** | 303.3±5.46ns | **+2.47%** |

github-actions[bot] commented 2 months ago

Benchmark for b6b2884

| Test | Base | PR | % |
|------|--------------|------------------|---|
| deku_read_bits | 1234.2±11.57ns | **1155.3±13.71ns** | **-6.39%** |
| deku_read_byte | 20.5±0.34ns | **5.2±0.15ns** | **-74.63%** |
| deku_read_enum | 9.3±0.15ns | **2.6±0.06ns** | **-72.04%** |
| deku_read_vec | 59.3±0.53ns | **35.3±0.66ns** | **-40.47%** |
| deku_write_bits | **183.4±4.56ns** | 199.0±4.46ns | **+8.51%** |
| deku_write_byte | **21.4±0.36ns** | 22.1±0.52ns | **+3.27%** |
| deku_write_enum | **20.5±0.20ns** | 23.5±0.33ns | **+14.63%** |
| deku_write_vec | 388.0±5.45ns | **291.0±3.19ns** | **-25.00%** |

wcampbell0x2a commented 2 months ago

@sharksforarms rebased, should be good now