Closed wcampbell0x2a closed 2 months ago
For reference, not as perf as it was ;)
+ critcmp stable nightly stable-perf nightly-perf
group               nightly                            nightly-perf                       stable                             stable-perf
-----               -------                            ------------                       ------                             -----------
Deserialize/binrw   1.38  2.8±0.06ms        ? ?/sec    1.00  2.0±0.04ms        ? ?/sec    1.37  2.8±0.10ms        ? ?/sec    1.01  2.1±0.07ms        ? ?/sec
Deserialize/custom  1.21  2.2±0.06ms        ? ?/sec    1.00  1826.9±30.32µs    ? ?/sec    1.21  2.2±0.07ms        ? ?/sec    1.03  1873.1±56.09µs    ? ?/sec
Deserialize/deku    1.11  3.1±0.08ms        ? ?/sec    1.00  2.8±0.05ms        ? ?/sec    1.05  3.0±0.06ms        ? ?/sec    1.01  2.8±0.13ms        ? ?/sec
Serialize/binrw     2.09  5.5±0.21ms        ? ?/sec    1.00  2.7±0.06ms        ? ?/sec    2.19  5.8±0.21ms        ? ?/sec    1.00  2.7±0.10ms        ? ?/sec
Serialize/custom    1.11  1943.4±49.78µs    ? ?/sec    1.02  1790.1±137.87µs   ? ?/sec    1.14  2.0±0.04ms        ? ?/sec    1.00  1752.8±44.16µs    ? ?/sec
Serialize/deku      1.12  3.1±0.17ms        ? ?/sec    1.00  2.8±0.04ms        ? ?/sec    1.10  3.1±0.08ms        ? ?/sec    1.01  2.8±0.14ms        ? ?/sec
before:
Hmmm, my #[inline(never)] hurt the performance of count relative to read_all. I might have to balance this...
read_all_bytes time: [10.605 µs 10.623 µs 10.649 µs]
read_all time: [10.715 µs 10.727 µs 10.741 µs]
count time: [2.5363 µs 2.5426 µs 2.5526 µs]
Or! Just take the improvements and go to sleep ;)
Add Leftover::{Byte, Bits}, so that instead of converting straight away into a Bitvec, we keep the leftover as a slice. For the read_all attribute this improves speed, since reading until EOF no longer makes our reader convert to bits and back again on every iteration.
This does result in some slowdown, but with a few #[inline(never)] attributes we can keep it to a minimum. The total gain in read_all speed is worth the 0.8 ms slowdown I saw in testing.
Closes #439
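For readers skimming the description, here is a rough sketch of the Leftover idea using only std. It is not the code in this PR: `Reader`, `end`, `read_byte`, `read_bits`, `byte_to_bits`, and `read_all_bytes` are hypothetical names, and `Vec<bool>` stands in for whatever bit container the real implementation uses. The point it illustrates is that an EOF check can stash the peeked byte untouched, and bits are only materialized on the out-of-line bit path.

```rust
/// Hypothetical stand-in for the `Leftover` described above; the real type
/// may store a different bit container than `Vec<bool>`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Leftover {
    /// A whole byte that was read ahead but not yet consumed.
    Byte(u8),
    /// Unread bits from a partially consumed byte, most significant first.
    Bits(Vec<bool>),
}

/// Hypothetical reader over an in-memory slice.
pub struct Reader<'a> {
    input: &'a [u8],
    leftover: Option<Leftover>,
}

impl<'a> Reader<'a> {
    pub fn new(input: &'a [u8]) -> Self {
        Reader { input, leftover: None }
    }

    /// Returns true at end of input. The peeked byte is stored as-is, so a
    /// following `read_byte` gets it back without any bit conversion.
    pub fn end(&mut self) -> bool {
        if self.leftover.is_some() {
            return false;
        }
        match self.input.split_first() {
            Some((&byte, rest)) => {
                self.input = rest;
                self.leftover = Some(Leftover::Byte(byte));
                false
            }
            None => true,
        }
    }

    /// Byte-aligned fast path: no bit conversion happens here.
    pub fn read_byte(&mut self) -> Option<u8> {
        match self.leftover.take() {
            Some(Leftover::Byte(byte)) => Some(byte),
            Some(bits) => {
                // Unaligned: put the bits back and give up. A real reader
                // would stitch bits from two bytes together instead.
                self.leftover = Some(bits);
                None
            }
            None => {
                let (&byte, rest) = self.input.split_first()?;
                self.input = rest;
                Some(byte)
            }
        }
    }

    /// Bit-level slow path, kept out of line so it is not inlined into
    /// callers that mostly use `read_byte`.
    #[inline(never)]
    pub fn read_bits(&mut self, n: usize) -> Option<Vec<bool>> {
        let mut bits = match self.leftover.take() {
            Some(Leftover::Bits(bits)) => bits,
            Some(Leftover::Byte(byte)) => byte_to_bits(byte),
            None => Vec::new(),
        };
        // Pull in whole bytes until enough bits are available.
        while bits.len() < n {
            match self.input.split_first() {
                Some((&byte, rest)) => {
                    self.input = rest;
                    bits.extend(byte_to_bits(byte));
                }
                None => {
                    // Not enough input: keep what we have and fail.
                    if !bits.is_empty() {
                        self.leftover = Some(Leftover::Bits(bits));
                    }
                    return None;
                }
            }
        }
        let rest = bits.split_off(n);
        if !rest.is_empty() {
            self.leftover = Some(Leftover::Bits(rest));
        }
        Some(bits)
    }
}

/// Expands one byte into eight bits, most significant bit first.
fn byte_to_bits(byte: u8) -> Vec<bool> {
    (0..8).rev().map(|i| (byte >> i) & 1 == 1).collect()
}

/// Mimics a read_all-style loop: read bytes until end of input.
pub fn read_all_bytes(input: &[u8]) -> Vec<u8> {
    let mut reader = Reader::new(input);
    let mut out = Vec::new();
    while !reader.end() {
        match reader.read_byte() {
            Some(byte) => out.push(byte),
            None => break,
        }
    }
    out
}
```

In this sketch, the loop in `read_all_bytes` alternates `end()` and `read_byte()` and never touches `byte_to_bits`, which is the round trip the description says the change avoids. Whether `#[inline(never)]` helps or hurts a given caller is something only the benchmarks can settle.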