denoland / fastwebsockets

A fast RFC6455 WebSocket implementation
https://docs.rs/fastwebsockets/
Apache License 2.0

perf: use plain iterator for payload unmasking #81

Closed: lucab closed this 5 months ago

lucab commented 5 months ago

This tweaks the unmask_easy logic to use an enumerating iterator directly. That lets the compiler generate better machine code, because the iteration happens internally over the input slice instead of going through manual indexing.
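For readers who have not seen the pattern, here is a minimal sketch of the difference. It is not the crate's actual code: the function names `unmask_indexed` and `unmask_iter` are hypothetical, and the "before" shape is only an assumption about what an index-based loop might look like. Both functions XOR each payload byte with the 4-byte mask key, as RFC 6455 requires.

```rust
/// Hypothetical "before" shape: index-based loop over the payload.
/// Each `payload[i]` access is a bounds-checked index expression.
fn unmask_indexed(payload: &mut [u8], mask: [u8; 4]) {
    for i in 0..payload.len() {
        payload[i] ^= mask[i & 3];
    }
}

/// Hypothetical "after" shape: enumerate over a mutable iterator.
/// The iterator yields each byte directly, so no payload indexing is needed.
fn unmask_iter(payload: &mut [u8], mask: [u8; 4]) {
    for (i, byte) in payload.iter_mut().enumerate() {
        *byte ^= mask[i & 3];
    }
}

fn main() {
    // Mask key and text taken from the masked "Hello" example in RFC 6455.
    let mask = [0x37, 0xfa, 0x21, 0x3d];
    let original = b"Hello".to_vec();

    let mut a = original.clone();
    let mut b = original.clone();
    unmask_indexed(&mut a, mask);
    unmask_iter(&mut b, mask);

    // Both variants produce the same bytes.
    assert_eq!(a, b);
    assert_eq!(b, vec![0x7f, 0x9f, 0x4d, 0x51, 0x58]);

    // XOR masking is its own inverse: applying it again restores the input.
    unmask_iter(&mut b, mask);
    assert_eq!(b, original);
}
```

Whether the iterator form is actually faster depends on the compiler version and target, so it is worth confirming with a benchmark rather than assuming.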

lucab commented 5 months ago

On my Linux laptop with AMD Ryzen 7, this shows a small but reliable micro-benchmark improvement:

Finished release [optimized] target(s) in 16.03s
Running benches/unmask.rs (target/release/deps/unmask-c8256fdf566f5dbf)

unmask2/unmask 64 << 20 time:   [2.3568 ms 2.3596 ms 2.3641 ms]
                        thrpt:  [26.437 GiB/s 26.488 GiB/s 26.519 GiB/s]
                 change:
                        time:   [-8.4860% -7.2746% -6.0839%] (p = 0.00 < 0.05)
                        thrpt:  [+6.4780% +7.8453% +9.2729%]
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe