marshallpierce / rust-base64

base64, in rust
Apache License 2.0
606 stars, 113 forks

perf(decode): optimize Vec alloc/resize #179

Open AaronO opened 2 years ago

AaronO commented 2 years ago

Avoid the calloc overhead of zero-initializing a buffer that we immediately write into. This cuts the decode_small_input/decode/3 time by about 33%.

Unfortunately, this requires an unsafe Vec::set_len call so that we can get a mutable reference to the uninitialized portion of the Vec's buffer.
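For readers unfamiliar with the pattern, here is a minimal sketch of the general technique, not the actual diff in this PR. The helper name `append_without_zeroing` is hypothetical. It reserves capacity, writes into the spare (uninitialized) region, and only then calls `set_len`. As an assumption for the sketch, it writes through `Vec::spare_capacity_mut` (which needs a reasonably recent Rust toolchain) instead of forming a `&mut [u8]` over uninitialized bytes.

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper (not part of the base64 crate): appends `src` to `buf`
/// without zero-filling the new region first.
fn append_without_zeroing(buf: &mut Vec<u8>, src: &[u8]) {
    // Make sure there is room for `src.len()` more bytes; this allocates
    // but does not initialize the new capacity.
    buf.reserve(src.len());

    // View the unused capacity as possibly-uninitialized slots and fill
    // exactly the first `src.len()` of them.
    let spare: &mut [MaybeUninit<u8>] = buf.spare_capacity_mut();
    for (slot, &byte) in spare.iter_mut().zip(src) {
        slot.write(byte);
    }

    let new_len = buf.len() + src.len();
    // SAFETY: `reserve` guaranteed capacity >= new_len, and the loop above
    // initialized every byte between the old length and `new_len`.
    unsafe { buf.set_len(new_len) };
}
```

The safe alternative would be something like `buf.resize(new_len, 0)` followed by overwriting the zeros, which is the extra initialization work the PR description refers to.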

Before

decode_small_input/decode/3
                        time:   [56.722 ns 56.776 ns 56.833 ns]
                        thrpt:  [50.341 MiB/s 50.391 MiB/s 50.439 MiB/s]
Found 18 outliers among 100 measurements (18.00%)
  2 (2.00%) high mild
  16 (16.00%) high severe

After

decode_small_input/decode/3
                        time:   [37.232 ns 38.653 ns 41.313 ns]
                        thrpt:  [69.252 MiB/s 74.017 MiB/s 76.843 MiB/s]
                 change:
                        time:   [-34.739% -33.757% -32.200%] (p = 0.00 < 0.05)
                        thrpt:  [+47.492% +50.959% +53.231%]
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high severe

AaronO commented 2 years ago

FYI, this also fixes a "bug" in decode_engine where we over-allocate the buffer:

let mut buffer = Vec::<u8>::with_capacity(input.as_ref().len() * 4 / 3);

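This allocates roughly 4/3 of the input length instead of 3/4 (without accounting for padding, etc.). Decoding turns every 4 input bytes into 3 output bytes, so the output is smaller than the input, not larger.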
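To make the arithmetic concrete, here is a small sketch of a capacity estimate in the decode direction. The function name `decoded_len_upper_bound` is hypothetical and is not the crate's API. It rounds the input up to whole 4-byte groups and allows 3 output bytes per group, so it is an upper bound, not an exact length.

```rust
/// Hypothetical helper (not the crate's API): an upper bound on the number
/// of bytes produced by decoding `encoded_len` bytes of base64 input.
fn decoded_len_upper_bound(encoded_len: usize) -> usize {
    // Each complete group of 4 input bytes decodes to 3 output bytes.
    let full_groups = encoded_len / 4;
    // A trailing partial group (unpadded input) yields at most 3 more bytes.
    let partial = if encoded_len % 4 == 0 { 0 } else { 3 };
    full_groups * 3 + partial
}
```

For an 8-byte input, this gives 6, whereas `8 * 4 / 3` evaluates to 10 with integer division.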

marshallpierce commented 2 years ago

Thanks for this work. However, I'm not sure who the target user would be -- if you're ok with unsafe, the forthcoming AVX2 version will surely outperform it, and if you're not, you'd probably want the existing safe version. 🤔 Good catch on the over-allocation, too!

Nugine commented 2 years ago

Hi. There is another crate, base64-simd, that focuses on performance. It relies heavily on unsafe code but is much faster than the base64 crate.

marshallpierce commented 2 years ago

I have some in-progress work to address https://github.com/marshallpierce/rust-base64/issues/182, after which https://github.com/marshallpierce/rust-base64/pull/170 will be addressed.