marshallpierce / rust-base64

base64, in rust
Apache License 2.0
615 stars 115 forks source link

perf(encode): vec alloc & utf8 unchecked #180

Open AaronO opened 2 years ago

AaronO commented 2 years ago

Improves throughput by 10-20%

We avoid zeroing the allocated vec since we'll write into it and we don't need to check utf8 when building the String since all alphabets emit ascii/utf8

Benches

❯ rustup run nightly cargo bench -- '^encode_small_input/encode/'
   Compiling base64 v0.20.0-alpha.1 (/Users/aaron/git/rust-base64)
    Finished bench [optimized + debuginfo] target(s) in 3.43s
     Running unittests (target/release/deps/base64-a91f049e61d3804b)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 105 filtered out; finished in 0.00s

     Running unittests (target/release/deps/benchmarks-0b92500c69ee3caa)
WARNING: HTML report generation will become a non-default optional feature in Criterion.rs 0.4.0.
This feature is being moved to cargo-criterion (https://github.com/bheisler/cargo-criterion) and will be optional in a future version of Criterion.rs. To silence this warning, either switch to cargo-criterion or enable the 'html_reports' feature in your Cargo.toml.

Gnuplot not found, using plotters backend
encode_small_input/encode/3
                        time:   [31.275 ns 31.304 ns 31.337 ns]
                        thrpt:  [91.300 MiB/s 91.394 MiB/s 91.480 MiB/s]
                 change:
                        time:   [-12.333% -12.192% -12.057%] (p = 0.00 < 0.05)
                        thrpt:  [+13.710% +13.884% +14.068%]
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
encode_small_input/encode/50
                        time:   [46.873 ns 46.923 ns 46.977 ns]
                        thrpt:  [1015.0 MiB/s 1016.2 MiB/s 1017.3 MiB/s]
                 change:
                        time:   [-10.129% -9.9667% -9.8193%] (p = 0.00 < 0.05)
                        thrpt:  [+10.889% +11.070% +11.271%]
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe
encode_small_input/encode/100
                        time:   [62.462 ns 62.484 ns 62.520 ns]
                        thrpt:  [1.4896 GiB/s 1.4905 GiB/s 1.4910 GiB/s]
                 change:
                        time:   [-12.490% -12.286% -12.124%] (p = 0.00 < 0.05)
                        thrpt:  [+13.797% +14.007% +14.273%]
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  4 (4.00%) high mild
  9 (9.00%) high severe
encode_small_input/encode/500
                        time:   [214.46 ns 214.56 ns 214.69 ns]
                        thrpt:  [2.1690 GiB/s 2.1703 GiB/s 2.1713 GiB/s]
                 change:
                        time:   [-11.604% -11.382% -11.148%] (p = 0.00 < 0.05)
                        thrpt:  [+12.547% +12.843% +13.128%]
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  4 (4.00%) high mild
  9 (9.00%) high severe
encode_small_input/encode/3072
                        time:   [1.0080 us 1.0097 us 1.0116 us]
                        thrpt:  [2.8282 GiB/s 2.8334 GiB/s 2.8382 GiB/s]
                 change:
                        time:   [-16.244% -16.081% -15.926%] (p = 0.00 < 0.05)
                        thrpt:  [+18.942% +19.163% +19.394%]
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild