marshallpierce / rust-base64

base64, in rust
Apache License 2.0
615 stars 115 forks source link

Simplify chunk_encoder #218

Closed mina86 closed 1 year ago

mina86 commented 1 year ago

Firstly, use [T]::chunks rather than doing index calculation manually.

Secondly, get rid of max_input_length. The function was overly complicated for no reason. After all, no matter whether engine uses padding or not, N-byte output buffer can accommodate (N/4*3) input bytes.

Encode benchmarks here are kind of all over. Some improvements and some regressions:

encode/
           3   29.272 ns   -0.4%   No change
          50   54.162 ns  +12.4%   Regression
         100   95.488 ns   -7.8%   Improvement
         500   255.84 ns   +3.5%   Regression
        3072   1.3527 µs   +0.1%   No change
     3145728   1.4584 ms   +0.2%   Within noise
    10485760   4.8923 ms   +1.6%   Regression
    31457280   16.087 ms   -2.6%   Improvement

encode_display/
           3   41.038 ns   -6.9%   Improvement
          50   70.877 ns   +7.2%   Regression
         100   91.190 ns   +1.1%   Within noise
         500   264.58 ns   -3.1%   Improvement
        3072   1.4515 µs   -3.8%   Improvement
     3145728   1.4239 ms   -0.2%   Within noise
    10485760   4.9768 ms   -0.7%   Within noise
    31457280   18.202 ms   -1.5%   Improvement

encode_reuse_buf/
           3   20.264 ns  -23.1%   Improvement
          50   46.631 ns   +2.1%   Regression
         100   66.405 ns   -2.0%   Improvement
         500   243.52 ns   +3.5%   Regression
        3072   1.3871 µs   -1.2%   Within noise
     3145728   1.4278 ms   +0.8%   Within noise
    10485760   5.0530 ms   -1.6%   Improvement
    31457280   15.798 ms   +0.7%   No change

encode_reuse_buf_stream/
           3   21.007 ns   +1.1%   Regression
          50   56.542 ns   -0.9%   No change
         100   75.214 ns   -1.4%   Improvement
         500   247.61 ns   -4.2%   Improvement
        3072   1.2836 µs   +0.1%   No change
     3145728   1.3177 ms   -0.5%   Within noise
    10485760   4.3843 ms   +0.3%   Within noise
    31457280   14.058 ms   -0.5%   Within noise

encode_slice/
           3   11.859 ns   +0.1%   No change
          50   30.239 ns   +2.8%   Regression
         100   53.383 ns   -6.4%   Improvement
         500   207.90 ns   +0.2%   No change
        3072   1.2050 µs   -0.1%   Within noise
     3145728   1.2294 ms   -0.5%   Within noise
    10485760   4.1025 ms   -0.1%   No change
    31457280   12.352 ms   +0.1%   Within noise

encode_string_reuse_buf_stream/
           3   39.237 ns   -0.9%   Within noise
          50   100.15 ns   -1.6%   Improvement
         100   124.03 ns   +3.5%   Regression
         500   312.15 ns  +10.0%   Regression
        3072   1.4781 µs   +2.7%   Regression
     3145728   1.4942 ms   +3.8%   Regression
    10485760   4.9965 ms   +4.3%   Regression
    31457280   16.990 ms   +7.4%   Regression

encode_string_stream/
           3   48.437 ns   -4.2%   Improvement
          50   152.06 ns   +3.5%   Regression
         100   170.02 ns   -5.2%   Improvement
         500   324.46 ns   -0.8%   Within noise
        3072   1.5332 µs   +1.6%   Regression
     3145728   1.4947 ms   +3.6%   Regression
    10485760   4.9710 ms   +3.2%   Regression
    31457280   19.075 ms   +2.2%   Regression
marshallpierce commented 1 year ago

Thanks for the contribution! Fortunately I didn't see quite as much benchmark variation, and the presence of changes in, say, encode_slice that don't touch this logic would imply that there's a fair amount of baseline noise (which my systems have as well). I've simplified it a little in #224.