aklomp / base64

Fast Base64 stream encoder/decoder in C99, with SIMD acceleration
BSD 2-Clause "Simplified" License
865 stars 162 forks source link

bug: out-of-bounds read when using inline assembly code path #111

Closed mayeut closed 1 year ago

mayeut commented 1 year ago

While upgrading base64 in a project I'm working on, valgrind started to report out-of-bounds read in the optimized encoding functions. Those are now using the inline assembly code path instead of the intrinsics code path.

A draft PR with a reproducer will follow (I do not plan to fix this though).

aklomp commented 1 year ago

This is probably due to choosing the wrong cutoff value for the source length. I'll look into it later. Thanks for reporting.

aklomp commented 1 year ago

Found the bug. It's in this calculation. Note how it differs (mistakenly) from the equivalent calculation in the non-asm code.

When calculating the number of rounds that the SIMD encoder can complete, we need to adjust for the fact that rounds consume 12 bytes of inputs, but the read size is 16 bytes. That's why we should subtract four bytes from the input size to ensure that a 16-byte read will not extend past the input buffer.

I'll fix and merge this after I review your pull request. The changes to CI in the pull request should validate the fix.

aklomp commented 1 year ago

The bug occurred because the first encoder to be converted to inline asm was the AVX2 encoder. That codec treats the first round specially by reading it at an offset of +0 bytes, while all following rounds are read at an offset of -4 bytes. This led me to believe that subtracting the four bytes only applied to that codec, and not to AVX and SSSE3, which is why I left it out from those codecs. I've verified locally that that fixes the valgrind warnings.