Closed bruchar1 closed 2 years ago
The return value of base64_decode
should be independent of the data being processed (obviously). If the data is decoded successfully, then base64_decode
should always return 1
. If what you state is correct, then this is a pretty big bug.
~What susprises me is that roundtrip encoding and decoding of 0x00
is part of the test suite.~ Edit: that part of the test suite does not include the null terminator in its test.
I also cannot reproduce this bug with the following command line:
echo -ne 'hello\0world' | bin/base64 | bin/base64 -d >/dev/null; echo $?
This prints 0
(success) as it should.
So either this is an obscure bug hiding in one of the optimized SIMD decoders, or something else is going on.
Which codec are you using? Does the test suite run correctly on your machine?
Can you post any code to reproduce the issue?
I added a very quick and dirty test to do roundtrip encoding on data with \0x00
in it. All tests in CI pass.
I figured it out. My allocated buffer for encoding was not big enough, and it caused the last padding =
to be missing.
I guess the readme is missleading:
The buffer in out has been allocated by the caller and is at least 4/3 the size of the input.
The right buffer size is probably (n + 2) / 3 * 4
in integer arithmetic.
TBH I think it would make sense to add two macros or inline functions to do buffer size calculations. As an added bonus it'd be way more readable than to repeat the size calculation everywhere.
it caused the last padding
=
to be missing.
While we are at it, I think a flag to encode/decode without padding would make sense, too. See also RFC4648 section 3.2.
Agreed that it makes sense for the library to expose a macro to do the buffer size calculation.
I've been hesitant to add one (or to publish the "ultimate buffer sizing formula" in the sense of the post above) because I'm not positive that all encoders will always stay within the bounds of such a buffer. The SIMD encoders write 16 bytes of output at a time.
Then again, those SIMD encoders always operate on a full, guaranteed 12 bytes of input, so their output should always fit into the output buffer if it's sized according to something like the formula above.
Regarding the suggestion of adding a flag to disable padding: the suggestion makes sense, but I'm wary of adding more behavioral flags at this point. I'd first like to do a comprehensive rewrite to sort out various pain points, and rethink the architecture of this library first.
I created #116 to discuss the addition of size macros.
It seems that if I encode then decode data containing
<NUL>
characters, the data is correctly encoded and decoded, butbase64_decode
returns 0 instead of 1. Is that the expected behavior?