dotnet / dotNext

Next generation API for .NET
https://dotnet.github.io/dotNext/
MIT License
1.58k stars 118 forks source link

Base64Decoder does not detect aligned internal padding #169

Closed alxu-avpt closed 1 year ago

alxu-avpt commented 1 year ago
var decoder = new DotNext.Buffers.Text.Base64Decoder();
decoder.DecodeFromUtf16("AA==");
decoder.DecodeFromUtf16("AA==");

Expected behavior: an exception is thrown

Actual behavior: no exception is thrown

The best solution is probably to create member bool gotPadding or similar, set it if buffer[^1] == '=', and throw exception if gotPadding is set when starting decoding. Also related, it's possible to save some memory using forbidden states in reservedBuffer, examples:

  1. find first null byte/char instead of storing buffer size separately (null bytes are illegal in base64 input anyways)
  2. use -1 to mean gotPadding = true (end of input stream)
  3. create separate Utf8Base64Decoder and Utf16Base64Decoder structs holding 4 bytes and 8 bytes buffer
sakno commented 1 year ago

Correct, but Base64Decoder is designed for best performance. It's expected that the input stream is a valid base64 string. I understand that A==A== is not a valid base64 string.

Checking that buffer[^1] == '=' is not enough. The padding can be in the middle of the string. Thus, I have to check the presence of = character for each call of DecodeFromUtf16, which is O(n) operation that slows down the decoding of the stream.

sakno commented 1 year ago

Also, the following code is valid as well:

var decoder = new DotNext.Buffers.Text.Base64Decoder();
var owner = decoder.DecodeFromUtf16("AA=");
owner.Dispose();
Assert.True(decoder.NeedMoreData);

owner = decoder.DecodeFromUtf16("=");
owner.Dispose();
sakno commented 1 year ago

Anyway, I'll try to implement this in performant way.

sakno commented 1 year ago

Done, by the cost of extra branching in some methods. I think this is acceptable overhead. Now it's working both for Utf8 and Utf16 encodings.

alxu-avpt commented 1 year ago

Thanks!

sakno commented 1 year ago

Fixed in 4.12.2 and published.