Closed soywod closed 2 years ago
The standard Base64 alphabet consists only of the ASCII characters `a-zA-Z0-9/+`, with `=` for padding. Any other bytes are invalid. That web site is probably just ignoring errors rather than reporting them. `\u{200c}` is the "zero width non-joiner" character (U+200C). It is not valid base64, and every occurrence must be removed if you want to base64 decode. Also, it's invalid to have 4 padding characters (only `=` or `==` are valid). Whatever is generating that base64 is doing some pretty weird stuff...
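If the input cannot be fixed at the source, one lenient option is to clean it before decoding. The following is only a sketch using the standard library: `sanitize_base64` and `is_base64_alphabet` are hypothetical helper names, not part of any crate, and the approach assumes the input is a single base64 value (it discards every `=`, so it would hide padding that appears in the middle of concatenated values).

```rust
/// Returns true for characters in the standard base64 alphabet (padding excluded).
fn is_base64_alphabet(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '+' || c == '/'
}

/// Hypothetical helper: drops everything outside the standard alphabet
/// (zero-width characters such as U+200C, whitespace, and any `=`), then
/// re-appends the padding implied by the remaining length.
/// Returns None when the remaining length cannot be valid base64.
fn sanitize_base64(input: &str) -> Option<String> {
    let mut cleaned: String = input
        .chars()
        .filter(|c| is_base64_alphabet(*c))
        .collect();

    // Every 4 symbols encode 3 bytes; a trailing group of 2 or 3 symbols
    // needs `==` or `=` respectively. A trailing group of 1 is never valid.
    match cleaned.len() % 4 {
        0 => {}
        2 => cleaned.push_str("=="),
        3 => cleaned.push('='),
        _ => return None,
    }
    Some(cleaned)
}
```

For instance, `sanitize_base64("aGVs\u{200c}bG8====")` yields `Some("aGVsbG8=")`, which is the encoding of `hello`; the result can then be handed to a strict decoder.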
Thank you for your reply, I understand better now. I use your lib in an RFC 2047 decoder (used by an email client), and one user reported this parsing error to me (it comes from an email subject). The email domain is a wild jungle…
Here is the encoded string:
When I try to decode it with this function:
I get the error `Err(InvalidByte(126, 61))` and I cannot determine why. Online tools like https://www.base64decode.org/ seem to be able to decode the string:
And if I manually remove the last `==` from the string, I can also decode it with your lib. Any idea what is going on with this string?
PS: I noticed that the string is strangely built; it looks like it contains Unicode characters (could that be the cause?). Rust prints it this way: