marshallpierce / rust-base64

base64, in rust
Apache License 2.0

InvalidLastSymbol - Valid Base64? #193

Closed inzanez closed 2 years ago

inzanez commented 2 years ago

Hi

I get an error that says InvalidLastSymbol(25, 87) when decoding this (valid?!) base64: UG9ydGFsZSBIYWNraW5nVGVhbW==. Is there something I am missing here?

marshallpierce commented 2 years ago

Have you confirmed that, for the alphabet you're using, https://docs.rs/base64/0.13.0/base64/enum.DecodeError.html#variant.InvalidLastSymbol is not applicable? W in common alphabets would be 22 decimal or 0b00010110. In a 2-symbol trailing quad the second symbol only contributes its top 2 bits, so its low 4 bits must be zero; W has 0110 there, which is too many bits set to be a valid second symbol. Some buggy base64 encoders produce this sort of invalid symbol in the last position.
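To make the bit argument above concrete, here is a minimal std-only sketch of the check, assuming the standard RFC 4648 alphabet. The helper names standard_index and invalid_last_symbol are made up for illustration and are not part of the base64 crate's API; the crate's real decoder does considerably more than this.

```rust
/// Index of `symbol` in the standard base64 alphabet, if it belongs to it.
/// (Hypothetical helper, not a base64 crate function.)
fn standard_index(symbol: u8) -> Option<u8> {
    match symbol {
        b'A'..=b'Z' => Some(symbol - b'A'),
        b'a'..=b'z' => Some(symbol - b'a' + 26),
        b'0'..=b'9' => Some(symbol - b'0' + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

/// Returns (offset, byte) of the last data symbol when it has leftover bits
/// set, mirroring the shape of InvalidLastSymbol(offset, byte).
/// Returns None when the last symbol is fine or the input is something this
/// sketch does not handle (non-alphabet last symbol, bad length).
fn invalid_last_symbol(encoded: &str) -> Option<(usize, u8)> {
    // Ignore trailing '=' padding and look only at the data symbols.
    let symbols = encoded.trim_end_matches('=').as_bytes();
    let last = *symbols.last()?;
    let offset = symbols.len() - 1;

    // A trailing group of 2 symbols carries 12 bits for 1 byte (4 unused);
    // a trailing group of 3 symbols carries 18 bits for 2 bytes (2 unused).
    let unused_bits: u32 = match symbols.len() % 4 {
        2 => 4,
        3 => 2,
        _ => return None,
    };

    let index = standard_index(last)?;
    let mask: u8 = (1u8 << unused_bits) - 1;
    if index & mask != 0 {
        Some((offset, last))
    } else {
        None
    }
}
```

Calling invalid_last_symbol on the string from this issue flags the W at offset 25 (byte value 87), while the same string ending in bQ== passes.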

inzanez commented 2 years ago

@marshallpierce not quite sure if I can follow you. I got that string from an email header (a Microsoft msg based email header), which is an RFC 2047 representation of non-ASCII characters in an email header: =?utf-8?B?UG9ydGFsZSBIYWNraW5nVGVhbW==?=

Following RFC 2047, the part UG9ydGFsZSBIYWNraW5nVGVhbW== is the encoded string. After not being able to decode that, I tried built-in Linux tools like base64 -d and validators like https://base64.guru/tools/validator, and since none of them reported an error I was under the impression that it must be valid base64 :-) Which might not be the case, I guess.
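For readers unfamiliar with the =?charset?encoding?encoded-text?= layout mentioned above, here is a rough std-only sketch of pulling the three fields apart. The EncodedWord type and split_encoded_word function are hypothetical names for illustration; this handles a single encoded-word only and skips the other rules in RFC 2047 (length limits, language tags, multiple words).

```rust
/// The three fields of a single RFC 2047 encoded-word.
/// (Illustrative type, not taken from any library.)
#[derive(Debug, PartialEq)]
struct EncodedWord<'a> {
    charset: &'a str,
    encoding: &'a str,
    text: &'a str,
}

/// Splits "=?charset?encoding?encoded-text?=" into its fields.
/// Returns None if the delimiters are missing or a field is malformed.
fn split_encoded_word(word: &str) -> Option<EncodedWord<'_>> {
    // Strip the outer "=?" and "?=" markers.
    let inner = word.strip_prefix("=?")?.strip_suffix("?=")?;

    // The remainder is charset ? encoding ? encoded-text.
    let mut parts = inner.splitn(3, '?');
    let charset = parts.next()?;
    let encoding = parts.next()?;
    let text = parts.next()?;

    // The encoded text itself must not contain '?'.
    if charset.is_empty() || encoding.is_empty() || text.contains('?') {
        return None;
    }
    Some(EncodedWord { charset, encoding, text })
}
```

For the header in this thread, the text field comes out as UG9ydGFsZSBIYWNraW5nVGVhbW== with encoding B, which is the string that then has to go through a base64 decoder.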

inzanez commented 2 years ago

@marshallpierce ok, digging into the code... I see what you mean! Thanks, that solves the issue :-)

marshallpierce commented 2 years ago

echo -n 'UG9ydGFsZSBIYWNraW5nVGVhbW==' | base64 -d | cargo run --example base64 -- produces UG9ydGFsZSBIYWNraW5nVGVhbQ==. Note the trailing Q: it sits 6 characters before W in the alphabet (16 rather than 22), which removes that pesky trailing 0110. The base64 command line tool does not check for invalid last symbols.
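The decode-then-re-encode round trip above amounts to clearing the leftover bits in the last symbol. If you need to accept input from such encoders, one possible workaround is to do that clearing before handing the string to a strict decoder. The following is only a sketch, assuming the standard alphabet; canonicalize_last_symbol is a made-up helper name, not something the base64 crate provides.

```rust
/// Standard base64 alphabet (RFC 4648), indexed by 6-bit value.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Returns a copy of `encoded` whose last data symbol has its unused
/// trailing bits cleared (e.g. W -> Q in a 2-symbol trailing group).
/// Returns None if the length is impossible or the last symbol is not in
/// the standard alphabet. (Hypothetical helper for illustration.)
fn canonicalize_last_symbol(encoded: &str) -> Option<String> {
    // Number of data symbols, ignoring trailing '=' padding.
    let data_len = encoded.trim_end_matches('=').len();

    // 2 trailing symbols leave 4 unused bits, 3 trailing symbols leave 2.
    let unused_bits = match data_len % 4 {
        0 => return Some(encoded.to_string()), // no partial group
        2 => 4,
        3 => 2,
        _ => return None, // a single trailing symbol cannot encode a byte
    };

    let mut bytes = encoded.as_bytes().to_vec();
    let last = bytes[data_len - 1];
    let index = ALPHABET.iter().position(|&s| s == last)?;

    // Zero the low bits that carry no data and map back to a symbol.
    let cleared = index & !((1usize << unused_bits) - 1);
    bytes[data_len - 1] = ALPHABET[cleared];
    String::from_utf8(bytes).ok()
}
```

Applied to the string from this issue, the helper turns the trailing bW== into bQ== and leaves an already canonical string unchanged. Whether silently repairing input like this is acceptable depends on the application; rejecting it, as the crate does by default, is the stricter choice.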