Open jimsnab opened 2 years ago
cc @rsc
CC @FiloSottile @agl
This is related to #53845, which proposes the opposite, where it locks down the set of allowed characters. We could modify that proposal to instead be something more versatile like:
// WithIgnored specifies a set of non-alphabet characters that are ignored
// when parsing the input. An empty string causes the encoder to reject
// all characters that are not part of the encoding alphabet.
// A newly created Encoder ignores '\r' and '\n' by default.
func (enc Encoding) WithIgnored(chars string) *Encoding
Thus, for my original purposes for #53845, you could do enc.WithIgnored("")
to be in a strict mode that rejects any non-alphabet characters. On the other hand, for the use-cases of PEM, the package could do enc.WithIgnored("\t\v\f \r\n")
based on the list specified in RFC 7468, section 2.
Even though RFC 2045 says:
Any characters outside of the base64 alphabet are to be ignored in base64-encoded data.
IMO, that's a dangerous semantic. If someone were accidentally using the base64 URL alphabet (RFC 4648, section 5.), which uses '-' and '_' instead of '+' and '/', then that would not be detected as an error, but would be silently parsed as valid data. Ignoring whitespace, on the other hand, has a more obviously correct semantic.
Change https://go.dev/cl/532295 mentions this issue: encoding: support WithIgnored in base32 and base64
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes
What did you do?
Paste valid PEM data in the example above. The whitespace indentation must be included. Typical output:
What did you expect to see?
Take the whitespace out of the PEM and see something like this:
Per RFC 7468 says whitespace SHOULD be ignored. It's not a bug for this module to reject whitespace. But the RFC is absolutely clear that PEMs with whitespace can exist in the world, and is tolerated by some parsers, and I would expect this general purpose module to tolerate the whitespace as well.
Further, RFC 2045 is more explicit and says non-base64 characters MUST be ignored. Since the inner body of the PEM is base64, I would expect whitespace to be ignored by the golang parser.
What did you see instead?
Leading whitespace within the lines of the PEM octets causes the parser to fail.