Open dsnet opened 2 years ago
An alternative and more flexible API is (per https://github.com/golang/go/issues/54054#issuecomment-1194998778):
// WithIgnored specifies a set of non-alphabet characters that are ignored
// when parsing the input. An empty string causes the encoder to reject
// all characters that are not part of the encoding alphabet.
// A newly created Encoder ignores '\r' and '\n' by default.
func (enc Encoding) WithIgnored(chars string) *Encoding
My original proposal would be equivalent to enc.WithIgnored(""), while #54054 could be accomplished using enc.WithIgnored("\t\v\f \r\n").
Change https://go.dev/cl/532295 mentions this issue: encoding: support WithIgnored in base32 and base64
This feature combined with #53844 makes it possible to implement a truly bijective mapping between baseXX and binary data. This would allow the use of base32 and base64 to produce a truly canonical encoding per RFC 4648, section 3.5.
golang/protobuf#1626 arose because the "google.golang.org/protobuf/encoding/protojson" package implicitly allowed newlines and carriage returns because the default behavior of the "base64" package is to ignore such characters. Having this option to begin with would have avoided that problem.
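The default behavior described above is easy to observe with the standard library as it exists today. The small helper decodes is introduced only for this demonstration.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodes reports whether base64.StdEncoding accepts s.
func decodes(s string) bool {
	_, err := base64.StdEncoding.DecodeString(s)
	return err == nil
}

func main() {
	// CR and LF are skipped by the decoder, so this input is accepted.
	fmt.Println(decodes("aGVs\r\nbG8="))
	// A space is outside the alphabet and is not skipped, so this is rejected.
	fmt.Println(decodes("aGVs bG8="))
}
```

Any caller that passes untrusted strings straight to DecodeString inherits this leniency without opting in.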
I like the idea of WithIgnored, as all sorts of whitespace are common across the various uses of base64. This option also gives the greatest flexibility: strict decoding that accepts nothing but valid characters, the current behavior of ignoring only newlines and carriage returns, or more permissive whitespace ignoring in general.
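Currently, base32 and base64 ignore carriage returns and linefeeds by default. This behavior goes against RFC 4648, section 3.3, which states that implementations must reject encoded data containing characters outside the base alphabet unless the referring specification explicitly states otherwise.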
Rejection of "characters outside the base encoding alphabet" (including carriage returns and line feeds) should be the default, unless specified otherwise by some higher-level specification (e.g., MIME). The decision to allow \r or \n should not have been made by the base32 and base64 packages, but rather by the users of it.
Today, base32 and base64 already ignore \r and \n by default and we can't change that, but we should expose control over this behavior: