Open wumpus opened 5 years ago
Good catch. While the examples and most implementations use base32 (which doesn't include "/"), the padding character for base32 is also "=", so it's indeed a problem there too.
@wumpus, so that we can turn this issue into a change proposal for WARC 1.2, is there a better definition for digest-value you'd like to propose?
https://tools.ietf.org/html/rfc4648 is kind of hand-waving, but the union of all of the recommended schemes is
A-Za-z0-9/+-_=
Percent encoding is mentioned once, and "~" and "." are mentioned but argued against, so it's not clear whether they are allowed. It's as if the RFC was written to be non-normative.
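For reference, here is a rough sketch of how that union falls out of the alphabets listed in RFC 4648. The constant and function names are made up for illustration, and this only covers the base64, base64url, base32, base32hex and base16 alphabets plus the "=" pad, not "~", "." or percent encoding.

```python
import string

# Alphabets as listed in RFC 4648; the pad character is kept separately.
# All names here are illustrative, not taken from any WARC library.
BASE64 = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
BASE64URL = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"
BASE32 = string.ascii_uppercase + "234567"
BASE32HEX = string.digits + "ABCDEFGHIJKLMNOPQRSTUV"
BASE16 = string.digits + "ABCDEF"
PAD = "="


def rfc4648_union():
    """Return the set of every character used by any of the alphabets above."""
    chars = set(PAD)
    for alphabet in (BASE64, BASE64URL, BASE32, BASE32HEX, BASE16):
        chars |= set(alphabet)
    return chars


# Characters in the union that are not plain letters or digits.
non_alnum = "".join(sorted(rfc4648_union() - set(string.ascii_letters + string.digits)))
print(non_alnum)
```

So beyond letters and digits, a digest-value definition covering all of these would need to admit five extra characters.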
This is also a 1.0/1.1 erratum, not just a proposal for the future.
This issue should be labeled with the "WARC/1.1-possible-errata" label @ato
Ah yes, good point
1.0 and 1.1 specify that digest-value is a token. "/" and "=" are not valid characters for a token. "/" is in the usual base64 encoding, and "=" is commonly used for padding.
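To make the conflict concrete, here is a minimal sketch of a token check. It assumes the HTTP/1.1-style definition of token (any US-ASCII character except control characters and separators); is_token and SEPARATORS are hypothetical names, not part of any WARC tool.

```python
import base64
import hashlib

# Separator characters under the assumed HTTP/1.1-style token definition,
# including space and horizontal tab.
SEPARATORS = set('()<>@,;:\\"/[]?={} \t')


def is_token(value: str) -> bool:
    """True if value is non-empty and every character is a printable
    US-ASCII character that is not a separator."""
    return bool(value) and all(
        32 < ord(ch) < 127 and ch not in SEPARATORS for ch in value
    )


payload = b"example"
candidates = {
    "sha1 base32": base64.b32encode(hashlib.sha1(payload).digest()).decode(),
    "sha1 base64": base64.b64encode(hashlib.sha1(payload).digest()).decode(),
    "sha256 base32": base64.b32encode(hashlib.sha256(payload).digest()).decode(),
}
for label, digest_value in candidates.items():
    print(label, digest_value, is_token(digest_value))
```

A base32 SHA-1 happens to pass because 160 bits divide evenly into 5-bit groups, so no padding is emitted. The base64 SHA-1 and the base32 SHA-256 both end in "=" and fail the check.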