Closed zamicol closed 1 year ago
Go has an issue for this that is currently frozen (and I might re-open if this doesn't ping 'em)
Fixed by 98e7068.
Also just discovered a 2022 paper on the issue: "Base64 Malleability in Practice" https://dl.acm.org/doi/10.1145/3488932.3527284
Playground demonstrating the issue:
There's an apparent problem with RFC 4648. There are three places base 64 representation may contain string variation:
What is "canonical encoding"? From the last three characters of the example tmb, "cLj8vs...XNuhOk", the values "hOk" and "hOl" may both decode to the same byte value (in Hex, 84E9) even though they are different UTF-8 values. (Example decoding hOk and hOl.) The canonical encoding is "hOk".
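The collision described above can be reproduced with Go's standard library. This is a minimal sketch, assuming the unpadded URL-safe alphabet (base64.RawURLEncoding) and a hypothetical helper named decodeHex; it is not taken from the Coze code base.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeHex (hypothetical helper) decodes an unpadded base64url string
// with Go's default, permissive decoder and returns upper-case hex.
func decodeHex(s string) (string, error) {
	b, err := base64.RawURLEncoding.DecodeString(s)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%X", b), nil
}

func main() {
	// "k" ends in the bits 00 and "l" ends in the bits 01. Three base 64
	// characters carry 18 bits, only 16 of which are data, so the last
	// two bits are discarded by a permissive decoder.
	for _, s := range []string{"hOk", "hOl"} {
		h, err := decodeHex(s)
		fmt.Println(s, h, err) // both lines print 84E9 <nil>
	}
}
```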
The RFC specifically addresses 1 and 2, but not really 3.
RFC 4648 advises to reject non-alphabet characters, which can include padding. I agree with this advice:
I don't see the RFC really address the third concern.
Behavior
Obviously non-"strict"/non-canonical base 64 encoding is incorrect, and any encoder producing non-strict encoding should be fixed. However, the question is what Coze should specify regarding non-strict encoding/decoding. Both Go and JavaScript are permissive when decoding and do not throw errors.
Ultimately, the concern is that different base 64 encoders/decoders may behave differently. Ideally, Coze should specify the appropriate behavior for Coze. Section 3.5 mentions non-canonical encoding in the context of unpadded data, but this issue is unrelated to padding ("hOk=" and "hOl=", both padded, have the same issue as the unpadded strings).
The concern is that if a Coze implementation used string comparison instead of byte comparison, implementations could disagree about which messages are valid. For example, given a non-strictly encoded tmb string, an implementation that checks tmb before cryptographic verification may compare either the string value or the byte value, and the two comparisons give different results.
Another note for any Coze restriction on encoding: JSON is base 64 unaware, so any Coze-specified enforcement of base 64 encoding can only be applied to Coze-known fields of type b64ut, and cannot be applied generally to any b64ut value.
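To make the string-versus-byte disagreement concrete, here is a small sketch of the two comparison strategies. The function names equalAsStrings and equalAsBytes are hypothetical and the three-character inputs stand in for a full tmb value; this does not describe how any Coze implementation actually compares values.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

// equalAsStrings compares two b64ut values by their text.
func equalAsStrings(a, b string) bool {
	return a == b
}

// equalAsBytes compares two b64ut values by their decoded bytes,
// using Go's default (permissive) decoder.
func equalAsBytes(a, b string) (bool, error) {
	x, err := base64.RawURLEncoding.DecodeString(a)
	if err != nil {
		return false, err
	}
	y, err := base64.RawURLEncoding.DecodeString(b)
	if err != nil {
		return false, err
	}
	return bytes.Equal(x, y), nil
}

func main() {
	fmt.Println(equalAsStrings("hOk", "hOl")) // false
	eq, err := equalAsBytes("hOk", "hOl")
	fmt.Println(eq, err) // true <nil>
}
```

Two implementations that each pick a different one of these functions would disagree about whether the two values match.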
Solutions
There appear to be only two options to handle this:
2 is more conservative, but may require unnecessary checks that don't really add value. 1 has the potential to be more compatible if assuming that systems can decode permissively (other programming languages' base 64 libraries decode permissively), which may be a bad assumption.
Regardless, I believe that 1 is the correct behavior here. Even if languages/systems do not error on non-canonical encoding, an encoding error can be implemented by re-encoding the decoded data and comparing strings.
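The re-encode-and-compare check can be sketched in Go as follows. The names decodeCanonical, decodeStrict, and ErrNonCanonical are hypothetical and chosen for illustration; the unpadded URL-safe alphabet is assumed. The second helper uses the standard library's Strict() mode, which rejects non-zero trailing padding bits.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

// ErrNonCanonical is a hypothetical error value for this sketch.
var ErrNonCanonical = errors.New("non-canonical base64 encoding")

// decodeCanonical decodes s permissively, re-encodes the result, and
// rejects s if the round trip does not reproduce it exactly.
func decodeCanonical(s string) ([]byte, error) {
	b, err := base64.RawURLEncoding.DecodeString(s)
	if err != nil {
		return nil, err
	}
	if base64.RawURLEncoding.EncodeToString(b) != s {
		return nil, ErrNonCanonical
	}
	return b, nil
}

// decodeStrict relies on Go's strict mode instead of a round trip.
func decodeStrict(s string) ([]byte, error) {
	return base64.RawURLEncoding.Strict().DecodeString(s)
}

func main() {
	for _, s := range []string{"hOk", "hOl"} {
		_, errRoundTrip := decodeCanonical(s)
		_, errStrict := decodeStrict(s)
		// "hOk" passes both checks; "hOl" fails both.
		fmt.Println(s, errRoundTrip, errStrict)
	}
}
```

The round-trip approach is portable to languages whose base 64 libraries have no strict mode. In Go it also rejects input containing new lines, which the decoder otherwise skips, because the re-encoded string never contains them.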
Security Considerations
This base 64 decoding bug doesn't appear to be a structural/architectural/security concern, since Coze uses the UTF-8 encoding of the string for signing and verification. However, it is an interesting problem that should be known when working with RFC base 64. Concerning replay attacks specifically, signatures are still not malleable, as payloads are UTF-8 encoded and the signing operation is not base 64 aware.
If Coze used the base 64 representation directly, this would be a security concern and could result in replay attacks.
Notes
It should be obvious, but this situation also applies to the URI unsafe alphabet and to messages with base 64 padding, all of which are interpreted as the same bytes. (My conversion tool only has "base64" as an input and not the various permutations, since all variations can be known (or are irrelevant) and result in the same decoded binary payload.)
RFC 4648
I currently have errata open on one of the relevant sections.
I'm going to implement a non-canonical encoding check on Go and JS Coze.
See also the Go base64 package.
Go's base64 ignores carriage return and new line, so it is malleable, but JSON unmarshal does not, making Go Coze non-malleable (https://go.dev/play/p/X0J74F0zWVf). See also the new line test in base64_test.go.
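The difference between the two layers can be illustrated with a short sketch. It is not the linked playground or the test in base64_test.go; decodeB64 and unmarshalString are hypothetical helpers, and only a raw (unescaped) new line inside the JSON string is exercised.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// decodeB64 decodes unpadded base64url with Go's default decoder,
// which skips carriage return and new line characters.
func decodeB64(s string) ([]byte, error) {
	return base64.RawURLEncoding.DecodeString(s)
}

// unmarshalString parses raw JSON text as a single JSON string.
func unmarshalString(raw []byte) (string, error) {
	var s string
	err := json.Unmarshal(raw, &s)
	return s, err
}

func main() {
	// The new line is skipped, so this decodes like "hOk".
	b, err := decodeB64("hO\nk")
	fmt.Printf("%X %v\n", b, err) // 84E9 <nil>

	// A raw new line is a control character and is not allowed inside
	// a JSON string, so parsing fails before base 64 decoding is reached.
	_, err = unmarshalString([]byte("\"hO\nk\""))
	fmt.Println(err != nil) // true
}
```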