ugorji / go

idiomatic codec and rpc lib for msgpack, cbor, json, etc. msgpack.org[Go]
MIT License

Decoding invalid MsgPack strings is not detected. #370

Closed schmidtw closed 2 years ago

schmidtw commented 2 years ago

We ran into an issue today with codec 1.2.6 (also checked against 1.2.7) where invalid MsgPack, containing a string with non-UTF-8 bytes, passes decoding without an error.

Here is the relevant bit showing the invalid string that can be passed:

invalid := []byte{0xac, 0xed, 0xbf, 0xbf /* \xed\xbf\xbf is invalid UTF-8 */, 't', '-', 'a', 'd', 'd', 'r', 'e', 's', 's'}

Decoding succeeds, but calling utf8.ValidString() on the resulting string returns false: the string is not valid UTF-8.

I'm not sure how to write up a test/example to be more helpful but wanted to bring up the potential issue.
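As a starting point, here is a minimal sketch of the check in question, using only the standard library rather than the codec package. decodeFixStr is a hypothetical helper (not part of ugorji/go/codec) that hand-parses a single MsgPack fixstr and rejects a payload that is not valid UTF-8:

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// decodeFixStr is a hypothetical helper, not an API of ugorji/go/codec.
// It decodes one MsgPack fixstr (header byte 0xa0..0xbf, low five bits hold
// the payload length) and returns an error if the payload is not valid UTF-8.
func decodeFixStr(b []byte) (string, error) {
	if len(b) == 0 || b[0]&0xe0 != 0xa0 {
		return "", errors.New("not a fixstr")
	}
	n := int(b[0] & 0x1f) // payload length in bytes
	if len(b) < 1+n {
		return "", errors.New("truncated fixstr")
	}
	payload := b[1 : 1+n]
	if !utf8.Valid(payload) {
		return "", errors.New("fixstr payload is not valid UTF-8")
	}
	return string(payload), nil
}

func main() {
	// Same bytes as in the report: header 0xac announces a 12-byte string.
	invalid := []byte{0xac, 0xed, 0xbf, 0xbf, 't', '-', 'a', 'd', 'd', 'r', 'e', 's', 's'}
	_, err := decodeFixStr(invalid)
	fmt.Println("error:", err)
}
```

The sequence \xed\xbf\xbf would encode a UTF-16 surrogate code point, which is why utf8.Valid rejects it.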

ugorji commented 2 years ago

We do not validate that decoded strings are valid UTF-8.

However, that seems like a worthy feature to add. The workaround is onerous, i.e. walking through the whole decoded value to check every string.
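For illustration, the walk-the-value workaround might look roughly like the sketch below. allStringsValid is a hypothetical helper, and it assumes the value was decoded into generic Go types (string, []interface{}, map[string]interface{}, map[interface{}]interface{}); the container types actually produced depend on how the handle is configured, and typed structs would need reflection instead.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// allStringsValid is a hypothetical helper, not part of the codec package.
// It recursively walks a generically decoded value and reports whether every
// string it finds (including map keys) is valid UTF-8. Other types, such as
// numbers or []byte, are treated as valid and are not inspected.
func allStringsValid(v interface{}) bool {
	switch x := v.(type) {
	case string:
		return utf8.ValidString(x)
	case []interface{}:
		for _, e := range x {
			if !allStringsValid(e) {
				return false
			}
		}
	case map[string]interface{}:
		for k, e := range x {
			if !utf8.ValidString(k) || !allStringsValid(e) {
				return false
			}
		}
	case map[interface{}]interface{}:
		for k, e := range x {
			if !allStringsValid(k) || !allStringsValid(e) {
				return false
			}
		}
	}
	return true
}

func main() {
	decoded := map[string]interface{}{
		"name":  "ok",
		"addrs": []interface{}{"\xed\xbf\xbft-address"},
	}
	fmt.Println(allStringsValid(decoded))
}
```

Every string in the value is visited on each call, which is the cost that makes this workaround onerous for large payloads.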

Let me work on this.

schmidtw commented 2 years ago

Thank you!