FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0

Should validate UTF-8 multi-byte validity for short decode path too #239

Closed cowtowncoder closed 3 years ago

cowtowncoder commented 3 years ago

(note: follow-up to #236)

Looks like the "long / slow" decoding path for UTF-8 Strings checks that multi-byte characters do not use invalid encoding patterns, as expected (and as the JSON parser does), but the quick/short path (used when the String value is guaranteed to fit in the buffer, so no bounds checks are needed) does not necessarily verify the same thing: the first byte is checked as expected, but the 2nd through 4th are not. The check should be performed for these cases as well, and we should have basic tests for it too.
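To illustrate the kind of check meant here, below is a minimal standalone sketch. It is not the actual `CBORParser` code: the class name `Utf8ShortPathSketch` and the method `decode` are hypothetical, and it throws `IllegalArgumentException` only to stay self-contained, where a real parser would report a parse error. The point is that every continuation byte (2nd through 4th) must match the bit pattern `10xxxxxx` before its payload bits are used.

```java
// Hypothetical sketch, not the real CBORParser implementation.
final class Utf8ShortPathSketch {
    /**
     * Decodes {@code len} bytes of UTF-8 starting at {@code offset}.
     * The caller guarantees the whole range lies inside {@code buf}
     * (the "short path" assumption), so no buffer refill is needed.
     */
    static String decode(byte[] buf, int offset, int len) {
        StringBuilder sb = new StringBuilder(len);
        int ptr = offset;
        final int end = offset + len;
        while (ptr < end) {
            int c = buf[ptr++] & 0xFF;
            if (c < 0x80) { // single-byte (ASCII)
                sb.append((char) c);
                continue;
            }
            int needed; // number of continuation bytes expected
            if ((c & 0xE0) == 0xC0) {
                needed = 1;
                c &= 0x1F;
            } else if ((c & 0xF0) == 0xE0) {
                needed = 2;
                c &= 0x0F;
            } else if ((c & 0xF8) == 0xF0) {
                needed = 3;
                c &= 0x07;
            } else {
                throw new IllegalArgumentException(
                        "Invalid UTF-8 start byte 0x" + Integer.toHexString(c));
            }
            if (end - ptr < needed) {
                throw new IllegalArgumentException("Truncated UTF-8 sequence");
            }
            for (int i = 0; i < needed; i++) {
                int d = buf[ptr++];
                // The check the short path was skipping:
                // a continuation byte must look like 10xxxxxx.
                if ((d & 0xC0) != 0x80) {
                    throw new IllegalArgumentException(
                            "Invalid UTF-8 middle byte 0x" + Integer.toHexString(d & 0xFF));
                }
                c = (c << 6) | (d & 0x3F);
            }
            if (c > 0x10FFFF) {
                throw new IllegalArgumentException("Code point out of range");
            }
            // Note: overlong encodings are not rejected in this sketch.
            sb.appendCodePoint(c);
        }
        return sb.toString();
    }
}
```

With this sketch, the bytes `0xC3 0xA9` decode to "é", while `0xC3 0x28` is rejected because the second byte is not a valid continuation byte. A basic test for the short path could feed exactly such inputs.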

I also think that since this may uncover existing invalid usage, the change should go in 2.13 and not in a 2.12 patch: that way we can get a bit more testing.

cowtowncoder commented 3 years ago

Methods to check in CBORParser: