It looks like the "long / slow" decoding path for UTF-8 Strings checks that multi-byte characters do not contain invalid encoding patterns, as expected (and as the JSON parser does). The quick/short path (used when the String value is guaranteed to fit in the buffer, so no bounds checks are needed) does not necessarily verify this in the same way: the first byte is checked as expected, but the 2nd through 4th bytes are not. The check should be performed for these cases as well, and we should add basic tests for it.
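To illustrate the kind of check that is missing, here is a minimal standalone sketch. It is not the parser's actual code: the class name `Utf8ContinuationCheck` and the method `decodeShort` are hypothetical, and the sketch only assumes that the caller has already guaranteed the whole value is in the buffer. It verifies that every 2nd through 4th byte has the `10xxxxxx` continuation pattern, but it does not attempt to reject overlong encodings or encoded surrogates.

```java
// Hypothetical illustration only; names are not taken from the actual parser.
final class Utf8ContinuationCheck {

    // A UTF-8 continuation byte always has the bit pattern 10xxxxxx.
    static boolean isContinuation(int b) {
        return (b & 0xC0) == 0x80;
    }

    // Decodes len bytes starting at offset, assuming all of them are
    // already available in buf (the "quick/short" situation).
    static String decodeShort(byte[] buf, int offset, int len) {
        char[] out = new char[len]; // never more chars than input bytes
        int outPtr = 0;
        int end = offset + len;
        int i = offset;
        while (i < end) {
            int c = buf[i++] & 0xFF;
            if (c < 0x80) { // single-byte (ASCII)
                out[outPtr++] = (char) c;
                continue;
            }
            int needed; // number of continuation bytes that must follow
            if ((c & 0xE0) == 0xC0) {
                needed = 1;
                c &= 0x1F;
            } else if ((c & 0xF0) == 0xE0) {
                needed = 2;
                c &= 0x0F;
            } else if ((c & 0xF8) == 0xF0) {
                needed = 3;
                c &= 0x07;
            } else {
                throw new IllegalArgumentException(
                        "Invalid UTF-8 start byte 0x" + Integer.toHexString(c));
            }
            if (end - i < needed) {
                throw new IllegalArgumentException("Truncated UTF-8 sequence");
            }
            for (int k = 0; k < needed; k++) {
                int d = buf[i++] & 0xFF;
                // The check in question: validate bytes 2 to 4 as well
                if (!isContinuation(d)) {
                    throw new IllegalArgumentException(
                            "Invalid UTF-8 continuation byte 0x" + Integer.toHexString(d));
                }
                c = (c << 6) | (d & 0x3F);
            }
            if (needed == 3) {
                if (c > 0x10FFFF) {
                    throw new IllegalArgumentException("Code point out of range");
                }
                // Code points above U+FFFF become a surrogate pair
                c -= 0x10000;
                out[outPtr++] = (char) (0xD800 | (c >> 10));
                out[outPtr++] = (char) (0xDC00 | (c & 0x3FF));
            } else {
                out[outPtr++] = (char) c;
            }
        }
        return new String(out, 0, outPtr);
    }
}
```

A basic test along these lines would feed in a two-byte sequence whose second byte is not a continuation byte (for example `0xC3 0x28`) and expect a failure, alongside round-trip checks for valid 2-, 3- and 4-byte characters.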
I also think that since this may uncover existing invalid usage, the change should go in 2.13 and not in a 2.12 patch: that way we can get a bit more testing.
(note: follow-up to #236)