sheredom / json.h

🗄️ single header json parser for C and C++
The Unlicense

Parsing and serialization of control characters #103

Open j-moeller opened 1 month ago

j-moeller commented 1 month ago

Hello,

We found that json.h parses and serializes unescaped control characters below 0x20, which technically violates the JSON grammar. We collected a minimal working example here.

sheredom commented 1 month ago

Cannot see the sample (it 404s for me). Can you point me at the part of the JSON grammar that the lib is violating, by any chance? Happy to have this fixed, but just wanna know where it says so!

j-moeller commented 1 month ago

Sorry, the repository was still set to "private". It should be public now.

I am referencing Section 7 "Strings" from (https://datatracker.ietf.org/doc/html/rfc8259):

All Unicode characters may be placed within the quotation marks, except for the characters that MUST be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).

If my understanding of this is correct, the control characters U+0000 through U+001F must be written as "\u0000" through "\u001f" to be valid JSON (except for U+0008, U+000C, U+000A, U+000D, and U+0009, which may also be written as "\b", "\f", "\n", "\r", and "\t").
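To illustrate the rule as I read it, here is a minimal sketch in C. The helper name `first_raw_control` is hypothetical and not part of json.h. It scans the bytes between the quotation marks of a JSON string and reports the first raw byte below 0x20. An escaped form such as `\n` in the JSON text is two bytes (a backslash and `n`), both at or above 0x20, so it is not flagged.

```c
#include <stddef.h>

/* Hypothetical helper, not part of json.h.
   Scans the body of a JSON string (the bytes between the quotation marks)
   and returns the index of the first raw control character (byte < 0x20).
   Returns len if no such byte is present. */
static size_t first_raw_control(const char *s, size_t len) {
  size_t i;
  for (i = 0; i < len; i++) {
    /* Cast to unsigned char so UTF-8 bytes >= 0x80 never compare as negative. */
    if ((unsigned char)s[i] < 0x20) {
      return i;
    }
  }
  return len;
}
```

A strict parser could treat any return value smaller than `len` as a parse error at that offset.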

json.h behaves as follows:

These are expected to return parsing errors:

These are expected to be parsed and return "\u00xx":
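As a rough sketch of what that serialization rule amounts to (this is not json.h's actual code, and `escape_control` is a hypothetical helper name), a writer could map each control character to its short escape where one exists and to `\u00xx` otherwise:

```c
#include <stdio.h>

/* Hypothetical helper, not part of json.h.
   Writes the JSON escape sequence for one control character (c < 0x20)
   into out, which must have room for at least 7 bytes ("\u00xx" plus NUL).
   Returns the number of characters written, excluding the terminator. */
static int escape_control(unsigned char c, char *out) {
  switch (c) {
  case '\b': return snprintf(out, 7, "\\b");
  case '\f': return snprintf(out, 7, "\\f");
  case '\n': return snprintf(out, 7, "\\n");
  case '\r': return snprintf(out, 7, "\\r");
  case '\t': return snprintf(out, 7, "\\t");
  default:   return snprintf(out, 7, "\\u%04x", (unsigned)c);
  }
}
```

The short forms are optional under RFC 8259; always emitting `\u00xx` would also be valid.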

Note that because of Section 9 "Parsers", json.h's parser is technically still adhering to the specification. So feel free to decide on the correct way to handle this.

A JSON parser MAY accept non-JSON forms or extensions.

sheredom commented 1 month ago

Nice summary, thanks! I think we'll fix this - seems worthwhile to err on the side of caution here.

I can take this change up if you wish, but happy to accept a PR if you'd rather do the coding!

j-moeller commented 1 month ago

Hi, sorry for the late reply. Unfortunately, I am not that familiar with the code base, so I think it would be better if you implemented the necessary changes.

sheredom commented 1 month ago

Totally fine. When I get the time I'll look into it.