wardi / jsonlines

Documentation for the JSON Lines text file format
http://jsonlines.org

If newline is escaped to \n in normal JSON how would you escape it in JSON lines? #35

Open matthewkrieger opened 5 years ago

matthewkrieger commented 5 years ago

Since a newline is escaped as \n in JSON strings, how do you escape it in the JSON Lines format, where the newline character (\n or \r\n) is the delimiter between objects?

wardi commented 5 years ago

This might need to be clarified in the docs. The separator between lines in JSON Lines is the single UTF-8 byte '\n', i.e. a UNIX-style newline character chr(10), not a literal backslash-n sequence.

Newlines within JSON strings are already escaped as backslash-n sequences, so they don't conflict with the separator. Whitespace within each JSON object may not include any UNIX-style newline characters chr(10), or it would conflict with the line separators.
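To illustrate the point above, here is a minimal sketch using Python's standard `json` module. The sample records are made up for illustration. It only shows that a serializer which escapes newlines inside strings never emits a raw chr(10), so splitting on chr(10) is safe.

```python
import json

# Hypothetical sample records; the first contains a real newline in a string value.
records = [{"text": "first line\nsecond line"}, {"text": "no newline here"}]

# json.dumps escapes the newline inside the string as the two characters
# backslash + n, and its default compact output adds no raw newlines.
lines = [json.dumps(record) for record in records]
assert all("\n" not in line for line in lines)

# Join the records with a real newline character, chr(10), as the separator.
payload = "\n".join(lines) + "\n"

# Splitting on chr(10) yields exactly one JSON document per non-empty line.
decoded = [json.loads(line) for line in payload.split("\n") if line]
assert decoded == records
```

Note that pretty-printed output (for example `json.dumps(record, indent=2)`) does contain raw newlines and so would not be valid as a single JSON Lines record.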

Is that clear? Can you suggest a change to the wording that makes sense?

matthewkrieger commented 5 years ago

Thanks for responding so quickly. One clarification before I propose some alternate wording: in item 3 at http://jsonlines.org you said, "...'\r\n' is also supported...". In JSON Lines format, would this be two UTF-8 bytes, a chr(13) followed by a chr(10)?

wardi commented 5 years ago

Yes. That was another way of saying that, as a side effect of the other rules, "you can use MSDOS-style newline characters as well".
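A rough sketch of why this falls out of the other rules, assuming a reader that splits only on chr(10) and a parser that follows the JSON grammar (as Python's `json.loads` does): the chr(13) left at the end of each line is ordinary JSON whitespace, so it is ignored. The sample payload is hypothetical.

```python
import json

# Hypothetical payload with MSDOS-style line endings: chr(13) followed by chr(10).
# In the second record, "\\n" in this Python literal is the escaped backslash-n
# sequence inside a JSON string, not a line separator.
payload = '{"a": 1}\r\n{"b": "x\\ny"}\r\n'

# Split on chr(10) only; each non-empty line still ends with chr(13).
raw_lines = [line for line in payload.split("\n") if line.strip()]
assert all(line.endswith("\r") for line in raw_lines)

# Carriage return is insignificant whitespace around a JSON value,
# so the parser accepts each line without any extra stripping.
decoded = [json.loads(line) for line in raw_lines]
assert decoded == [{"a": 1}, {"b": "x\ny"}]
```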

matthewkrieger commented 5 years ago

Ok, then it all makes total sense and is very clear. With respect to a possible change in wording, I might simply add another sentence to item 3 at http://jsonlines.org saying, "Note that the line separator is not represented as the string literal '\n' or '\r\n' but rather as the single UTF-8 byte chr(10), the Unix-style newline character, for '\n' or, in the case of '\r\n', as the two-byte sequence chr(13) immediately followed by chr(10). This ensures there is no conflict with the standard escaping for reserved JSON characters."