wardi / jsonlines

Documentation for the JSON Lines text file format
http://jsonlines.org

A clarification is required on BOM use #95

Closed Guba666 closed 2 weeks ago

Guba666 commented 2 weeks ago

Dear all, good day. I'm asking for a clarification on something that has cost me days of headaches: does the JSONL standard require a BOM at the very beginning of the file? I'm dealing with a poorly documented API, and it seems that while the file containing the requests must be encoded in UTF-8, the BOM preamble must not be included. In fact, my final test (I swear I tried so many combinations that you could call me the "madman of the API") was to create a new UTF8Encoding without the BOM preamble and then create a file, which the API in question accepted. So now, before I go and pull these guys' ears, I need to know whether the JSONL standard (assuming there is one) requires the BOM preamble or not. Unfortunately, and I admit my ignorance, I wasn't able to find any documentation about that. The JSON website does not include any documentation on the JSONL format. Can someone help me find the truth? Thanks in advance for any help you can provide.

wardi commented 2 weeks ago

JSONL has no requirement for a BOM. JSONL should only be encoded in UTF-8 so a BOM doesn't really make sense.

Guba666 commented 2 weeks ago

OK, but if a JSONL file is created by a common library object that takes a UTF-8 encoding, the BOM is added automatically. I'm using Delphi, and a statement like this: StreamWriter := TStreamWriter.Create(FileStream, TEncoding.UTF8); automatically adds the BOM preamble. Also, is a JSONL file in UTF-16 really such nonsense?

Guba666 commented 2 weeks ago

Not to mention that writing with a TStreamWriter is done via a call like StreamWriter.Write(Content); So it must be made clear to someone, somewhere, either that a BOM preamble shouldn't be included, or that the standard read/write routines must not add a BOM for the standard JSONL format.
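
For illustration only, here is a minimal Python sketch (not Delphi, and not taken from the jsonlines.org documentation) of the difference being discussed: the same UTF-8 content written with and without the BOM preamble. The function name write_jsonl and the sample records are hypothetical; the only library facts relied on are that Python's "utf-8" codec never emits a BOM, while "utf-8-sig" prepends the bytes EF BB BF.

```python
import codecs
import io
import json


def write_jsonl(stream, records):
    """Write records as JSON Lines to a binary stream, UTF-8 without a BOM.

    write_jsonl is a hypothetical helper for this sketch, not a library API.
    """
    for record in records:
        line = json.dumps(record, ensure_ascii=False)
        stream.write(line.encode("utf-8"))  # the "utf-8" codec adds no BOM
        stream.write(b"\n")                 # one JSON value per line


# Hypothetical sample data with non-ASCII characters.
records = [{"id": 1, "name": "café"}, {"id": 2, "name": "naïve"}]

buf = io.BytesIO()
write_jsonl(buf, records)
data = buf.getvalue()

# True only if the output starts with the UTF-8 BOM bytes EF BB BF.
has_bom = data.startswith(codecs.BOM_UTF8)

# For contrast: the "utf-8-sig" codec prepends the BOM, which is the
# behaviour a BOM-rejecting API would refuse.
with_sig = "x".encode("utf-8-sig")
```

The design point is simply to pick an encoder that is explicit about the preamble rather than relying on a default; whether a given Delphi or other runtime offers an equivalent switch depends on that runtime.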

wardi commented 2 weeks ago

https://datatracker.ietf.org/doc/html/rfc8259#section-8.1

JSON text exchanged between systems that are not part of a closed ecosystem MUST be encoded using UTF-8. [...] Implementations MUST NOT add a byte order mark (U+FEFF) to the beginning of a networked-transmitted JSON text.
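
The same section of RFC 8259 also says that parsers MAY ignore a byte order mark rather than treating it as an error. A minimal Python sketch of a tolerant reader along those lines follows; strip_utf8_bom and parse_jsonl are hypothetical names, and skipping blank lines is an assumption of this sketch, not something the thread or the RFC prescribes.

```python
import codecs
import json


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def parse_jsonl(data: bytes):
    """Parse JSON Lines bytes into a list of Python values.

    A leading BOM is tolerated and dropped. Lines are split on "\n" only,
    because str.splitlines() would also split on characters such as U+2028
    that may legally appear inside a JSON string.
    """
    text = strip_utf8_bom(data).decode("utf-8")
    # Assumption of this sketch: blank lines (e.g. after a trailing
    # newline) are skipped rather than reported as errors.
    return [json.loads(line) for line in text.split("\n") if line.strip()]


# Hypothetical input: a two-line JSONL file that was written with a BOM.
bom_file = codecs.BOM_UTF8 + b'{"a": 1}\n{"b": 2}\n'
parsed = parse_jsonl(bom_file)
```

A strict consumer could instead raise an error when the BOM is present, which appears to be what the API described at the top of this thread does.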

wardi commented 2 weeks ago

I'm happy to accept a PR that makes it clear that BOMs must not be included in JSONL files.

Guba666 commented 2 weeks ago

Dear Wardi, sign me up for not including any BOM. But please do convince someone in charge of this to include this specification in some documentation readable by anyone. Thanks for your support.