turicas / rows

A common, beautiful interface to tabular data, no matter the format
GNU Lesser General Public License v3.0

JSON lines format support #202

Open stummjr opened 8 years ago

stummjr commented 8 years ago

JSON-lines files consist of JSON objects separated by new lines. For example:

$ cat file.jsonl
{"msg": "hello world", "lang": "en"}
{"msg": "ola mundo", "lang": "pt"}
{"msg": "bla bla", "lang": "bla"}

This format is quite handy because it's easily "appendable", i.e., you can append new records to the file without breaking the format:

$ cat anotherfile.jsonl >> file.jsonl
$ cat file.jsonl
{"msg": "hello world", "lang": "en"}
{"msg": "ola mundo", "lang": "pt"}
{"msg": "bla bla", "lang": "bla"}
{"msg": "meow", "lang": "cat"}
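
The same append-and-read cycle can be sketched in Python with only the standard library. This is a minimal sketch, not part of rows: the helper names `append_record` and `read_records` are hypothetical, and it assumes every line holds exactly one JSON value.

```python
import json


def append_record(path, record):
    """Append one record to a JSON Lines file as a single line (hypothetical helper)."""
    # Without an indent argument, json.dumps escapes newlines inside strings,
    # so the serialized record always fits on one line.
    with open(path, "a", encoding="utf-8") as fobj:
        fobj.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_records(path):
    """Yield one parsed value per non-empty line (hypothetical helper)."""
    with open(path, encoding="utf-8") as fobj:
        for line in fobj:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                yield json.loads(line)
```

Because appending never touches the existing lines, the file stays valid after every write, which is what makes the format convenient for logs and scraper output.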

That's quite different from the regular JSON format, where you typically have a single list of JSON objects: if you append something to it using shell redirection, you break the JSON document. For example:

$ cat file.json
[
    {"msg": "hello world", "lang": "en"},
    {"msg": "ola mundo", "lang": "pt"},
    {"msg": "bla bla", "lang": "bla"}
]

Now, if you append another JSON list to it, the file no longer contains a valid JSON document:

$ cat anotherfile.json >> file.json
$ cat file.json
[
    {"msg": "hello world", "lang": "en"},
    {"msg": "ola mundo", "lang": "pt"},
    {"msg": "bla bla", "lang": "bla"}
]
[
    {"msg": "meow", "lang": "cat"}
]
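
To make the breakage concrete, here is a small sketch using Python's standard `json` module. The helper names `is_valid_json` and `parse_lines` are hypothetical and only illustrate the difference: two concatenated lists are rejected as a single document, while line-delimited records can still be parsed one at a time.

```python
import json

# Two JSON lists back to back, as produced by appending one file to another.
concatenated = (
    '[{"msg": "hello world", "lang": "en"}]\n'
    '[{"msg": "meow", "lang": "cat"}]\n'
)


def is_valid_json(text):
    """Return True if the whole text parses as one JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def parse_lines(text):
    """Parse each non-empty line as its own JSON value."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]
```

Here `is_valid_json(concatenated)` returns `False` because the parser finds extra data after the first list, whereas `parse_lines` handles a JSON Lines string without needing the file to be a single document.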

JSON lines is also one of the output formats of the Scrapy project, and it's popular among its users.

Thoughts on adding a JSON lines plugin to rows?
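
As a rough idea of what such a plugin would need to do, the sketch below converts between JSON lines and a simple header-plus-rows table. It does not use or reproduce the rows API: `jsonl_to_table` and `table_to_jsonl` are hypothetical names, and it assumes every line is a JSON object, with missing keys filled in as `None`.

```python
import json


def jsonl_to_table(lines):
    """Turn an iterable of JSON Lines strings into (field_names, table_rows)."""
    records = [json.loads(line) for line in lines if line.strip()]

    # Collect field names in first-seen order, since records may differ in keys.
    field_names = []
    for record in records:
        for key in record:
            if key not in field_names:
                field_names.append(key)

    # Missing keys become None so that every row has the same width.
    table_rows = [
        tuple(record.get(name) for name in field_names) for record in records
    ]
    return field_names, table_rows


def table_to_jsonl(field_names, table_rows):
    """Yield one JSON-encoded line (without the newline) per table row."""
    for row in table_rows:
        yield json.dumps(dict(zip(field_names, row)), ensure_ascii=False)
```

Taking the union of keys is one possible design choice; a real plugin might instead require a fixed schema or detect field types.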

(I already started doing it, but I have no time to do it today, so feel free to tackle it if you want. 😃)

turicas commented 7 years ago

Sorry, I only noticed today that I hadn't answered your question. I think it would be a GREAT idea to support jsonl! Could you implement it?

flaviocpontes commented 6 years ago

I started working on this issue at pyse, but didn't manage to finish it then. I'll hack on it. JSON lines is the default format of TinyDB too.