kindly / flatterer

Opinionated JSON to CSV/XLSX/SQLITE/PARQUET converter. Flattens JSON fast.
https://flatterer.opendata.coop
MIT License
182 stars 7 forks source link

RuntimeError if using ndjson file that contains blank lines #35

Closed jpmckinney closed 1 year ago

jpmckinney commented 1 year ago

There is some discussion here about which line-delimited JSON formats allow blank (or whitespace only) lines (https://github.com/ndjson/ndjson-spec/issues/8), but at any rate it would be nice to skip the lines.

jpmckinney commented 1 year ago

A check like if !string.as_bytes().iter().all(u8::is_ascii_whitespace) { is probably fastest.

kindly commented 1 year ago

Ideally trying and parse the JSON, and on failure, do the check, as that would mean only a slowdown on cases where there were blank lines. I will investigate that.

kindly commented 1 year ago

This is now done: https://github.com/kindly/libflatterer/commit/c85cc8b5383bdcbbe629fa4d4952f488366a7cbf

If parsing failes it checks for empty line and ignores it if it is. Will be in next release.