manojkarthick / pqrs

Command line tool for inspecting Parquet files
Apache License 2.0

Add support for nested fields in CSV by encoding them as JSON #54

Open ttencate opened 2 months ago

ttencate commented 2 months ago

Great tool, glad I found it, because I almost started writing something like this myself!

One thing I ran into, though... nested types can't be exported to CSV.

Error: ArrowReadWriteError(CsvError("Nested type List(Field { name: \"item\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }) is not supported in CSV"))

Sometimes we just don't care about the exact format, or don't even care about this particular column, and just want to load the dang thing into a spreadsheet. Encoding non-primitive column types as JSON accomplishes just that, and it also happens to be unambiguous and therefore possibly even useful.
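
To make the idea concrete, here is a rough sketch of the behaviour I have in mind. It is written in Python with only the standard library, not against pqrs's actual Rust code, and the names `encode_cell` and `rows_to_csv` are made up for illustration. Primitive values pass through untouched, while lists and structs are serialized to JSON text and then quoted by the CSV writer as usual.

```python
import csv
import io
import json


def encode_cell(value):
    """Hypothetical helper: nested values become JSON text, primitives pass through."""
    if isinstance(value, (list, dict)):
        # Compact separators keep the cell short; the CSV writer handles quoting.
        return json.dumps(value, separators=(",", ":"))
    return value


def rows_to_csv(rows, fieldnames):
    """Hypothetical helper: write rows (dicts) to CSV text, JSON-encoding nested cells."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({name: encode_cell(row.get(name)) for name in fieldnames})
    return buf.getvalue()


# Stand-in data for a Parquet file with a list-of-strings column.
rows = [
    {"id": 1, "tags": ["a", "b"]},
    {"id": 2, "tags": []},
    {"id": 3, "tags": None},
]
output = rows_to_csv(rows, ["id", "tags"])
print(output)
```

A consumer that does care about the column can recover the original value by running the cell through a JSON parser. One open question with this sketch is that a null list and an empty string both end up as an empty cell, so null handling would need a deliberate choice.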