Open asfimport opened 5 years ago
Joris Van den Bossche / @jorisvandenbossche:
> I have JSON data where the columnar (line-delimited) part is in a `data` subkey:
Note that the `data` subpart is not line-delimited, but a comma-delimited JSON array. So that's a first thing that would be good to support.
Some additional resources that might be useful: pandas supports many JSON layouts, called "orients"; see the overview table at http://pandas.pydata.org/pandas-docs/version/0.24/user_guide/io.html#reading-json (disclaimer: I don't know how common the different formats are, so it doesn't necessarily make sense to copy them all from pandas).
One of the formats is the JSON Table Schema (https://frictionlessdata.io/specs/table-schema/), which is a JSON file with `'metadata'` and `'data'` top-level keys, where `'data'` then consists of comma-delimited records (so very similar in structure to what @dhirschfeld showed above).
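Until the reader supports this directly, one possible workaround is to split the document in plain Python and re-serialize the record array as line-delimited JSON, which is the layout `pyarrow.json.read_json` already accepts. The sketch below is only an illustration under a few assumptions: the whole document fits in memory, the top level is a JSON object, and the records sit in an array under a single key. The helper names `split_columnar` and `to_arrow_table` and the sample document are made up for this example and are not part of Arrow.

```python
import io
import json


def split_columnar(text, key="data"):
    """Split a JSON document into (rest, ndjson_bytes).

    `rest` is a plain dict holding everything except `key`;
    `ndjson_bytes` holds the records under `key`, one JSON object per line.
    Hypothetical helper, not an Arrow API.
    """
    doc = json.loads(text)
    records = doc.pop(key)  # raises KeyError if the key is missing
    ndjson = "\n".join(json.dumps(record) for record in records)
    return doc, ndjson.encode("utf-8")


def to_arrow_table(ndjson_bytes):
    """Parse line-delimited JSON bytes with the existing Arrow reader."""
    # Imported lazily so the splitting step works without pyarrow installed.
    import pyarrow.json as pa_json

    return pa_json.read_json(io.BytesIO(ndjson_bytes))


# Made-up document in the shape described in this issue.
example = (
    '{"metadata": {"source": "sensor-1"}, '
    '"data": [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]}'
)

rest, ndjson_bytes = split_columnar(example)
# `rest` keeps the non-columnar part as a Python dict;
# to_arrow_table(ndjson_bytes) would turn the records into a pyarrow.Table.
```

This round-trips the records through Python objects, so it gives up most of the speed of the native parser; it only shows the shape of the result being asked for (a dict for the metadata, a table for the records).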
I have JSON data where the columnar (line-delimited) part is in a `data` subkey:

It would be good if the Arrow JSON parser could allow specifying where the columnar data is stored.
Since the `metadata` is also important to me, it would be even better if the rest of the JSON could be returned as a Python dict, with only the specified keys parsed as Arrow tables - e.g.

Reporter: Dave Hirschfeld / @dhirschfeld
Note: This issue was originally created as ARROW-5568. Please see the migration documentation for further details.