Open kszlim opened 2 months ago
I think implementing the equivalent of https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.flatten for RecordBatch makes sense to me.
If implemented similarly to pandas json_normalize, it could take a max depth option, which would make it strictly more powerful and flexible than pyarrow.Table.flatten.
Hi, do you all mind if I give this a shot?
Go ahead!
take
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I want to write flattened Parquet files, since not every tool supports structs.
Describe the solution you'd like
Recursively flatten all struct columns in a RecordBatch (similar to pandas json_normalize). Alternatively, a solution via DataFusion might be acceptable.
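To make the requested behavior concrete, here is a minimal pure-Python sketch of the semantics, not an arrow-rs or pyarrow API. It assumes nested dicts as a stand-in for struct columns and plain lists as leaf columns. The function name `flatten_columns`, the `max_depth` parameter, and the `.` separator are hypothetical choices for illustration (the separator mirrors the `parent.child` naming that pyarrow.Table.flatten produces).

```python
from typing import Any, Dict, Optional


def flatten_columns(
    columns: Dict[str, Any],
    max_depth: Optional[int] = None,
    sep: str = ".",
) -> Dict[str, Any]:
    """Flatten struct-like columns into top-level columns.

    A dict value models a struct column; any other value is a leaf column.
    max_depth is the number of struct levels to unnest (None means all).
    Both the model and the parameter names are assumptions for illustration.
    """
    flat: Dict[str, Any] = {}

    def visit(name: str, value: Any, depth: int) -> None:
        can_descend = max_depth is None or depth < max_depth
        if isinstance(value, dict) and can_descend:
            # Unnest one struct level: each child becomes "parent.child".
            for child_name, child in value.items():
                visit(f"{name}{sep}{child_name}", child, depth + 1)
        else:
            # Leaf column, or a struct kept as-is because the depth limit was hit.
            flat[name] = value

    for name, value in columns.items():
        visit(name, value, 0)
    return flat


# Toy "batch": `pos` is a struct column containing another struct, `meta`.
batch = {
    "id": [1, 2],
    "pos": {"x": [0.5, 1.5], "meta": {"src": ["a", "b"]}},
}

full = flatten_columns(batch)                  # unnest every struct level
one_level = flatten_columns(batch, max_depth=1)  # unnest only the top level
```

With `max_depth=1` the result matches a single flatten pass (`pos.meta` stays a struct), while the default unnests everything in one call, which is the extra flexibility suggested in the comments.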
Describe alternatives you've considered
Running pyarrow.Table.flatten in a loop until no top-level struct columns remain, though this requires going through Python.