Open alamb opened 2 years ago
This comparison, https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/src/physical_plan/file_format/mod.rs#L238, is one cause of the errors related to column projection. It compares the complete enum, so it fails when the field order differs.
Arrow has a method to compare data types (https://github.com/apache/arrow-rs/blob/master/arrow/src/datatypes/datatype.rs#L674). I think this method should be made public and used in the comparison above.
Currently DataFusion uses match_field_names (default true), https://github.com/apache/arrow-rs/blob/master/arrow/src/record_batch.rs#L153, which causes the error.
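To illustrate the failure mode described above, here is a minimal sketch using stand-in types. The `DataType` and `Field` definitions and the `equal_ignoring_field_order` helper are hypothetical and are not the arrow-rs definitions or API. They only show how a derived whole-enum `==` can reject two struct types that differ in field order alone, while a looser comparison accepts them. The helper assumes field names are unique within a struct.

```rust
// Stand-in types for illustration only: NOT the arrow-rs definitions.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

// Small constructor to keep the example short.
fn field(name: &str, data_type: DataType) -> Field {
    Field { name: name.to_string(), data_type }
}

/// Hypothetical helper: returns true when two types have the same shape,
/// ignoring the order of struct fields. Assumes field names are unique
/// within each struct.
fn equal_ignoring_field_order(a: &DataType, b: &DataType) -> bool {
    match (a, b) {
        (DataType::Struct(fa), DataType::Struct(fb)) => {
            fa.len() == fb.len()
                && fa.iter().all(|x| {
                    fb.iter().any(|y| {
                        x.name == y.name
                            && equal_ignoring_field_order(&x.data_type, &y.data_type)
                    })
                })
        }
        // Non-struct types fall back to the derived comparison.
        _ => a == b,
    }
}

fn main() {
    let a = DataType::Struct(vec![
        field("id", DataType::Int32),
        field("name", DataType::Utf8),
    ]);
    let b = DataType::Struct(vec![
        field("name", DataType::Utf8),
        field("id", DataType::Int32),
    ]);

    // The derived comparison is order-sensitive because Vec equality is positional.
    println!("derived ==: {}", a == b);
    println!("order-insensitive: {}", equal_ignoring_field_order(&a, &b));
}
```

Whether a looser comparison like this is the right fix depends on how the downstream code uses the schema, so treat this as a sketch of the idea rather than a proposed patch.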
Thanks for the investigation @nl5887 -- that definitely sounds plausible. Feel free to file a PR with the proposed changes -- we would love to review them
This one is also related: https://github.com/apache/arrow-datafusion/issues/2581
Reminder to write docs: #1222
Potential to add to list #7012
We are starting to make progress on struct support: there is a PR up to support named_struct (https://github.com/apache/arrow-datafusion/pull/9743), and work is afoot to support nicer literal syntax: https://github.com/apache/arrow-datafusion/issues/9820 🚀
Hi, i think unnest support for struct can be an item in this epic right?
That would make sense to me -- is there a ticket that describes what this means?
i created a ticket: https://github.com/apache/datafusion/issues/10264
Thank you. I added this to the list in the ticket description
I added an issue to support recursive unnest: https://github.com/apache/datafusion/issues/10660, I think it should belong to this epic
Added
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
This ticket is designed to capture the work needed to properly support Arrow Struct types in DataFusion.
https://arrow.apache.org/datafusion/user-guide/sql/sql_status.html says that nested types are not supported. They are not fully supported, but parts of the support are already present, such as a way to serialize them via ArrowWriter and the field["nested_field"] syntax.
Describe the solution you'd like
Research, and describe / implement what else remains for proper support.
Array (ListArray) support:
Map (MapArray) support:
Struct (StructArray) support: 10207, 11204
Union (UnionArray) support: 10206
Other
Known issues so far: