apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Scalar::List does not encapsulate all information from ListArray #351

Open jorgecarleitao opened 3 years ago

jorgecarleitao commented 3 years ago

IMO, the roundtrip ListArray -> Scalar::List -> ListArray is currently lossy, because Scalar::List does not encapsulate everything stored in the ListArray.

Examples:
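To make the kind of loss concrete, here is a minimal, self-contained sketch. It does not use the real arrow or DataFusion types: `Field`, `ListArray`, `Scalar`, `to_scalar`, and `to_array` are all hypothetical toy stand-ins, and the sketch assumes a scalar that keeps only the element values of a row. Under that assumption, the child field's name and nullability cannot survive the roundtrip.

```rust
// Toy stand-ins for illustration only; these are NOT the arrow or DataFusion types.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

#[derive(Debug, Clone, PartialEq)]
struct ListArray {
    field: Field,                // metadata of the list's child field
    rows: Vec<Option<Vec<i32>>>, // one optional list of values per row
}

// Hypothetical scalar that keeps only the element values of one row.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    List(Option<Vec<i32>>),
}

fn to_scalar(array: &ListArray, row: usize) -> Scalar {
    Scalar::List(array.rows[row].clone())
}

// Rebuilding has to guess the field metadata, because the scalar never stored it.
fn to_array(scalar: &Scalar) -> ListArray {
    let Scalar::List(values) = scalar;
    ListArray {
        field: Field { name: "item".to_string(), nullable: true },
        rows: vec![values.clone()],
    }
}

fn main() {
    let original = ListArray {
        field: Field { name: "element".to_string(), nullable: false },
        rows: vec![Some(vec![1, 2, 3])],
    };
    let rebuilt = to_array(&to_scalar(&original, 0));
    assert_eq!(rebuilt.rows, original.rows);   // the values survive
    assert_ne!(rebuilt.field, original.field); // the field metadata does not
    println!("original field: {:?}, rebuilt field: {:?}", original.field, rebuilt.field);
}
```

The defaults chosen in `to_array` (`"item"`, nullable) are arbitrary; whatever defaults are picked, any array built with different field metadata will not compare equal after the roundtrip.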

(new) suggested signature:

Scalar::List(Option<Arc<dyn Array>>, DataType)

With this, the second argument stores the original data type, which makes it possible to recover all the information from the ListArray, and the first argument is easy to recover and build in both directions:
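A rough sketch of how a scalar shaped like the suggested signature could roundtrip without loss. Everything here is a toy stand-in defined locally (`DataType`, `Array`, `Int32Array`, `ListArray`, `to_scalar`, `to_array`); none of it is the actual arrow or DataFusion API, and the layout of the toy `DataType::List` variant is an assumption made for brevity.

```rust
use std::sync::Arc;

// Toy stand-ins for illustration only; these are NOT the arrow or DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    // Child field name, child type, child nullability.
    List(String, Box<DataType>, bool),
}

trait Array: std::fmt::Debug {
    fn len(&self) -> usize;
}

#[derive(Debug)]
struct Int32Array(Vec<i32>);

impl Array for Int32Array {
    fn len(&self) -> usize {
        self.0.len()
    }
}

// Same shape as the suggested signature: the values of one row plus the full list type.
#[derive(Debug, Clone)]
enum Scalar {
    List(Option<Arc<dyn Array>>, DataType),
}

// Toy list column: its full data type and one optional values array per row.
#[derive(Debug, Clone)]
struct ListArray {
    data_type: DataType,
    rows: Vec<Option<Arc<dyn Array>>>,
}

fn to_scalar(array: &ListArray, row: usize) -> Scalar {
    Scalar::List(array.rows[row].clone(), array.data_type.clone())
}

// Nothing has to be guessed: the data type travels with the scalar.
fn to_array(scalar: &Scalar) -> ListArray {
    let Scalar::List(values, data_type) = scalar;
    ListArray { data_type: data_type.clone(), rows: vec![values.clone()] }
}

fn main() {
    let data_type = DataType::List("element".to_string(), Box::new(DataType::Int32), false);
    let values: Arc<dyn Array> = Arc::new(Int32Array(vec![1, 2, 3]));
    let original = ListArray { data_type, rows: vec![Some(values)] };
    let rebuilt = to_array(&to_scalar(&original, 0));
    assert_eq!(rebuilt.data_type, original.data_type); // field metadata preserved
    assert_eq!(rebuilt.rows[0].as_ref().map(|a| a.len()), Some(3));
}
```

Cloning the `Arc` only bumps a reference count, so in this sketch the conversion in either direction does not copy the underlying values.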

jorgecarleitao commented 3 years ago

@andygrove, I was trying to address this in Ballista, but I am struggling to encapsulate a ScalarValue that depends on an Array. How are we declaring Arrow arrays in the protobuf? I can't find any reference to them.