apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

[VL] Support Complex Datatype in Parquet Write #6446

Open surnaik opened 1 month ago

surnaik commented 1 month ago

Description

Currently we check the data types and do not push the write to Velox if a complex data type such as array or map is present. Velox supports these data types, so I think this can be supported. Is there a reason I'm not aware of why these types are currently not supported?
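To make the behavior in question concrete, here is a minimal sketch of that kind of type gate. It is only an illustration under assumptions: the type names, the `COMPLEX_TYPE_NAMES` set, and the `can_offload_write` helper are hypothetical and are not Gluten's actual validation code.

```python
# Hypothetical, simplified model of the check described above.
# Column types are plain strings here; this is NOT Gluten's real API.
COMPLEX_TYPE_NAMES = {"array", "map"}  # illustrative set, taken from the examples in this issue


def is_complex(type_name: str) -> bool:
    """Return True if the (hypothetical) type name is a complex type."""
    return type_name in COMPLEX_TYPE_NAMES


def can_offload_write(schema: dict) -> bool:
    """Offload the Parquet write only when no column has a complex type.

    `schema` maps column name -> type name.
    """
    return not any(is_complex(t) for t in schema.values())


flat_schema = {"id": "int", "name": "string"}
nested_schema = {"id": "int", "tags": "array"}

print(can_offload_write(flat_schema))    # True
print(can_offload_write(nested_schema))  # False
```

With a gate like this, a single array or map column is enough to keep the whole write on the JVM side, which is what this issue asks to relax.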

cc: @PHILO-HE @ulysses-you

ulysses-you commented 1 month ago

cc @JkSelf

JkSelf commented 1 month ago

Velox Parquet write doesn't flatten constant/dictionary encodings for complex types. You can refer here for more discussion.
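For readers unfamiliar with the term, "flattening" here means materializing an encoded column into one value per row. The sketch below is a conceptual illustration only, using plain Python lists and tuples; the function names are hypothetical and this is not Velox's API or implementation.

```python
# Conceptual illustration only; names and data layout are assumptions.

def flatten_dictionary(values, indices):
    """Expand a dictionary-encoded column: row i holds values[indices[i]]."""
    return [values[i] for i in indices]


def flatten_constant(value, size):
    """Expand a constant-encoded column: every one of `size` rows holds `value`."""
    return [value for _ in range(size)]


# Array-typed rows are modeled as tuples.
dictionary_values = [(1, 2), (3,)]
row_indices = [1, 0, 1]

print(flatten_dictionary(dictionary_values, row_indices))  # [(3,), (1, 2), (3,)]
print(flatten_constant((1, 2), 3))                         # [(1, 2), (1, 2), (1, 2)]
```

A writer that expects one materialized value per row needs this step first. The comment above says it is missing for complex types in the Velox Parquet write path.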

JkSelf commented 1 month ago

Pending https://github.com/facebookincubator/velox/pull/9406 to fix.