apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[Gandiva][UDF] Support complex datatype for UDF return type. #24588

Open asfimport opened 4 years ago

asfimport commented 4 years ago

Is it possible to return a complex data type from a UDF, such as a vector or even a dictionary? I checked https://github.com/apache/arrow/blob/master/cpp/src/gandiva/precompiled/types.h and found that the types used there are all basic data types.

Reporter: ZMZ91 / @ZMZ91

Note: This issue was originally created as ARROW-8405. Please see the migration documentation for further details.

asfimport commented 4 years ago

Pindikura Ravindra / @pravindra: Gandiva doesn't support complex types yet.

 

  1. For output, the following will need to be fixed:
     * allocating the output vector for project
     * populating the output vector in codegen
  2. For input, the following will need to be fixed:
     * loading an entry from the input vector in codegen. This is currently implemented as a visitor, but the visitor only supports primitive types.

 

It will be easier to start by adding support for primitive fields inside complex types (e.g., an integer field inside a struct type).
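To illustrate the suggestion, here is a minimal sketch of a type visitor that either rejects struct types (the primitive-only behaviour described above) or descends into them to reach their primitive fields. All of the classes here (`DataType`, `Int32Type`, `StructType`, `Field`, `LeafCollector`) are hypothetical, simplified stand-ins written for this example; they are not Arrow's or Gandiva's actual classes, and the real codegen visitor may be structured quite differently.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified type model (NOT Arrow's or Gandiva's classes).
struct Int32Type;
struct StructType;

struct TypeVisitor {
  virtual ~TypeVisitor() = default;
  virtual bool Visit(const Int32Type& type) = 0;
  virtual bool Visit(const StructType& type) = 0;
};

struct DataType {
  virtual ~DataType() = default;
  virtual bool Accept(TypeVisitor& visitor) const = 0;
};

struct Field {
  std::string name;
  std::shared_ptr<DataType> type;
};

struct Int32Type : DataType {
  bool Accept(TypeVisitor& visitor) const override;
};

struct StructType : DataType {
  std::vector<Field> fields;
  bool Accept(TypeVisitor& visitor) const override;
};

bool Int32Type::Accept(TypeVisitor& visitor) const { return visitor.Visit(*this); }
bool StructType::Accept(TypeVisitor& visitor) const { return visitor.Visit(*this); }

// Collects dotted paths to primitive leaves, e.g. "point.x".
// With allow_struct == false it mimics a primitive-only visitor.
class LeafCollector : public TypeVisitor {
 public:
  LeafCollector(std::string root, bool allow_struct)
      : prefix_(std::move(root)), allow_struct_(allow_struct) {}

  bool Visit(const Int32Type&) override {
    leaves_.push_back(prefix_);  // a primitive field that could be loaded
    return true;
  }

  bool Visit(const StructType& type) override {
    if (!allow_struct_) return false;  // complex types are unsupported
    const std::string saved = prefix_;
    for (const Field& field : type.fields) {
      assert(field.type != nullptr);
      prefix_ = saved + "." + field.name;
      if (!field.type->Accept(*this)) return false;
    }
    prefix_ = saved;
    return true;
  }

  const std::vector<std::string>& leaves() const { return leaves_; }

 private:
  std::string prefix_;
  bool allow_struct_;
  std::vector<std::string> leaves_;
};
```

In this sketch, visiting a struct named `point` with two integer fields `x` and `y` fails when `allow_struct` is false, and collects the paths `point.x` and `point.y` when it is true.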

asfimport commented 4 years ago

ZMZ91 / @ZMZ91: Thanks @pravindra.

asfimport commented 4 years ago

ZMZ91 / @ZMZ91: Hi @pravindra, besides the code you pointed out, I see that almost all Gandiva UDFs are implemented inside extern "C" blocks. So if I add a function returning a map<string, string>, its return type would not be compatible with extern "C". Is it possible to write a UDF outside extern "C"?
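For context on the question, `extern "C"` only constrains a function's linkage and, in practice, its signature. The function body is still ordinary C++ and can use types such as `std::map` internally. The sketch below shows one common general pattern, which this thread does not confirm as Gandiva's approach: keep the C++ container behind the boundary and pass only primitive and pointer types across it. The names `parse_pairs` and `lookup_value` and the `key=value;key=value` encoding are hypothetical, invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>

// Hypothetical helper with ordinary C++ linkage: it may return any C++ type.
// Parses text of the form "k1=v1;k2=v2" into a map.
static std::map<std::string, std::string> parse_pairs(const std::string& text) {
  std::map<std::string, std::string> out;
  std::size_t pos = 0;
  while (pos < text.size()) {
    std::size_t end = text.find(';', pos);
    if (end == std::string::npos) end = text.size();
    const std::size_t eq = text.find('=', pos);
    if (eq != std::string::npos && eq < end) {
      out[text.substr(pos, eq - pos)] = text.substr(eq + 1, end - eq - 1);
    }
    pos = end + 1;
  }
  return out;
}

// Hypothetical entry point with C linkage: only integers and raw pointers
// appear in the signature. The value for `key` is copied into the
// caller-provided buffer `out`. Returns the value length, or -1 if the key
// is missing or the buffer is too small.
extern "C" int32_t lookup_value(const char* text, int32_t text_len,
                                const char* key, int32_t key_len,
                                char* out, int32_t out_capacity) {
  assert(text != nullptr && key != nullptr && out != nullptr);
  assert(text_len >= 0 && key_len >= 0 && out_capacity >= 0);
  const auto pairs = parse_pairs(std::string(text, text_len));
  const auto it = pairs.find(std::string(key, key_len));
  if (it == pairs.end()) return -1;
  const int32_t len = static_cast<int32_t>(it->second.size());
  if (len > out_capacity) return -1;
  std::memcpy(out, it->second.data(), static_cast<std::size_t>(len));
  return len;
}
```

The map never crosses the `extern "C"` boundary here. Whether a similar flattening would suit a Gandiva UDF depends on the output-vector support discussed earlier in the thread.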