apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Support Multi-Column Producing Function Implementation #2343

Open Ted-Jiang opened 2 years ago

Ted-Jiang commented 2 years ago

I think we will need to introduce a new type of function to cover inline since it is not a scalar function.

I don't have a good name for it yet but I think we need something like this?

pub type MultiColumnProducingFunctionImplementation =
    Arc<dyn Fn(&[ColumnarValue]) -> Result<Vec<ColumnarValue>> + Send + Sync>;

Originally posted by @andygrove in https://github.com/apache/arrow-datafusion/issues/2330#issuecomment-1109238122
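
To make the proposed signature concrete, here is a minimal, std-only sketch of how a function of that shape might look. The `ColumnarValue` enum and `Result` alias below are simplified stand-ins invented for this illustration, not DataFusion's real types, and `split_key_value` is a hypothetical example function rather than anything proposed in the thread.

```rust
use std::sync::Arc;

// Hypothetical stand-in for DataFusion's `ColumnarValue`; the real type
// wraps Arrow arrays and scalar values.
#[derive(Debug, Clone, PartialEq)]
pub enum ColumnarValue {
    Int64(Vec<i64>),
    Utf8(Vec<String>),
}

// Simplified error type, used for this sketch only.
pub type Result<T> = std::result::Result<T, String>;

// Same shape as the alias proposed above: many columns in, many columns out.
pub type MultiColumnProducingFunctionImplementation =
    Arc<dyn Fn(&[ColumnarValue]) -> Result<Vec<ColumnarValue>> + Send + Sync>;

// Hypothetical example: split one "key=value" string column into a key
// column and a value column with the same number of rows.
pub fn split_key_value() -> MultiColumnProducingFunctionImplementation {
    Arc::new(|args: &[ColumnarValue]| -> Result<Vec<ColumnarValue>> {
        match args {
            [ColumnarValue::Utf8(rows)] => {
                let mut keys = Vec::with_capacity(rows.len());
                let mut values = Vec::with_capacity(rows.len());
                for row in rows {
                    let (k, v) = row
                        .split_once('=')
                        .ok_or_else(|| format!("missing '=' in {row:?}"))?;
                    keys.push(k.to_string());
                    values.push(v.to_string());
                }
                Ok(vec![ColumnarValue::Utf8(keys), ColumnarValue::Utf8(values)])
            }
            _ => Err("expected exactly one Utf8 column".to_string()),
        }
    })
}
```

Note that in this sketch the row count is unchanged; only the number of output columns differs from a scalar function.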

Ted-Jiang commented 2 years ago

I think we can call it TableFunctionImplementation, following Snowflake's table functions: https://docs.snowflake.com/en/sql-reference/functions-table.html

alamb commented 2 years ago

I think a multi-column-producing function is different from a "table function".

A table function returns (potentially) multiple columns and multiple rows, whereas I think this ticket is only about returning multiple columns.

The natural thing in my mind for this use case would be to return a single column of DataType::Struct (or DataType::List).

However, DataFusion's support for such compound data types is in need of work -- see https://github.com/apache/arrow-datafusion/issues/2326
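
As a rough illustration of the single-struct-column idea, the sketch below packs several equal-length columns into one struct-like column, so the function still returns exactly one value. The `Column` enum and `pack_as_struct` helper are hypothetical, std-only stand-ins; real code would presumably build an Arrow struct array with a `DataType::Struct` type instead.

```rust
// Hypothetical stand-in for an Arrow column; not DataFusion's or Arrow's API.
#[derive(Debug, Clone, PartialEq)]
pub enum Column {
    Int64(Vec<i64>),
    Utf8(Vec<String>),
    // Named child columns, all expected to have the same number of rows.
    Struct(Vec<(String, Column)>),
}

impl Column {
    // Number of rows in the column; a struct reports its first child's length.
    pub fn len(&self) -> usize {
        match self {
            Column::Int64(v) => v.len(),
            Column::Utf8(v) => v.len(),
            Column::Struct(fields) => fields.first().map_or(0, |(_, c)| c.len()),
        }
    }
}

// Pack several output columns into a single struct column, rejecting
// inputs whose row counts disagree.
pub fn pack_as_struct(fields: Vec<(String, Column)>) -> Result<Column, String> {
    if let Some((_, first)) = fields.first() {
        let expected = first.len();
        for (name, col) in &fields {
            if col.len() != expected {
                return Err(format!(
                    "field {name:?} has {} rows, expected {expected}",
                    col.len()
                ));
            }
        }
    }
    Ok(Column::Struct(fields))
}
```

With this shape, a multi-column result fits the existing one-column-out scalar function contract, at the cost of depending on how well struct columns are supported downstream.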