A Table Function is a function which returns a relation, as opposed to a scalar function which returns a single value.
A Polymorphic Table Function (PTF) is a Table Function which fulfils at least one of the following conditions:
- the row type of the returned table is not known at the time when the function is created
- the function takes a table as an argument, whose row type is not known when the function is created
In particular, the output row type of a Polymorphic Table Function may depend on the row type of an arbitrary table passed as an argument.
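To make the definition concrete, here is a minimal sketch of a function whose returned row type is only known once the input table is known. All names (`PolymorphicRowTypeSketch`, `analyzeReturnedType`, the `row_id` column) are hypothetical and are not part of the Trino SPI; a row type is modelled simply as an ordered map from column name to type name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: none of these names exist in Trino.
final class PolymorphicRowTypeSketch
{
    private PolymorphicRowTypeSketch() {}

    // Derives the returned row type from the row type of the input table:
    // all input columns are passed through and one bigint column is appended.
    // The result cannot be declared when the function is created, because
    // it depends on the table passed at invocation time.
    static Map<String, String> analyzeReturnedType(Map<String, String> inputRowType, String newColumn)
    {
        if (inputRowType.containsKey(newColumn)) {
            throw new IllegalArgumentException("Column already exists: " + newColumn);
        }
        Map<String, String> output = new LinkedHashMap<>(inputRowType);
        output.put(newColumn, "bigint");
        return output;
    }

    public static void main(String[] args)
    {
        Map<String, String> orders = new LinkedHashMap<>();
        orders.put("orderkey", "bigint");
        orders.put("status", "varchar");
        // Prints {orderkey=bigint, status=varchar, row_id=bigint}
        System.out.println(analyzeReturnedType(orders, "row_id"));
    }
}
```

Calling the same function with a differently shaped table yields a different returned row type, which is what makes the function polymorphic.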
Scope of the task
The scope of the task is to provide full support for Table Functions, including Polymorphic Table Functions.
Subtasks
[ ] Add language support for PTF invocation
  [ ] grammar (in review)
  [ ] AST representation (in review)
  [ ] tests (in review)
[ ] Add SPI support for declaring PTFs by plugins / connectors
  [ ] The main interface: ConnectorTableFunction (in review)
  [ ] analyze(): a method for required and custom analysis (in review)
  [ ] Analysis: a class to provide the required and custom analysis results to Trino Analyzer (in review)
  [ ] InvocationHandle: an interface for passing the custom analysis results (in review)
  [ ] fulfil(): a method to provide the function logic to Trino
  [ ] Classes representing argument declarations, and returned type declaration (in review)
  [ ] Classes representing the actual passed arguments (in review)
[ ] Add mechanism for registering PTFs
  [ ] Add TableFunctionRegistry with table function resolution (in review)
  [ ] Prepare for path resolution when we have it (in review)
  [ ] Respect differences between connector-provided and plugin-provided PTFs through separate interfaces (in review)
  [ ] Add register - unregister mechanisms (in review)
[ ] Analyze PTF invocation
  [ ] scalar arguments (in review)
  [ ] DESCRIPTOR arguments (not yet supported)
  [ ] TABLE arguments (not yet supported)
  [ ] tests
[ ] Plan PTF invocation
  [ ] Add a dedicated PlanNode: TableFunctionNode (in review)
  [ ] Implement relevant PlanVisitors (in review)
  [ ] Explain
[ ] Execute PTF through pushdown to connector
  [ ] Add apply() methods (in review)
  [ ] Add RewriteTableFunctionToTableScan Optimizer rule (in review)
  [ ] Unit test for the rule
[ ] Add example PTF implementations: query pass-through
  [ ] remote_query function for JDBC connectors:
    [ ] Druid (in review)
    [ ] MemSql (in review)
    [ ] MySql (in review)
    [ ] Oracle (in review)
    [ ] PostgreSql (in review)
    [ ] Redshift (in review)
    [ ] SqlServer (in review)
  [ ] remote_query function for Elasticsearch connector (in review)
  [ ] tests
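The registration subtasks above (register, unregister, and function resolution) could look roughly like the following. This is a sketch under assumptions, not the actual TableFunctionRegistry: the class name `TableFunctionRegistrySketch`, the use of plain strings for catalogs, function names, and implementations, and the case-insensitive lookup are all hypothetical choices made for illustration.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: not the real TableFunctionRegistry.
final class TableFunctionRegistrySketch
{
    // catalog name -> (qualified function name -> implementation placeholder)
    private final Map<String, Map<String, String>> functionsByCatalog = new ConcurrentHashMap<>();

    // Registers all table functions of one catalog; a catalog may register only once.
    void register(String catalog, Map<String, String> functions)
    {
        Map<String, String> normalized = new ConcurrentHashMap<>();
        functions.forEach((name, implementation) ->
                normalized.put(name.toLowerCase(Locale.ROOT), implementation));
        if (functionsByCatalog.putIfAbsent(catalog, normalized) != null) {
            throw new IllegalStateException("Table functions already registered for catalog: " + catalog);
        }
    }

    // Drops all table functions of a catalog, for example when the catalog is removed.
    void unregister(String catalog)
    {
        functionsByCatalog.remove(catalog);
    }

    // Resolves a function by catalog and qualified name; empty if nothing matches.
    Optional<String> resolve(String catalog, String qualifiedName)
    {
        Map<String, String> functions = functionsByCatalog.get(catalog);
        if (functions == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(functions.get(qualifiedName.toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args)
    {
        TableFunctionRegistrySketch registry = new TableFunctionRegistrySketch();
        registry.register("example_catalog", Map.of("example_schema.example_function", "example implementation"));
        System.out.println(registry.resolve("example_catalog", "example_schema.example_function").isPresent());
        registry.unregister("example_catalog");
        System.out.println(registry.resolve("example_catalog", "example_schema.example_function").isPresent());
    }
}
```

Keying the registry by catalog keeps connector-provided functions isolated from each other, so removing one catalog cannot affect functions from another.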
Achieved functionality
Currently, any PTF that can be realised entirely by a connector is supported. The connector can "capture" the PTF invocation and replace it with a ConnectorTableHandle, which represents the PTF result.
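The "capture" step can be pictured as a plan rewrite: if the connector accepts the invocation, the function node is replaced by a scan over the handle the connector returned. The sketch below is a simplified model and assumes a lot: the nested types only borrow names from this issue, handles are plain strings, and `connectorApply` is a hypothetical stand-in for the connector's apply() method, whose real signature is not shown here.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical, simplified model of the pushdown rewrite; not Trino's actual classes.
final class PushdownSketch
{
    private PushdownSketch() {}

    interface PlanNode {}

    // A table function invocation as it appears in the plan before the rewrite.
    record TableFunctionNode(String functionName, String invocation) implements PlanNode {}

    // A scan over a table handle that represents the function result.
    record TableScanNode(String tableHandle) implements PlanNode {}

    // Replaces the function node with a table scan if the connector returns a handle;
    // otherwise the plan is left unchanged.
    static PlanNode rewrite(PlanNode node, Function<TableFunctionNode, Optional<String>> connectorApply)
    {
        if (node instanceof TableFunctionNode) {
            TableFunctionNode tableFunction = (TableFunctionNode) node;
            return connectorApply.apply(tableFunction)
                    .<PlanNode>map(TableScanNode::new)
                    .orElse(node);
        }
        return node;
    }

    public static void main(String[] args)
    {
        PlanNode plan = new TableFunctionNode("remote_query", "SELECT 1");
        // A connector that accepts every invocation and wraps it in a handle.
        PlanNode rewritten = rewrite(plan, function -> Optional.of("handle for: " + function.invocation()));
        System.out.println(rewritten);
    }
}
```

If the connector declines (returns an empty result), the function node stays in the plan, which is why functions that cannot be fully pushed down need the operator-based execution listed under the following work.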
Following work
To provide full support for PTFs, as specified in the SQL standard, we need to:
[ ] Support TABLE arguments (starting from the Analyzer)
[ ] Support DESCRIPTOR arguments (starting from the Analyzer)
[ ] EXPLAIN: decide how to render TABLE arguments, which are both function arguments and PlanNode sources
[ ] Support optimizations of plans involving TableFunctionNode, such as column pruning
[ ] Implement COPARTITION as JOIN
[ ] Choose distribution of sources, regarding size, number of sources, and row/set semantics (see: DetermineJoinDistributionType)
[ ] Handle TableFunctionNode in AddExchanges / AddLocalExchanges: realise the partitioning and ordering of input tables respecting their actual properties. Also, consider the output properties of TableFunctionNode.
[ ] Figure out the interfaces between the PTF logic (the fulfil() method), and the Operator: in what form data will be provided to the PTF logic, and in what form the results will be returned to the Operator
[ ] Add the Operator:
  [ ] arbitrary number of sources: 0 to n
  [ ] final partitioning and sorting of the sources (see: WindowOperator)
  [ ] invoking the PTF logic on cartesian product of the partitions from all the sources (see: Nested Loops for implementation choices)
  [ ] appending the pass-through columns and the partitioning columns from the source tables
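The operator items above (partition each source, then invoke the PTF logic once per combination of partitions) can be sketched as follows. This is an in-memory illustration under assumptions: rows are plain strings, the class and method names are hypothetical, and a real operator would stream data rather than materialise every combination.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory sketch; not an actual Trino operator.
final class PartitionProductSketch
{
    private PartitionProductSketch() {}

    // Groups the rows of one source by partitioning key, keeping first-seen key order.
    static List<List<String>> partition(List<String> rows, Function<String, String> key)
    {
        Map<String, List<String>> partitions = new LinkedHashMap<>();
        for (String row : rows) {
            partitions.computeIfAbsent(key.apply(row), ignored -> new ArrayList<>()).add(row);
        }
        return new ArrayList<>(partitions.values());
    }

    // Builds the cartesian product of partitions: every combination holds exactly one
    // partition from each source. With zero sources the result is a single empty
    // combination, so the function logic would still be invoked once.
    static List<List<List<String>>> cartesianProduct(List<List<List<String>>> partitionedSources)
    {
        List<List<List<String>>> combinations = new ArrayList<>();
        combinations.add(new ArrayList<>());
        for (List<List<String>> source : partitionedSources) {
            List<List<List<String>>> extended = new ArrayList<>();
            for (List<List<String>> prefix : combinations) {
                for (List<String> sourcePartition : source) {
                    List<List<String>> next = new ArrayList<>(prefix);
                    next.add(sourcePartition);
                    extended.add(next);
                }
            }
            combinations = extended;
        }
        return combinations;
    }

    public static void main(String[] args)
    {
        // Partition both sources by the first character of each row.
        List<List<String>> left = partition(List.of("a1", "a2", "b1"), row -> row.substring(0, 1));
        List<List<String>> right = partition(List.of("x1", "y1"), row -> row.substring(0, 1));
        for (List<List<String>> combination : cartesianProduct(List.of(left, right))) {
            System.out.println("invoke PTF logic on " + combination);
        }
    }
}
```

The nested iteration mirrors a nested loops join over partitions, which is why the number of invocations grows with the product of the partition counts of all sources.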
This issue serves the following purposes:
related issue: https://github.com/trinodb/trino/issues/1839
related PR: https://github.com/trinodb/trino/pull/11336