Open devozerov opened 7 months ago
cc @martint
This has been on the agenda to discuss on the Trino Contributor Call a few times now. Ideally @devozerov can join next time and we can discuss more.
@mosabua Sure, I can prepare a couple slides if that helps
That would be good. Next contributor call is on the 26th of Sept.
https://github.com/trinodb/trino/wiki/Contributor-meetings
Ping me for a calendar invite if you can attend and will present.
Motivation
Modern analytical engines use relational operator properties to find optimal plan. Property is a value associated with the operator that doesn't change operator's equivalence and that can be enforced via a special enforcer operator. In distributed systems, there are two common properties:
Exchange
operator.Sort
operator.State-of-the-art optimizers are able to find optimal placement of
Exchange
andSort
operators, possibly enforcing them in various places of the plan. For Exchange, the goal is to find a plan with minimal data movement while still accounting for data skew. For Sort, the goal is to enforce (or propagate) collations to allow for streaming aggregates and merge joins. Both Exchange and Sort placement should be determined via a cost-based optimization.Examples of products that uses state-of-the-art techniques for :
Historically, both Presto and Trino relied mostly on heurstic optimizations. Cost-based optimizations are scattered across the code base, yielding multiple local minima while failing to provide globally optimal plan. Examples are overlapping Exchange placement and data skewness checks in
AddExchanges
, parallelism estimation inDetermineTableScanNodePartitioning
, and limited partitioning propagation mechanics via opaqueConnectorPartitioningHandle
. There is no Sort placement optimizer. As a result neither Presto, nor Trino can sufficiently exploit properties during optimization. However, Presto community started discussion around the nextgen query optimizer, which means that the product most likely will gather mentioned features sooner or later.This puts Trino into vulnerable position, because sophisticated property enforcement in many cases allows other products to find better plans and demonstrate superior performance.
Proposal
This issues proposes to start the discussion about advanced property propagation in Trino. This includes (but not limited to):
ConnectorMetadata.getTableProperties|getCommonPartitioningHandle|makeCompatiblePartitioning
as they all assume static predetermined partitioningAddExchange
alternative based on MEMO and top-down Cascades-like algorithm.Sort
propagation rule (conceptually similar toAddExchange
) that will enable Trino using merge join and streaming aggregations.