prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0
15.89k stars 5.32k forks source link

Pushdown projects into value node #23196

Closed kaikalur closed 1 week ago

kaikalur commented 1 month ago

When someone projects an expression from values node, we should be able to inline and simplify those into the value node itself. Today we don't do that:

presto:di> explain select substr(x, 1,2) from (values 'abcd', 'efgh') AS T(x);
                                                      Query Plan
----------------------------------------------------------------------------------------------------------------------
 - Output[PlanNodeId 8][_col0] => [substr:varchar(4)]
         Estimates: {source: CostBasedSourceInfo, rows: 2 (110B), cpu: 220.00, memory: 0.00, network: 0.00}
         _col0 := substr (1:16)
     - Project[PlanNodeId 4][projectLocality = LOCAL] => [substr:varchar(4)]
             Estimates: {source: CostBasedSourceInfo, rows: 2 (110B), cpu: 220.00, memory: 0.00, network: 0.00}
             substr := substr(field, BIGINT'1', BIGINT'2') (1:38)
         - LocalExchange[PlanNodeId 195][ROUND_ROBIN] () => [field:varchar(4)]
                 Estimates: {source: CostBasedSourceInfo, rows: 2 (110B), cpu: 110.00, memory: 0.00, network: 0.00}
             - Values[PlanNodeId 0] => [field:varchar(4)]
                     Estimates: {source: CostBasedSourceInfo, rows: 2 (110B), cpu: 0.00, memory: 0.00, network: 0.00}
                     (VARCHAR'abcd')
                     (VARCHAR'efgh')

(1 row)

So we should pushdown the project nodes into the values clause and simplfiy the expression so that other rules can take advantage of this. Rouhgly we want to make the above to be:

select x from (values  substr('abcd',1,2),  substr('efgh',1,2)) AS T(x)

CC: @feilong-liu

kaikalur commented 1 month ago

CC: @jackychen718