apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
5.29k stars 973 forks source link

Stop copying LogicalPlan and Exprs in `PushDownLimit` #10508

Closed alamb closed 2 weeks ago

alamb commented 2 weeks ago

~Draft as it builds on https://github.com/apache/datafusion/pull/10501~

Which issue does this PR close?

Closes https://github.com/apache/datafusion/issues/10292

Rationale for this change

Make optimizer faster by not copying as much

What changes are included in this PR?

  1. Rewrite PushDownLimit to avoid deep cloning LogicalPlans / Exprs

Are these changes tested?

Are there any user-facing changes?

No functional changes and very minor (if any) performance improvements

Performance runs Details

``` ++ critcmp main pushdown_limit group main pushdown_limit ----- ---- -------------- logical_aggregate_with_join 1.00 1000.4±16.43µs ? ?/sec 1.00 996.1±9.36µs ? ?/sec logical_plan_tpcds_all 1.00 152.0±1.82ms ? ?/sec 1.00 151.8±1.44ms ? ?/sec logical_plan_tpch_all 1.00 16.7±0.20ms ? ?/sec 1.00 16.7±0.21ms ? ?/sec logical_select_all_from_1000 1.00 18.8±0.08ms ? ?/sec 1.00 18.9±0.11ms ? ?/sec logical_select_one_from_700 1.01 815.2±13.25µs ? ?/sec 1.00 809.3±6.15µs ? ?/sec logical_trivial_join_high_numbered_columns 1.00 766.1±17.98µs ? ?/sec 1.00 764.6±12.09µs ? ?/sec logical_trivial_join_low_numbered_columns 1.00 753.1±19.23µs ? ?/sec 1.00 750.1±9.83µs ? ?/sec physical_plan_tpcds_all 1.00 1277.7±9.55ms ? ?/sec 1.00 1273.9±7.81ms ? ?/sec physical_plan_tpch_all 1.02 87.9±1.97ms ? ?/sec 1.00 85.9±1.55ms ? ?/sec physical_plan_tpch_q1 1.00 4.6±0.06ms ? ?/sec 1.01 4.6±0.06ms ? ?/sec physical_plan_tpch_q10 1.03 4.3±0.07ms ? ?/sec 1.00 4.1±0.08ms ? ?/sec physical_plan_tpch_q11 1.00 3.7±0.07ms ? ?/sec 1.01 3.7±0.06ms ? ?/sec physical_plan_tpch_q12 1.00 2.7±0.05ms ? ?/sec 1.00 2.7±0.05ms ? ?/sec physical_plan_tpch_q13 1.00 2.0±0.03ms ? ?/sec 1.00 2.0±0.03ms ? ?/sec physical_plan_tpch_q14 1.00 2.4±0.04ms ? ?/sec 1.00 2.4±0.06ms ? ?/sec physical_plan_tpch_q16 1.01 3.5±0.08ms ? ?/sec 1.00 3.5±0.05ms ? ?/sec physical_plan_tpch_q17 1.00 3.4±0.05ms ? ?/sec 1.00 3.3±0.06ms ? ?/sec physical_plan_tpch_q18 1.02 3.9±0.08ms ? ?/sec 1.00 3.8±0.07ms ? ?/sec physical_plan_tpch_q19 1.00 5.6±0.09ms ? ?/sec 1.00 5.6±0.08ms ? ?/sec physical_plan_tpch_q2 1.00 7.4±0.10ms ? ?/sec 1.02 7.5±0.06ms ? ?/sec physical_plan_tpch_q20 1.01 4.4±0.06ms ? ?/sec 1.00 4.3±0.08ms ? ?/sec physical_plan_tpch_q21 1.02 6.1±0.11ms ? ?/sec 1.00 5.9±0.11ms ? ?/sec physical_plan_tpch_q22 1.01 3.2±0.07ms ? ?/sec 1.00 3.2±0.06ms ? ?/sec physical_plan_tpch_q3 1.00 3.0±0.06ms ? ?/sec 1.00 3.0±0.06ms ? ?/sec physical_plan_tpch_q4 1.00 2.2±0.04ms ? ?/sec 1.00 2.2±0.05ms ? ?/sec physical_plan_tpch_q5 1.00 4.3±0.08ms ? ?/sec 1.01 4.4±0.07ms ? ?/sec physical_plan_tpch_q6 1.00 1452.2±18.35µs ? ?/sec 1.01 1466.0±21.56µs ? ?/sec physical_plan_tpch_q7 1.00 5.5±0.10ms ? ?/sec 1.00 5.5±0.08ms ? ?/sec physical_plan_tpch_q8 1.00 7.1±0.09ms ? ?/sec 1.00 7.0±0.11ms ? ?/sec physical_plan_tpch_q9 1.00 5.3±0.09ms ? ?/sec 1.01 5.4±0.09ms ? ?/sec physical_select_all_from_1000 1.00 61.5±0.32ms ? ?/sec 1.00 61.7±0.36ms ? ?/sec physical_select_one_from_700 1.00 3.6±0.04ms ? ?/sec 1.00 3.6±0.03ms ? ?/sec ```

alamb commented 2 weeks ago

Thank you so much for the reviews @comphead -- really appreciated