Eventual-Inc / Daft

Distributed data engine for Python/SQL designed for the cloud, powered by Rust
https://getdaft.io
Apache License 2.0
2.34k stars, 164 forks

[FEAT] Support for correlated subqueries in SQL (not yet executable) #3304

Closed: kevinzwang closed this 2 days ago

kevinzwang commented 6 days ago

This PR adds support for converting SQL queries with correlated subqueries into LogicalPlans. It does not add the ability to execute such queries. If I am correct, though, this is the last large piece of support we need on the SQL side for the TPC-H queries; most of the remaining work is plan rewriting, optimization, and translation.

I believe that with the new alias_map value in SQLPlanner we can simplify much of the logic in plan_aggregate_query and plan_non_agg_query, but I will not attempt that in this PR.

Relevant for TPC-H queries 4, 17, 20, 21, and 22.
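
For context, a correlated subquery is one whose inner query references a column of the outer query, so it is conceptually re-evaluated for every outer row. The sketch below only illustrates those semantics. It uses Python's built-in sqlite3 module rather than Daft (which, per this PR, cannot execute such queries yet), and the tiny lineitem table and the filter, loosely modeled on TPC-H Q17, are assumptions made up for the example.

```python
import sqlite3

# In-memory database with a made-up, minimal lineitem table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE lineitem (l_partkey INTEGER, l_quantity REAL);
    INSERT INTO lineitem VALUES
        (1, 1.0), (1, 10.0), (1, 19.0),
        (2, 5.0), (2, 5.0);
    """
)

# The inner query refers to outer_l.l_partkey, a column of the OUTER query.
# That reference is what makes the subquery "correlated".
query = """
SELECT outer_l.l_partkey, outer_l.l_quantity
FROM lineitem AS outer_l
WHERE outer_l.l_quantity < (
    SELECT 0.2 * AVG(inner_l.l_quantity)
    FROM lineitem AS inner_l
    WHERE inner_l.l_partkey = outer_l.l_partkey
)
ORDER BY outer_l.l_partkey, outer_l.l_quantity
"""

rows = conn.execute(query).fetchall()
print(rows)
```

Only rows whose quantity is below 20% of the average quantity for the same part survive the filter. A planner has to resolve the outer column from inside the subquery's scope, which is the kind of name resolution this PR handles at the LogicalPlan level.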

Todo:

codspeed-hq[bot] commented 6 days ago

CodSpeed Performance Report

Merging #3304 will degrade performance by 20.38%

Comparing kevin/correlated-subqueries (17f815a) with main (1b84250)

Summary

❌ 1 regression ✅ 16 untouched benchmarks

:warning: Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | main | kevin/correlated-subqueries | Change |
|---|---|---|---|
| test_iter_rows_first_row[100 Small Files] | 312.1 ms | 392 ms | -20.38% |

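
The -20.38% figure looks smaller than the raw slowdown because it appears to be computed relative to the branch timing rather than the main timing. That reading is inferred from the two timings in the report, not from CodSpeed documentation, so treat the formula below as an assumption.

```python
# Timings copied from the benchmark report (milliseconds).
main_ms = 312.1
branch_ms = 392.0

# Assumed formula: difference divided by the branch timing.
change_vs_branch = (main_ms - branch_ms) / branch_ms * 100

# For comparison: the same difference expressed as a slowdown relative to main.
slowdown_vs_main = (branch_ms - main_ms) / main_ms * 100

print(f"change relative to branch: {change_vs_branch:.2f}%")
print(f"slowdown relative to main: {slowdown_vs_main:.2f}%")
```

Under that assumption the first value reproduces the reported change, while the second shows that the benchmark takes roughly a quarter longer than it does on main.
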
kevinzwang commented 6 days ago

I think I broke sql_expr; will try to fix that tomorrow.

codecov[bot] commented 3 days ago

Codecov Report

Attention: Patch coverage is 78.26087% with 35 lines in your changes missing coverage. Please review.

Project coverage is 77.42%. Comparing base (84db665) to head (79d8b03). Report is 10 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/daft-sql/src/planner.rs | 87.60% | 15 Missing :warning: |
| src/daft-dsl/src/expr/mod.rs | 16.66% | 10 Missing :warning: |
| src/daft-table/src/lib.rs | 70.00% | 6 Missing :warning: |
| src/daft-schema/src/schema.rs | 0.00% | 3 Missing :warning: |
| src/daft-logical-plan/src/partitioning.rs | 0.00% | 1 Missing :warning: |
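
As a sanity check on the report above, the per-file "Missing" counts can be summed and combined with the reported patch coverage to back out the size of the patch. The 161-line total is inferred from these numbers and is not stated anywhere in the report, so it is an assumption of this sketch.

```python
# Per-file "Missing" counts copied from the Codecov report.
missing_by_file = {
    "src/daft-sql/src/planner.rs": 15,
    "src/daft-dsl/src/expr/mod.rs": 10,
    "src/daft-table/src/lib.rs": 6,
    "src/daft-schema/src/schema.rs": 3,
    "src/daft-logical-plan/src/partitioning.rs": 1,
}
total_missing = sum(missing_by_file.values())

# Patch coverage as reported by Codecov (percent).
reported_patch_pct = 78.26087

# Inferred, not reported: number of coverable lines in the patch.
patch_lines = round(total_missing / (1 - reported_patch_pct / 100))
covered_lines = patch_lines - total_missing

# Recompute the percentage from the inferred line counts.
recomputed_pct = covered_lines / patch_lines * 100

print(total_missing, patch_lines, covered_lines, round(recomputed_pct, 5))
```

The summed count matches the 35 missing lines quoted in the report, and the recomputed percentage agrees with the reported patch coverage to five decimal places.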
Additional details and impacted files

[![Impacted file tree graph](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304/graphs/tree.svg?width=650&height=150&src=pr&token=J430QVFE89&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc)](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc)

```diff
@@            Coverage Diff             @@
##             main    #3304      +/-   ##
==========================================
- Coverage   77.55%   77.42%   -0.14%
==========================================
  Files         668      676       +8
  Lines       82268    82660     +392
==========================================
+ Hits        63807    64003     +196
- Misses      18461    18657     +196
```

| [Files with missing lines](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc) | Coverage Δ | |
|---|---|---|
| [src/daft-dsl/src/lib.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree&filepath=src%2Fdaft-dsl%2Fsrc%2Flib.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtZHNsL3NyYy9saWIucnM=) | `100.00% <ø> (ø)` | |
| [src/daft-dsl/src/optimization.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree&filepath=src%2Fdaft-dsl%2Fsrc%2Foptimization.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtZHNsL3NyYy9vcHRpbWl6YXRpb24ucnM=) | `98.11% <100.00%> (ø)` | |
| [src/daft-logical-plan/src/ops/project.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree&filepath=src%2Fdaft-logical-plan%2Fsrc%2Fops%2Fproject.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9naWNhbC1wbGFuL3NyYy9vcHMvcHJvamVjdC5ycw==) | `62.97% <100.00%> (ø)` | |
| [src/daft-sql/src/functions.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree&filepath=src%2Fdaft-sql%2Fsrc%2Ffunctions.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtc3FsL3NyYy9mdW5jdGlvbnMucnM=) | `80.39% <ø> (ø)` | |
| [src/daft-sql/src/lib.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree&filepath=src%2Fdaft-sql%2Fsrc%2Flib.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtc3FsL3NyYy9saWIucnM=) | `100.00% <100.00%> (ø)` | |
| [src/daft-sql/src/table\_provider/mod.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree&filepath=src%2Fdaft-sql%2Fsrc%2Ftable_provider%2Fmod.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtc3FsL3NyYy90YWJsZV9wcm92aWRlci9tb2QucnM=) | `53.70% <ø> (ø)` | |
| [src/daft-logical-plan/src/partitioning.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree&filepath=src%2Fdaft-logical-plan%2Fsrc%2Fpartitioning.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9naWNhbC1wbGFuL3NyYy9wYXJ0aXRpb25pbmcucnM=) | `44.28% <0.00%> (ø)` | |
| [src/daft-schema/src/schema.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree&filepath=src%2Fdaft-schema%2Fsrc%2Fschema.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtc2NoZW1hL3NyYy9zY2hlbWEucnM=) | `91.09% <0.00%> (-1.12%)` | :arrow_down: |
| [src/daft-table/src/lib.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree&filepath=src%2Fdaft-table%2Fsrc%2Flib.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtdGFibGUvc3JjL2xpYi5ycw==) | `84.26% <70.00%> (-0.38%)` | :arrow_down: |
| [src/daft-dsl/src/expr/mod.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree&filepath=src%2Fdaft-dsl%2Fsrc%2Fexpr%2Fmod.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtZHNsL3NyYy9leHByL21vZC5ycw==) | `71.88% <16.66%> (+1.49%)` | :arrow_up: |
| ... and [1 more](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc) | | |

... and [17 files with indirect coverage changes](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3304/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc)
