Eventual-Inc / Daft

Distributed data engine for Python/SQL designed for the cloud, powered by Rust
https://getdaft.io
Apache License 2.0
2.35k stars 166 forks source link

[CHORE] Split logical and physical plans into separate crates #3239

Closed kevinzwang closed 2 weeks ago

kevinzwang commented 2 weeks ago

This is the first in a three part series to modify our dependency graph to allow for subqueries:

  1. split logical and physical plan into separate crates
  2. invert dependency between daft-scan and daft-logical-plan
  3. merge daft-logical-plan, daft-function, and daft-dsl crates

I put the physical plan into the existing daft-physical-plan crate, and moved the local physical plan into a daft-local-plan crate.

codspeed-hq[bot] commented 2 weeks ago

CodSpeed Performance Report

Merging #3239 will not alter performance

Comparing kevin/split-logical-physical-plans (6276e8f) with main (61b6842)

Summary

✅ 17 untouched benchmarks

codecov[bot] commented 2 weeks ago

Codecov Report

Attention: Patch coverage is 65.60000% with 43 lines in your changes missing coverage. Please review.

Project coverage is 78.01%. Comparing base (61b6842) to head (6276e8f). Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/daft-physical-plan/src/display.rs 0.00% 38 Missing :warning:
src/daft-logical-plan/src/treenode.rs 88.88% 2 Missing :warning:
...n/src/optimization/rules/reorder_partition_keys.rs 0.00% 2 Missing :warning:
src/daft-logical-plan/src/builder.rs 96.15% 1 Missing :warning:
Additional details and impacted files [![Impacted file tree graph](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239/graphs/tree.svg?width=650&height=150&src=pr&token=J430QVFE89&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc)](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc) ```diff @@ Coverage Diff @@ ## main #3239 +/- ## ========================================== - Coverage 79.13% 78.01% -1.13% ========================================== Files 637 640 +3 Lines 77993 79205 +1212 ========================================== + Hits 61719 61789 +70 - Misses 16274 17416 +1142 ``` | [Files with missing lines](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc) | Coverage Δ | | |---|---|---| | [.../src/intermediate\_ops/anti\_semi\_hash\_join\_probe.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree&filepath=src%2Fdaft-local-execution%2Fsrc%2Fintermediate_ops%2Fanti_semi_hash_join_probe.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9jYWwtZXhlY3V0aW9uL3NyYy9pbnRlcm1lZGlhdGVfb3BzL2FudGlfc2VtaV9oYXNoX2pvaW5fcHJvYmUucnM=) | `97.22% <ø> (ø)` | | | [src/daft-local-execution/src/pipeline.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree&filepath=src%2Fdaft-local-execution%2Fsrc%2Fpipeline.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9jYWwtZXhlY3V0aW9uL3NyYy9waXBlbGluZS5ycw==) | `95.17% <ø> (ø)` | | | [src/daft-local-execution/src/run.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree&filepath=src%2Fdaft-local-execution%2Fsrc%2Frun.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9jYWwtZXhlY3V0aW9uL3NyYy9ydW4ucnM=) | `88.54% <ø> (ø)` | | | [.../daft-local-execution/src/sinks/hash\_join\_build.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree&filepath=src%2Fdaft-local-execution%2Fsrc%2Fsinks%2Fhash_join_build.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9jYWwtZXhlY3V0aW9uL3NyYy9zaW5rcy9oYXNoX2pvaW5fYnVpbGQucnM=) | `96.93% <ø> (ø)` | | | [...local-execution/src/sinks/outer\_hash\_join\_probe.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree&filepath=src%2Fdaft-local-execution%2Fsrc%2Fsinks%2Fouter_hash_join_probe.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9jYWwtZXhlY3V0aW9uL3NyYy9zaW5rcy9vdXRlcl9oYXNoX2pvaW5fcHJvYmUucnM=) | `97.26% <ø> (ø)` | | | [src/daft-local-plan/src/plan.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree&filepath=src%2Fdaft-local-plan%2Fsrc%2Fplan.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9jYWwtcGxhbi9zcmMvcGxhbi5ycw==) | `96.36% <100.00%> (ø)` | | | [src/daft-local-plan/src/translate.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree&filepath=src%2Fdaft-local-plan%2Fsrc%2Ftranslate.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9jYWwtcGxhbi9zcmMvdHJhbnNsYXRlLnJz) | `93.58% <ø> (ø)` | | | [src/daft-logical-plan/src/display.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree&filepath=src%2Fdaft-logical-plan%2Fsrc%2Fdisplay.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9naWNhbC1wbGFuL3NyYy9kaXNwbGF5LnJz) | `98.04% <ø> (ø)` | | | [src/daft-logical-plan/src/lib.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree&filepath=src%2Fdaft-logical-plan%2Fsrc%2Flib.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9naWNhbC1wbGFuL3NyYy9saWIucnM=) | `100.00% <ø> (ø)` | | | [src/daft-logical-plan/src/logical\_plan.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree&filepath=src%2Fdaft-logical-plan%2Fsrc%2Flogical_plan.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9naWNhbC1wbGFuL3NyYy9sb2dpY2FsX3BsYW4ucnM=) | `69.16% <ø> (ø)` | | | ... and [84 more](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc) | | ... and [18 files with indirect coverage changes](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3239/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc)
jaychia commented 2 weeks ago

I put the physical plan into the existing daft-physical-plan crate, which currently contains the local physical plan as well. However if there is an argument for keeping them separate I can move the local plan into a daft-local-plan crate.

I think the physical plan is actually going to become a distributed-only physical plan soon. It might make sense to have it in daft-distributed-plan or something @colin-ho might have an opinion

kevinzwang commented 2 weeks ago

talked with Colin about it, I'll keep it split up then and have a daft-local-plan and daft-physical-plan for now, with the future plan of having a daft-distributed-plan and eliminating the daft-physical-plan altogether.

kevinzwang commented 2 weeks ago

But this PR will take the daft-physical-plan namespace and move the current code to daft-local-plan