Eventual-Inc / Daft

Distributed data engine for Python/SQL designed for the cloud, powered by Rust
https://getdaft.io
Apache License 2.0
2.39k stars 170 forks source link

[CHORE] Remove daft-scan dependency from planning crates #3250

Closed kevinzwang closed 2 weeks ago

kevinzwang commented 2 weeks ago

This PR creates a better abstraction over scan tasks and uses that abstraction in the planning crates (daft-logical-plan, daft-physical-plan, daft-local-plan). This is done by creating a new common-scan-info crate which holds a ScanTaskLike trait which planning crates can use to access scan task info without needing knowledge of implementation.

Now, in our planning stages, we solely use our ScanTaskLike abstraction, and then in execution, we downcast to ScanTask to actually materialize them. That way, we can actually remove the daft-scan dependency from planning crates.

codspeed-hq[bot] commented 2 weeks ago

CodSpeed Performance Report

Merging #3250 will degrade performances by 51.94%

Comparing kevin/swap-plan-scan-dependency (0429eea) with main (16f5a8c)

Summary

❌ 1 regressions ✅ 16 untouched benchmarks

:warning: Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark main kevin/swap-plan-scan-dependency Change
test_show[100 Small Files] 23.7 ms 49.3 ms -51.94%
codecov[bot] commented 2 weeks ago

Codecov Report

Attention: Patch coverage is 52.88303% with 286 lines in your changes missing coverage. Please review.

Project coverage is 77.62%. Comparing base (f290f40) to head (0429eea). Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/daft-scan/src/builder.rs 17.54% 94 Missing :warning:
src/common/scan-info/src/test/mod.rs 39.32% 54 Missing :warning:
src/daft-scan/src/lib.rs 20.93% 34 Missing :warning:
src/common/scan-info/src/partitioning.rs 40.81% 29 Missing :warning:
src/common/scan-info/src/pushdowns.rs 69.89% 28 Missing :warning:
src/daft-scan/src/anonymous.rs 0.00% 18 Missing :warning:
src/common/scan-info/src/python.rs 82.14% 15 Missing :warning:
src/common/scan-info/src/scan_operator.rs 50.00% 6 Missing :warning:
src/common/scan-info/src/scan_task.rs 0.00% 4 Missing :warning:
src/daft-physical-plan/src/ops/scan.rs 25.00% 3 Missing :warning:
... and 1 more
Additional details and impacted files [![Impacted file tree graph](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250/graphs/tree.svg?width=650&height=150&src=pr&token=J430QVFE89&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc)](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc) ```diff @@ Coverage Diff @@ ## main #3250 +/- ## ========================================== + Coverage 76.07% 77.62% +1.55% ========================================== Files 644 658 +14 Lines 81581 80527 -1054 ========================================== + Hits 62065 62512 +447 + Misses 19516 18015 -1501 ``` | [Files with missing lines](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc) | Coverage Δ | | |---|---|---| | [daft/logical/builder.py](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree&filepath=daft%2Flogical%2Fbuilder.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-ZGFmdC9sb2dpY2FsL2J1aWxkZXIucHk=) | `89.87% <100.00%> (ø)` | | | [src/common/scan-info/src/expr\_rewriter.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree&filepath=src%2Fcommon%2Fscan-info%2Fsrc%2Fexpr_rewriter.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2NvbW1vbi9zY2FuLWluZm8vc3JjL2V4cHJfcmV3cml0ZXIucnM=) | `47.87% <ø> (ø)` | | | [src/common/scan-info/src/lib.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree&filepath=src%2Fcommon%2Fscan-info%2Fsrc%2Flib.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2NvbW1vbi9zY2FuLWluZm8vc3JjL2xpYi5ycw==) | `100.00% <100.00%> (ø)` | | | [src/daft-local-execution/src/pipeline.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree&filepath=src%2Fdaft-local-execution%2Fsrc%2Fpipeline.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9jYWwtZXhlY3V0aW9uL3NyYy9waXBlbGluZS5ycw==) | `95.30% <100.00%> (+8.34%)` | :arrow_up: | | [src/daft-local-execution/src/sources/scan\_task.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree&filepath=src%2Fdaft-local-execution%2Fsrc%2Fsources%2Fscan_task.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9jYWwtZXhlY3V0aW9uL3NyYy9zb3VyY2VzL3NjYW5fdGFzay5ycw==) | `75.55% <ø> (ø)` | | | [src/daft-local-plan/src/plan.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree&filepath=src%2Fdaft-local-plan%2Fsrc%2Fplan.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9jYWwtcGxhbi9zcmMvcGxhbi5ycw==) | `96.38% <100.00%> (+13.21%)` | :arrow_up: | | [src/daft-local-plan/src/translate.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree&filepath=src%2Fdaft-local-plan%2Fsrc%2Ftranslate.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9jYWwtcGxhbi9zcmMvdHJhbnNsYXRlLnJz) | `93.58% <100.00%> (-0.05%)` | :arrow_down: | | [src/daft-logical-plan/src/builder.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree&filepath=src%2Fdaft-logical-plan%2Fsrc%2Fbuilder.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9naWNhbC1wbGFuL3NyYy9idWlsZGVyLnJz) | `92.11% <ø> (+11.46%)` | :arrow_up: | | [src/daft-logical-plan/src/lib.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree&filepath=src%2Fdaft-logical-plan%2Fsrc%2Flib.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9naWNhbC1wbGFuL3NyYy9saWIucnM=) | `100.00% <ø> (ø)` | | | [src/daft-logical-plan/src/ops/source.rs](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree&filepath=src%2Fdaft-logical-plan%2Fsrc%2Fops%2Fsource.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-c3JjL2RhZnQtbG9naWNhbC1wbGFuL3NyYy9vcHMvc291cmNlLnJz) | `55.81% <ø> (ø)` | | | ... and [25 more](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc) | | ... and [25 files with indirect coverage changes](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3250/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc)
universalmind303 commented 2 weeks ago

@kevinzwang I took another look at this PR, and maybe it can be done in a separate PR, but we also need to invert the dependency between daft-table and daft-logical-plan.

kevinzwang commented 2 weeks ago

@kevinzwang I took another look at this PR, and maybe it can be done in a separate PR, but we also need to invert the dependency between daft-table and daft-logical-plan.

I'll take a look at it, if it's a complex task i'll merge this one in first