Eventual-Inc / Daft

Distributed data engine for Python/SQL designed for the cloud, powered by Rust
https://getdaft.io
Apache License 2.0
2.38k stars, 170 forks

[BUG] Cleanup context side-effects #3270

Closed jaychia closed 2 weeks ago

jaychia commented 2 weeks ago

Cleans up public-facing APIs on our context object that have side effects.

We should be very explicit in our context code when making state changes. Eventually, I think I want to deprecate user-facing usage of daft.context.* and instead expose a daft.connect(...) entry point, which would help prevent user footguns.

Tests added in: https://github.com/Eventual-Inc/Daft/pull/3275

Changes

  1. In tests, use get_tests_daft_runner_name to check the runner instead of relying on the context.
  2. Refactors our context to be VERY explicit about when the runner is created (get_context().get_or_create_runner()).
  3. Refactors every call site that was asking about state in the context to first call get_or_create_runner(), then access information about the state.
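
The pattern in items 2 and 3 can be sketched roughly as follows. This is a simplified, hypothetical illustration and not Daft's actual daft/context.py: the `Runner` and `Context` classes, the `has_runner` property, the lock, and `is_py_runner` are stand-ins invented for the example. Only `get_context()` and `get_or_create_runner` are names taken from this PR.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Runner:
    """Hypothetical stand-in for a real runner; only its name matters here."""

    name: str


class Context:
    """Holds the runner configuration and, once materialized, the runner."""

    def __init__(self, runner_name: str = "py") -> None:
        self._runner_name = runner_name
        self._runner: Optional[Runner] = None
        self._lock = threading.Lock()

    @property
    def has_runner(self) -> bool:
        # Pure read: never creates a runner as a side effect.
        return self._runner is not None

    def get_or_create_runner(self) -> Runner:
        # The single, explicitly named place where the runner is created.
        with self._lock:
            if self._runner is None:
                self._runner = Runner(self._runner_name)
            return self._runner


_CONTEXT = Context()


def get_context() -> Context:
    """Return the process-wide context without changing its state."""
    return _CONTEXT


def is_py_runner() -> bool:
    # Call sites materialize the runner first, then ask about its state.
    return get_context().get_or_create_runner().name == "py"
```

The design choice illustrated here is that reads such as `has_runner` stay free of side effects, while the one state-changing call says so in its name.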

Note that this might be a behavior change in some cases, but I think it is for the better: in order to access information about the runner (mostly what type of runner it is), we now force clients to materialize the runner.
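
Because asking the context for the runner type now materializes a runner, tests can read the intended runner from configuration instead (item 1 above). The sketch below is an assumption about how such a helper could look, not the code in this PR: the function name comes from the PR, but the `DAFT_RUNNER` environment variable lookup, the set of accepted names, and the `env` parameter are illustrative guesses.

```python
import os
from typing import Mapping, Optional

# Assumed set of runner names; adjust to whatever the test suite supports.
VALID_RUNNERS = ("py", "ray", "native")


def get_tests_daft_runner_name(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the runner name the tests were configured with.

    Reads configuration only, so it never touches (or mutates) the context.
    The `env` parameter is a hypothetical hook that makes the helper testable.
    """
    env = os.environ if env is None else env
    name = env.get("DAFT_RUNNER")  # assumed variable name
    if name not in VALID_RUNNERS:
        raise ValueError(
            f"DAFT_RUNNER must be one of {VALID_RUNNERS}, got {name!r}"
        )
    return name
```

A test could then branch on `get_tests_daft_runner_name() == "ray"` without creating a runner as a side effect.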

I think in the future we should (after more testing) also refactor to:

codspeed-hq[bot] commented 2 weeks ago

CodSpeed Performance Report

Merging #3270 will improve performances by ×2.2

Comparing jay/context-side-effects (39f238c) with main (bd4e944)

Summary

⚡ 2 improvements ✅ 15 untouched benchmarks

Benchmarks breakdown

| Benchmark | main | jay/context-side-effects | Change |
|---|---|---|---|
| test_iter_rows_first_row[100 Small Files] | 386.7 ms | 236.9 ms | +63.25% |
| test_show[100 Small Files] | 50.7 ms | 22.9 ms | ×2.2 |

codecov[bot] commented 2 weeks ago

Codecov Report

Attention: Patch coverage is 89.65517% with 3 lines in your changes missing coverage. Please review.

Project coverage is 77.56%. Comparing base (c77cfdb) to head (39f238c). Report is 7 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| daft/context.py | 72.72% | 3 Missing :warning: |

Additional details and impacted files

[![Impacted file tree graph](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270/graphs/tree.svg?width=650&height=150&src=pr&token=J430QVFE89&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc)](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc)

```diff
@@            Coverage Diff             @@
##             main    #3270      +/-   ##
==========================================
- Coverage   77.58%   77.56%   -0.02%
==========================================
  Files         659      665       +6
  Lines       80562    80897     +335
==========================================
+ Hits        62505    62750     +245
- Misses      18057    18147      +90
```

| [Files with missing lines](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc) | Coverage Δ | |
|---|---|---|
| [daft/analytics.py](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree&filepath=daft%2Fanalytics.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-ZGFmdC9hbmFseXRpY3MucHk=) | `75.92% <ø> (-0.23%)` | :arrow_down: |
| [daft/dataframe/dataframe.py](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree&filepath=daft%2Fdataframe%2Fdataframe.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-ZGFmdC9kYXRhZnJhbWUvZGF0YWZyYW1lLnB5) | `86.65% <100.00%> (+0.05%)` | :arrow_up: |
| [daft/datatype.py](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree&filepath=daft%2Fdatatype.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-ZGFmdC9kYXRhdHlwZS5weQ==) | `91.89% <100.00%> (ø)` | |
| [daft/expressions/expressions.py](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree&filepath=daft%2Fexpressions%2Fexpressions.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-ZGFmdC9leHByZXNzaW9ucy9leHByZXNzaW9ucy5weQ==) | `93.60% <100.00%> (ø)` | |
| [daft/io/\_deltalake.py](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree&filepath=daft%2Fio%2F_deltalake.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-ZGFmdC9pby9fZGVsdGFsYWtlLnB5) | `76.92% <100.00%> (ø)` | |
| [daft/io/\_hudi.py](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree&filepath=daft%2Fio%2F_hudi.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-ZGFmdC9pby9faHVkaS5weQ==) | `100.00% <100.00%> (ø)` | |
| [daft/io/\_iceberg.py](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree&filepath=daft%2Fio%2F_iceberg.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-ZGFmdC9pby9faWNlYmVyZy5weQ==) | `85.71% <100.00%> (ø)` | |
| [daft/io/\_parquet.py](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree&filepath=daft%2Fio%2F_parquet.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-ZGFmdC9pby9fcGFycXVldC5weQ==) | `85.71% <100.00%> (-0.50%)` | :arrow_down: |
| [daft/io/file\_path.py](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree&filepath=daft%2Fio%2Ffile_path.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-ZGFmdC9pby9maWxlX3BhdGgucHk=) | `100.00% <100.00%> (ø)` | |
| [daft/runners/native\_runner.py](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree&filepath=daft%2Frunners%2Fnative_runner.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc#diff-ZGFmdC9ydW5uZXJzL25hdGl2ZV9ydW5uZXIucHk=) | `95.83% <100.00%> (+0.08%)` | :arrow_up: |
| ... and [5 more](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc) | | |

... and [20 files with indirect coverage changes](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3270/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc)