[BUG] Cleanup context side-effects

jaychia commented 2 weeks ago

Cleans up the public-facing APIs on our context object that have side effects.

We should be very explicit in our context code when making state changes. Eventually I think I want to deprecate direct use of daft.context.* by users, and instead expose a daft.connect(...) which will prevent user footguns.

Tests added in: https://github.com/Eventual-Inc/Daft/pull/3275

Changes

1. In tests, use get_tests_daft_runner_name to check the runner instead of relying on the context.
2. Refactors our context to be VERY explicit about when the runner is created (get_context().get_or_create_runner).
3. Refactors every place that was asking about state in the context to first call get_or_create_runner, then access information about the state.
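
To make the pattern above concrete, here is a minimal sketch of a context whose runner is created in exactly one place. It is not Daft's actual implementation: `Runner`, `RunnerConfig`, and `DaftContext` are simplified stand-ins written for illustration, and only the name `get_context().get_or_create_runner` comes from this PR.

```python
from __future__ import annotations

import dataclasses
import threading


@dataclasses.dataclass(frozen=True)
class RunnerConfig:
    """Hypothetical stand-in: describes which runner to build."""

    name: str = "py"


class Runner:
    """Hypothetical stand-in for a concrete runner."""

    def __init__(self, config: RunnerConfig) -> None:
        self.name = config.name


class DaftContext:
    """Simplified context: the runner is only ever created on explicit request."""

    def __init__(self) -> None:
        self._runner_config = RunnerConfig()
        self._runner: Runner | None = None
        self._lock = threading.Lock()

    def get_or_create_runner(self) -> Runner:
        # The single place where the runner is materialized (a state change).
        with self._lock:
            if self._runner is None:
                self._runner = Runner(self._runner_config)
            return self._runner


_CONTEXT = DaftContext()


def get_context() -> DaftContext:
    return _CONTEXT


# Callers that need runner state first materialize the runner, then read from it.
runner = get_context().get_or_create_runner()
print(runner.name)
```

The point of the design is that a reader can grep for `get_or_create_runner` and find every call site that may trigger runner creation.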

Note that this might be a behavior change in some cases, but I think that this is for the better. I.e. in order to access information about the runner (mostly about what type of runner it is), we force clients to materialize the runner.
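
Because asking the context about the runner now materializes it, tests can avoid touching the context at all when they only need the runner's name. The sketch below shows one way a helper like get_tests_daft_runner_name could work. It assumes the test suite selects its runner through a `DAFT_RUNNER` environment variable; the default value, the set of accepted names, and the function body are assumptions for illustration, not the helper as written in the repository.

```python
import os

# Assumed set of runner names, for illustration only.
_KNOWN_RUNNERS = {"py", "ray", "native"}


def get_tests_daft_runner_name(default: str = "py") -> str:
    """Report the runner under test without creating or reading the context.

    Assumption: tests choose their runner via the DAFT_RUNNER environment variable.
    """
    name = os.environ.get("DAFT_RUNNER", default).lower()
    if name not in _KNOWN_RUNNERS:
        raise ValueError(f"Unsupported runner for tests: {name!r}")
    return name


# Example: decide whether to skip a test without materializing any runner.
os.environ["DAFT_RUNNER"] = "ray"
skip_on_ray = get_tests_daft_runner_name() == "ray"
print(skip_on_ray)
```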

I think in the future we should (after more testing) also refactor to:

- [ ] Remove the concept of RunnerConfig. When daft.context.set_runner_* is called, we can just set the runner directly. Not sure why we need to lazily initialize the runner here.
- [ ] Force initialization of the Runner at various key entrypoints, namely DataFrame.__init__, Expression.__init__ etc. This is in effect saying: when you use any Daft API, we capture the state of the runner configuration and set the runner.

codspeed-hq[bot] commented 2 weeks ago

CodSpeed Performance Report

Merging #3270 will improve performances by ×2.2

Comparing `jay/context-side-effects` (39f238c) with `main` (bd4e944)

Summary

⚡ 2 improvements ✅ 15 untouched benchmarks

Benchmarks breakdown

| | Benchmark | `main` | `jay/context-side-effects` | Change |
|---|---|---|---|---|
| ⚡ | `test_iter_rows_first_row[100 Small Files]` | 386.7 ms | 236.9 ms | +63.25% |
| ⚡ | `test_show[100 Small Files]` | 50.7 ms | 22.9 ms | ×2.2 |

codecov[bot] commented 2 weeks ago

Codecov Report

Attention: Patch coverage is 89.65517% with 3 lines in your changes missing coverage. Please review.

Project coverage is 77.56%. Comparing base (c77cfdb) to head (39f238c). Report is 7 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| daft/context.py | 72.72% | 3 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #3270      +/-   ##
==========================================
- Coverage   77.58%   77.56%   -0.02%
==========================================
  Files         659      665       +6
  Lines       80562    80897     +335
==========================================
+ Hits        62505    62750     +245
- Misses      18057    18147      +90
```

| Files with missing lines | Coverage Δ | |
|---|---|---|
| daft/analytics.py | `75.92% <ø> (-0.23%)` | :arrow_down: |
| daft/dataframe/dataframe.py | `86.65% <100.00%> (+0.05%)` | :arrow_up: |
| daft/datatype.py | `91.89% <100.00%> (ø)` | |
| daft/expressions/expressions.py | `93.60% <100.00%> (ø)` | |
| daft/io/\_deltalake.py | `76.92% <100.00%> (ø)` | |
| daft/io/\_hudi.py | `100.00% <100.00%> (ø)` | |
| daft/io/\_iceberg.py | `85.71% <100.00%> (ø)` | |
| daft/io/\_parquet.py | `85.71% <100.00%> (-0.50%)` | :arrow_down: |
| daft/io/file\_path.py | `100.00% <100.00%> (ø)` | |
| daft/runners/native\_runner.py | `95.83% <100.00%> (+0.08%)` | :arrow_up: |
| ... and 5 more | | |

... and 20 files with indirect coverage changes
