Closed jaychia closed 2 weeks ago
Comparing jay/context-side-effects
(39f238c) with main
(bd4e944)
⚡ 2
improvements
✅ 15
untouched benchmarks
Benchmark | main |
jay/context-side-effects |
Change | |
---|---|---|---|---|
⚡ | test_iter_rows_first_row[100 Small Files] |
386.7 ms | 236.9 ms | +63.25% |
⚡ | test_show[100 Small Files] |
50.7 ms | 22.9 ms | ×2.2 |
Attention: Patch coverage is 89.65517%
with 3 lines
in your changes missing coverage. Please review.
Project coverage is 77.56%. Comparing base (
c77cfdb
) to head (39f238c
). Report is 7 commits behind head on main.
Files with missing lines | Patch % | Lines |
---|---|---|
daft/context.py | 72.72% | 3 Missing :warning: |
Cleans up public-facing APIs on our context object that has side-effects.
We should be very explicit in our context code when making state changes. Eventually I think I want to deprecate the usage of
daft.context.*
from users, and instead expose adaft.connect(...)
which will prevent user footguns.Tests added in: https://github.com/Eventual-Inc/Daft/pull/3275
Changes
get_tests_daft_runner_name
for checking the runner instead of relying on the contextget_context().get_or_create_runner
)get_or_create_runner
, then access information about the state.Note that this might be a behavior change in some cases, but I think that this is for the better. I.e. in order to access information about the runner (mostly about what type of runner it is), we force clients to materialize the runner.
I think in the future we should (after more testing) also refactor to:
daft.context.set_runner_*
is called, we can just straight up set the runner. Not sure why we need to lazily initialize the runner here.DataFrame.__init__
,Expression.__init__
etc. This is in effect saying -- when you use any Daft APIs, we are going to capture the state of runner configurations and setting the runner.