lincc-frameworks / tape

[Deprecated] Package for working with LSST time series data
https://tape.readthedocs.io
MIT License

Have head() traverse all partitions #419

Closed wilsonbb closed 5 months ago

wilsonbb commented 5 months ago

As pointed out in https://github.com/lincc-frameworks/tape/issues/380, it is not uncommon for partitions to be empty. This can lead to unintuitive results for users, since head() by default only looks at the first partition.

To make head(n) more intuitive for users, we can have it search across all partitions, stopping only once there are n rows to return. This differs from Dask's behavior, where a user must explicitly choose how many partitions to search across.

Warning: this removes the npartitions argument previously available in EnsembleFrame.head

Solution Description

Here we iterate over the partitions in order, collecting rows from each one until we have found enough. This is similar to the implementation in LSDB.
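
For readers who want the idea in code, here is a minimal sketch of the traversal described above. It is not the code from this PR: the function name `head_across_partitions` is hypothetical, and the partitions are modeled as a plain list of pandas DataFrames rather than a real Dask or TAPE collection.

```python
from typing import Sequence

import pandas as pd


def head_across_partitions(partitions: Sequence[pd.DataFrame], n: int = 5) -> pd.DataFrame:
    """Return up to n rows, scanning partitions in order and skipping empty ones.

    Hypothetical helper for illustration only; `partitions` stands in for the
    already-computed partitions of a partitioned dataframe.
    """
    collected = []
    remaining = n
    for part in partitions:
        if remaining <= 0:
            break  # enough rows found, stop touching further partitions
        chunk = part.head(remaining)
        if len(chunk) > 0:
            collected.append(chunk)
            remaining -= len(chunk)
    if collected:
        return pd.concat(collected)
    # No rows anywhere: return an empty frame that keeps the first partition's columns.
    return partitions[0].head(0) if len(partitions) > 0 else pd.DataFrame()
```

The early `break` is the point of the design: partitions past the one that completes the n rows are never inspected, so the cost stays close to a single-partition head() when the leading partitions are populated.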



github-actions[bot] commented 5 months ago

| Before [d0af3ce0] | After [a693f5d0] | Ratio | Benchmark (Parameter) |
|---|---|---|---|
| 33.5±0.4ms | 33.6±0.5ms | 1 | benchmarks.time_batch |
| 34.7±0.4ms | 33.9±0.2ms | 0.98 | benchmarks.time_prune_sync_workflow |


codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 95.82%. Comparing base (d0af3ce) to head (cd026ca).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #419      +/-   ##
==========================================
+ Coverage   95.78%   95.82%   +0.03%
==========================================
  Files          25       25
  Lines        1755     1771      +16
==========================================
+ Hits         1681     1697      +16
  Misses         74       74
```
