Closed colin-ho closed 1 week ago
Comparing `colin/read-generated` (70050af) with `main` (7e89850)
⚡ 1 improvements
❌ 1 regressions
✅ 15 untouched benchmarks
:warning: Please fix the performance issues or acknowledge them on CodSpeed.
| | Benchmark | main | colin/read-generated | Change |
|---|---|---|---|---|
| ❌ | test_iter_rows_first_row[100 Small Files] | 226.7 ms | 346 ms | -34.5% |
| ⚡ | test_show[100 Small Files] | 41.9 ms | 23.6 ms | +77.77% |
Attention: Patch coverage is 0% with 37 lines in your changes missing coverage. Please review.
Project coverage is 77.58%. Comparing base (6e28b3f) to head (70050af). Report is 11 commits behind head on main.
Files with missing lines | Patch % | Lines |
---|---|---|
daft/io/_generator.py | 0.00% | 37 Missing :warning: |
I guess this would be the purview of the `generator` being passed in?
I think so. That should likely be a parameter of the generator args or of the function itself.
`read_generator` takes in a generator function that yields `Table`s, with an optional parameter of `num_partitions`, which will be the number of scan tasks that call this function. The function will be provided the partition number as the first argument, and whatever user args after that.
Useful for testing shuffles.
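
To make the calling convention above concrete, here is a rough pure-Python sketch. It does not import Daft or reproduce the code in `daft/io/_generator.py`. `FakeTable`, `make_rows`, and `simulate_scan_tasks` are hypothetical names invented for illustration, and a plain dict stands in for a `Table`. The only behavior taken from the description is that the generator function is called once per partition, with the partition number first and the user args after it.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

# Assumption: a plain dict of column name -> values stands in for a Daft Table.
FakeTable = Dict[str, List[Any]]


def make_rows(partition_idx: int, rows_per_table: int, num_tables: int) -> Iterator[FakeTable]:
    """Hypothetical user generator.

    The first argument is the partition number; the remaining ones are user args.
    """
    for table_idx in range(num_tables):
        start = table_idx * rows_per_table
        yield {
            "partition": [partition_idx] * rows_per_table,
            "value": list(range(start, start + rows_per_table)),
        }


def simulate_scan_tasks(
    generator_fn: Callable[..., Iterator[FakeTable]],
    num_partitions: int = 1,
    generator_args: Tuple[Any, ...] = (),
) -> List[List[FakeTable]]:
    """Hypothetical helper: call generator_fn once per partition, as each scan task would."""
    return [
        list(generator_fn(partition_idx, *generator_args))
        for partition_idx in range(num_partitions)
    ]


# Three partitions, each yielding two tables of two rows.
partitions = simulate_scan_tasks(make_rows, num_partitions=3, generator_args=(2, 2))
```

Because every partition's contents are determined by the partition number and the user args, a test can build a known layout of partitions before a shuffle and check the result afterwards.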