Closed colin-ho closed 2 weeks ago
Comparing colin/preshufflemerge
(ab364ac) with main
(5115c32)
❌ 2
regressions
✅ 15
untouched benchmarks
:warning: Please fix the performance issues or acknowledge them on CodSpeed.
Benchmark | main |
colin/preshufflemerge |
Change | |
---|---|---|---|---|
❌ | test_iter_rows_first_row[100 Small Files] |
260.7 ms | 392 ms | -33.51% |
❌ | test_show[100 Small Files] |
42.4 ms | 50.7 ms | -16.45% |
Attention: Patch coverage is 74.03846%
with 54 lines
in your changes missing coverage. Please review.
Project coverage is 77.69%. Comparing base (
5115c32
) to head (ab364ac
). Report is 1 commits behind head on main.
Enables an experimental pre-shuffle merge strategy for shuffles.
Background
Our traditional shuffle strategy is a fully materializing map-reduce. Each of the
M
map partition tasks ends in a fanout, producingN
output partitions, and thenN
reduce partition tasks are ran to mergeM
outputs each.Algorithm
The pre-shuffle merge strategy works by merging
P
map partition tasks, before doing a fanout. TheN
reduce partition tasks will now only have to mergeM/P
outputs. The benefit here is that we can reduce the total number of intermediate objects by a factor ofP
, i.e. fromM * N
toM / P * N
.P
can be decided dynamically or statically. I chose to go with the dynamic route, together with a memory threshold of1gb
to cap the maximum merging size. Essentially, the pre-shuffle merge algorithm in this PR attempts to greedily merge map outputs as they are ready, as long as the combined size of the merged partitions is less than1gb
.Tests
1000 x 1000 shuffle of 100mb partitions,
2000 x 2000 shuffle of 100mb partitions
Todos