Closed: EricWohlstadter closed this 6 years ago
@EricWohlstadter, please feel free to merge it as you want. I locally verified this anyway.
What changes were proposed in this pull request?

Migrate the logic from the master branch that specially handles the `.count()` action.

Instead of using parallelize, when the count value from HS2 is `X`:
- `COUNT_TASKS` is the number of tasks (configurable).
- `(COUNT_TASKS - 1)` tasks generate `X/(COUNT_TASKS - 1)` rows each.
- The remaining task generates `X % (COUNT_TASKS - 1)` rows.

How was this patch tested?

UT
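
The row split described under "What changes were proposed" can be sketched as follows. This is only an illustration of the arithmetic, not the connector's code: the function name `split_count`, the integer division, and the rejection of `COUNT_TASKS < 2` (which would otherwise divide by zero) are assumptions made for the example.

```python
from typing import List


def split_count(x: int, count_tasks: int) -> List[int]:
    """Return how many rows each task generates for a count value of x.

    Hypothetical helper: the first (count_tasks - 1) tasks each generate
    x // (count_tasks - 1) rows, and the last task generates the remainder.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    if count_tasks < 2:
        # Assumption: with fewer than 2 tasks the divisor would be 0 or negative.
        raise ValueError("count_tasks must be at least 2")

    full_tasks = count_tasks - 1
    rows_per_task = x // full_tasks  # rows for each of the first tasks
    remainder = x % full_tasks       # rows for the last task

    return [rows_per_task] * full_tasks + [remainder]


if __name__ == "__main__":
    rows = split_count(10, 4)
    print(rows, sum(rows))
```

With `X = 10` and `COUNT_TASKS = 4`, three tasks generate 3 rows each and the last task generates 1 row, so the total is still 10.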