What changes were proposed in this pull request?
When initializing a `DistributedVector` zero vector via `sc.parallelize`, add `sleep(1000)` to each map task. This helps spread the tasks across the whole cluster; without it, in my tests the tasks tended to pile up on a few cluster nodes.

This is a workaround for now. If a better approach turns up, I will update it.
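As a rough illustration of the idea, here is a minimal PySpark sketch. It is not the code from this patch: `zero_block` and `zero_vector_rdd` are hypothetical names, the vector is modelled as plain lists of floats rather than a `DistributedVector`, and it assumes `sleep(1000)` means 1000 milliseconds. The delay presumably keeps each task slot busy long enough that the scheduler hands the remaining tasks to other executors.

```python
import time


def zero_block(block_size, delay_seconds=1.0):
    """Build one zero-filled block, sleeping first to keep the task slot busy."""
    # Assumption: the sleep(1000) in the description is in milliseconds (about 1 second).
    time.sleep(delay_seconds)
    return [0.0] * block_size


def zero_vector_rdd(sc, num_blocks, block_size, delay_seconds=1.0):
    """Hypothetical helper: an RDD holding one zero block per partition.

    `sc` is an existing pyspark SparkContext.
    """
    # One element per partition, so each block is built by its own map task.
    indices = sc.parallelize(range(num_blocks), num_blocks)
    return indices.map(lambda _: zero_block(block_size, delay_seconds))
```

With a live `SparkContext`, `zero_vector_rdd(sc, 100, 10000)` would give 100 partitions of 10,000 zeros each; the delay only takes effect when an action such as `count()` or `cache()` followed by `count()` actually runs the map tasks.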
How was this patch tested?
Manual.