cocreature / thrill

Thrill - An EXPERIMENTAL Algorithmic Distributed Big Data Batch Processing Framework in C++
http://project-thrill.org

Parallelize result computation #7

Open cocreature opened 7 years ago

cocreature commented 7 years ago

Currently we only parallelize the aggregation phase; the result computation, in particular the calculation of E, is not parallelized. For small precisions this is probably not worth it, but for a precision of e.g. 16 we call std::pow and sum the results 2^16 times, so it probably makes sense to parallelize this.
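
For reference, the sequential part in question looks roughly like the sketch below. It is not the code from this repository: `Alpha` and `RawEstimate` are hypothetical names, the registers are assumed to be stored as `std::uint8_t`, and the constant is the standard HyperLogLog bias correction for m >= 128. The loop performs one std::pow call per register, so 2^16 calls at precision 16.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bias-correction constant alpha_m from the HyperLogLog paper (form valid for m >= 128).
// Hypothetical helper name, not taken from the Thrill code base.
double Alpha(std::size_t m) {
    return 0.7213 / (1.0 + 1.079 / static_cast<double>(m));
}

// Sequential raw estimate E = alpha_m * m^2 / sum_j 2^(-M[j]).
// Assumes the registers are already fully reduced across all workers.
double RawEstimate(const std::vector<std::uint8_t>& registers) {
    assert(!registers.empty());
    const double m = static_cast<double>(registers.size());
    double sum = 0.0;
    for (std::uint8_t r : registers) {
        // One std::pow call per register: this is the loop worth parallelizing.
        sum += std::pow(2.0, -static_cast<double>(r));
    }
    return Alpha(registers.size()) * m * m / sum;
}
```

With all registers at zero the sum equals m, so the estimate reduces to alpha_m * m, which makes a convenient sanity check.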

I’m not quite sure what the best way to accomplish this is. Do we need two nodes for this?

TiFu commented 7 years ago

I am a bit confused about what's going on when you call HyperLogLog().

Is my understanding correct?

In that case what we need to do to parallelize the result calculation is

  1. Scatter the entries of the reducedRegisters
  2. Calculate 2^-entry on each node
  3. AllReduce
    • returning a pair<double, unsigned int>
    • double is the result value (variable E)
    • unsigned int is V

In thrill terms:

  1. Create DIA
  2. Map: registerEntry => pair<double, unsigned int>
  3. AllReduce

cocreature commented 7 years ago

I don’t think your explanation is correct. After the call to context_.net.AllReduce, every worker holds the same reduced value, and each of them computes the final result on its own. That is why the result is printed once per worker.

The steps for parallelizing this look correct. The operation for scattering the vector is called Distribute in Thrill. It probably also makes sense to make this optional since I would expect the overhead to be too large for small precisions.
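
A rough sketch of these steps in plain C++, without any Thrill calls, might look like the following. It only simulates the workers in a loop: each one handles a contiguous slice of the registers, produces a pair<double, unsigned int>, and the pairs are merged with an associative combine function of the kind an all-reduce needs. All names (`Partial`, `LocalPartial`, `Combine`, `ReduceRegisters`, `parallel_threshold`) are hypothetical, V is assumed to be the number of zero registers as in the standard small-range correction, and std::ldexp is used in place of std::pow to compute 2^-entry.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// (sum of 2^-entry, number of zero registers V). Hypothetical alias.
using Partial = std::pair<double, unsigned int>;

// Local work on one simulated worker's slice [begin, end) of the registers.
Partial LocalPartial(const std::vector<std::uint8_t>& regs,
                     std::size_t begin, std::size_t end) {
    Partial p{0.0, 0u};
    for (std::size_t i = begin; i < end; ++i) {
        // ldexp(1.0, -k) yields 2^-k; swapped in here for std::pow.
        p.first += std::ldexp(1.0, -static_cast<int>(regs[i]));
        if (regs[i] == 0) ++p.second;
    }
    return p;
}

// Associative, commutative combine step for two partial results.
Partial Combine(const Partial& a, const Partial& b) {
    return {a.first + b.first, a.second + b.second};
}

// Simulates num_workers workers sequentially. Below parallel_threshold
// registers it falls back to a single slice, i.e. the unparallelized path.
Partial ReduceRegisters(const std::vector<std::uint8_t>& regs,
                        std::size_t num_workers,
                        std::size_t parallel_threshold) {
    assert(num_workers > 0);
    if (regs.size() < parallel_threshold) num_workers = 1;
    Partial total{0.0, 0u};
    for (std::size_t w = 0; w < num_workers; ++w) {
        const std::size_t begin = regs.size() * w / num_workers;
        const std::size_t end = regs.size() * (w + 1) / num_workers;
        total = Combine(total, LocalPartial(regs, begin, end));
    }
    return total;
}
```

The threshold parameter is only a placeholder for the "make this optional" idea; a sensible value would have to be measured.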

TiFu commented 7 years ago

Implemented in 54f0cd5256

We still need to figure out at which size we want to enable the parallelization.