huttered40 / critter

Critical path analysis of MPI parallel programs
BSD 2-Clause "Simplified" License

Aggregate critter critical path costs before Reduction #9

Closed huttered40 closed 5 years ago

huttered40 commented 5 years ago

Currently, after a BSP step, we iterate over all MPI routines tracked by critter and, for each routine separately, find the max over five different metrics. This results in a factor of NUM_CRITTERS more synchronizations than necessary.

_critter::compute_max_crit(...) should instead just write its local costs into that routine's window of a shared array. At the end of the loop, we can perform a single MPI_Allreduce over the whole array, and then write each reduced entry back to the member variables of the corresponding MPI routine.
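
A rough sketch of the pack / reduce / unpack pattern, not critter's actual code. The names `Routine`, `pack`, `unpack`, `NUM_METRICS`, `allreduce_max`, and `aggregate_and_reduce` are all hypothetical. To keep the sketch runnable without an MPI installation, the ranks are simulated in-process and `allreduce_max` stands in for the one real call, which would presumably look like `MPI_Allreduce(MPI_IN_PLACE, buf.data(), buf.size(), MPI_DOUBLE, MPI_MAX, comm)`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assumption: five metrics per tracked routine, as described in the issue.
constexpr std::size_t NUM_METRICS = 5;

// Hypothetical stand-in for one tracked MPI routine.
struct Routine {
  std::array<double, NUM_METRICS> cost{};

  // Copy local costs into this routine's window of the shared buffer.
  void pack(double* window) const {
    std::copy(cost.begin(), cost.end(), window);
  }
  // Copy reduced costs back from the window into the member variables.
  void unpack(const double* window) {
    std::copy(window, window + NUM_METRICS, cost.begin());
  }
};

// Simulated allreduce with a max operation: every rank's buffer ends up
// holding the element-wise maximum. In real code this is one MPI_Allreduce.
void allreduce_max(std::vector<std::vector<double>>& bufs) {
  if (bufs.empty()) return;
  std::vector<double> result = bufs.front();
  for (const auto& b : bufs) {
    assert(b.size() == result.size());
    for (std::size_t i = 0; i < result.size(); ++i)
      result[i] = std::max(result[i], b[i]);
  }
  for (auto& b : bufs) b = result;
}

// ranks[r][i] is routine i as seen by simulated rank r.
void aggregate_and_reduce(std::vector<std::vector<Routine>>& ranks) {
  std::vector<std::vector<double>> bufs;
  for (const auto& routines : ranks) {
    std::vector<double> buf(routines.size() * NUM_METRICS);
    for (std::size_t i = 0; i < routines.size(); ++i)
      routines[i].pack(buf.data() + i * NUM_METRICS);
    bufs.push_back(std::move(buf));
  }

  allreduce_max(bufs);  // the single synchronization point

  for (std::size_t r = 0; r < ranks.size(); ++r)
    for (std::size_t i = 0; i < ranks[r].size(); ++i)
      ranks[r][i].unpack(bufs[r].data() + i * NUM_METRICS);
}
```

With this layout the number of reductions per BSP step no longer depends on how many routines are tracked; only the message size grows, to NUM_CRITTERS * 5 doubles.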