Closed · huttered40 closed this issue 5 years ago
No longer tracking MPI_Comm_split. The communicator isn't even ready until after we perform the PMPI routine anyway. So, this will now count as computational overhead; we may want to rethink this later.
Note that all the _critter::my_... members need to be updated.
In the critter output for the std::cout overload, we need to add the average metrics for each tracked routine.
Made major changes to tracking of per-process data. Needs to be checked for correctness.
Launched a cacqr2 job on Stampede2 that will help identify any obvious errors.
I have generated the plots. Need to inspect them to see if they make sense (both the critical paths, and the per-process paths).
I'm going to assume the overlap is correct for now, but will be performing much more rigorous tests in the future on real overlapping algorithms.
Although the critical paths of individual routines are being tracked correctly, the routine-independent metrics (number of bytes, communication cost, estimated costs, etc.) are not. This is because we are simply summing the costs of each collective.
We need each process to contribute its existing routine-specific counts for each of these metrics, use those to determine the current critical path, and then propagate that path to the rest of the processes in the (sub)communicator.
MPI_Comm_split is sketchy: we add to the total timers here, and those timers no longer truly exist.