huttered40 / critter

Critical path analysis of MPI parallel programs
BSD 2-Clause "Simplified" License

Further diagnose per-process timing issues #42

Closed. huttered40 closed this issue 4 years ago.

huttered40 commented 4 years ago

There are massive differences between the critical path times and the per-process times. Worse, the per-process time is sometimes much greater than the critical path time, which doesn't make any sense.

What is going on here?

huttered40 commented 4 years ago

I removed the line that adds the idle time to the per-process runtime, but now I'm seeing the per-process communication time come out significantly higher than the critical path communication time, and that's just on a single-node interactive job. This has to be a bug somewhere.

huttered40 commented 4 years ago

I think I fixed the issue. I attribute it to propagation of timing granularity issues, as well as one dumb bug. I just forced the per-process timing measures to always be less than the critical path measures, by construction.
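For anyone reading this later, here is a minimal sketch of what bounding the per-process measures by the critical path measures could look like. It is not the actual change made in critter: the `TimingMeasures` struct, its field names, and both helper functions are hypothetical, and clamping with `std::min` is only one possible reading of "by construction".

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical container for the timing measures discussed in this issue.
// The field names are illustrative and are not critter's actual data layout.
struct TimingMeasures {
    double communication_time;
    double computation_time;
    double runtime;
};

// Clamp every per-process measure to the matching critical path measure,
// so small timer-granularity errors cannot push it above the critical path.
TimingMeasures clamp_to_critical_path(const TimingMeasures& per_process,
                                      const TimingMeasures& critical_path) {
    // Assumption: all measures are elapsed times, hence non-negative.
    assert(critical_path.communication_time >= 0.0);
    assert(critical_path.computation_time >= 0.0);
    assert(critical_path.runtime >= 0.0);
    return TimingMeasures{
        std::min(per_process.communication_time, critical_path.communication_time),
        std::min(per_process.computation_time, critical_path.computation_time),
        std::min(per_process.runtime, critical_path.runtime)};
}

// Check the invariant: no per-process measure exceeds its critical path counterpart.
bool respects_critical_path(const TimingMeasures& per_process,
                            const TimingMeasures& critical_path) {
    return per_process.communication_time <= critical_path.communication_time &&
           per_process.computation_time <= critical_path.computation_time &&
           per_process.runtime <= critical_path.runtime;
}
```

Note that a clamp like this hides a violation rather than explaining it. Running something like `respects_critical_path` on the raw measurements before clamping would still show how often, and by how much, the invariant was broken.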