Gwinel / likwid

Automatically exported from code.google.com/p/likwid
GNU General Public License v3.0

How to use marker API in perfctr to measure interleaved parallel and sequential code sections #100

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
My application has interleaved parallel and sequential sections of code, and I 
would like to measure the performance counters for both the parallel and 
sequential sections together. What I am trying to do would look something like 
this:

//serial code
likwid_markerInit();
omp_set_num_threads(n);
//more serial code
likwid_markerStartRegion("Measure");
//serial code to measure
#pragma omp parallel
    //parallel code to measure
//more serial code to measure
#pragma omp parallel
    //parallel code to measure
//even more serial code to measure
likwid_markerStopRegion("Measure");
likwid_markerClose();

Is this possible and how can it be done? 

The only way I can think of to get this result is to mark the code with serial 
regions and run likwid-perfctr, then mark the code with parallel regions and 
run likwid-perfctr again, and finally combine the performance counter results 
for the main thread (thread 0). 

What version of the product are you using?
likwid-perfctr v3.0

Original issue reported on code.google.com by nicolerodia@gmail.com on 19 Mar 2013 at 1:15

GoogleCodeExporter commented 9 years ago
This scenario is not possible at the moment unless you use a workaround like 
the one you have described. Still, I agree this is an artificial restriction, 
and an annoying one. I will think of a solution with more relaxed behavior for 
the next release. Until then you have to do it as you described, with multiple 
runs.

Original comment by jan.trei...@gmail.com on 20 Mar 2013 at 9:18

GoogleCodeExporter commented 9 years ago
Adapt the Marker API to accept a variable number of threads entering the 
regions, and remove the thread check in perfctr.

Original comment by jan.trei...@gmail.com on 8 May 2013 at 8:29

GoogleCodeExporter commented 9 years ago
Should be fixed with revision 00df95db8bec.

Now sequential and parallel regions can be mixed and not all threads need to 
enter a region.
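
For readers who want to try the mixed layout from the original question, here 
is a minimal sketch that uses only the four marker calls quoted in this thread. 
Several things in it are assumptions and not part of the original report: the 
function name measured_workload, the loop bodies, the problem size N, and the 
LIKWID_PERFMON switch, which in this sketch only selects between the real 
likwid.h header and no-op stand-ins so the code also compiles without LIKWID 
installed. Because the region is entered by the main thread only, the counts 
would presumably be the ones attributed to that thread.

```c
/* Sketch only: no-op stubs replace LIKWID unless built with -DLIKWID_PERFMON. */
#ifdef _OPENMP
#include <omp.h>
#endif

#ifdef LIKWID_PERFMON
#include <likwid.h>
#else
/* Stand-ins (assumption of this sketch) with the names used in the thread. */
static void likwid_markerInit(void) {}
static void likwid_markerStartRegion(const char *tag) { (void)tag; }
static void likwid_markerStopRegion(const char *tag) { (void)tag; }
static void likwid_markerClose(void) {}
#endif

#define N 1000  /* hypothetical problem size */

/* Hypothetical workload: sums 0..N-1 twice serially and twice in parallel,
 * all inside one "Measure" region started by the main thread. */
long measured_workload(int num_threads)
{
    long serial_sum = 0, parallel_sum = 0;
    int i;

    likwid_markerInit();
#ifdef _OPENMP
    omp_set_num_threads(num_threads);
#else
    (void)num_threads;  /* pragmas below are ignored without OpenMP */
#endif

    likwid_markerStartRegion("Measure");

    /* serial code to measure */
    for (i = 0; i < N; i++) serial_sum += i;

    /* parallel code to measure */
    #pragma omp parallel for reduction(+:parallel_sum)
    for (i = 0; i < N; i++) parallel_sum += i;

    /* more serial code to measure */
    for (i = 0; i < N; i++) serial_sum += i;

    /* more parallel code to measure */
    #pragma omp parallel for reduction(+:parallel_sum)
    for (i = 0; i < N; i++) parallel_sum += i;

    likwid_markerStopRegion("Measure");
    likwid_markerClose();

    return serial_sum + parallel_sum;
}
```

To collect counters, call measured_workload from main, build against a LIKWID 
version that contains the fix, and run the binary under likwid-perfctr in 
marker mode (the -m switch). The return value is there only so the result can 
be checked and the loops are not optimized away.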

Original comment by jan.trei...@gmail.com on 12 Sep 2013 at 1:42