RRZE-HPC / likwid

Performance monitoring and benchmarking suite
https://hpc.fau.de/research/tools/likwid/
GNU General Public License v3.0

DaCe + Likwid #475

Open lukastruemper opened 2 years ago

lukastruemper commented 2 years ago

Hello,

I just wanted to let you know that we integrated Likwid into the codegen of our parallel programming framework DaCe. This means that users can now instrument their DaCe programs by just setting a flag on the SDFG (our intermediate representation). We're working on a new docu for DaCe, but you might want to check out our sample.

The whole integration is still experimental and was just merged into main. We would be very happy to receive comments/feedback. Thanks for your awesome tool!

Cheers, Lukas

TomTheBear commented 2 years ago

Hi Lukas,

thanks for the great news!

Thanks for your efforts to integrate LIKWID in such an important framework.

Best regards, Thomas

lukastruemper commented 2 years ago

Thanks for the feedback!

Why InstrumentationType.LIKWID_Counters and not just InstrumentationType.LIKWID?

We already had a flag PAPI_Counters, so we kept it consistent. But we're considering changing it to make it cleaner.

Can there be a handier way to get the list of supported groups for the platform? Looking up the list in the LIKWID repo might not be enough for a given system, which might provide more or fewer groups. Also, custom groups ($HOME/.likwid/groups/) are not listed there.

Good point, I'll add a method to retrieve the list of the available groups from python.

Are multiple groups possible in LIKWID_EVENTS env variable? (just for curiosity, no need to support that!)

We're currently just passing the variable on to Likwid (as in the internal-markerAPI example). According to my tests, it doesn't support this right now

Since you work with groups (the default case), I would show how to get the list of metrics out of the report. First a list of available metrics and secondly how to access one metric for a single core and all cores.

This is indeed the most critical feature that we want to support. I guess, since we only support a single group right now, it is not too complex (the currently active group is the only one measured). Would be cool to see how we can get this through perfmon calls.

Does the report look like the default LIKWID MarkerAPI tables or does it use its own format? If it uses its own format, I would show some excerpt in the comments.

Good point, I will add a figure to the docu that we're currently creating and some minimal excerpt to the sample.

Does OMP_NUM_THREADS need to be set "from the outside"? Is there no way to specify parallelism inside the code? Can I set LIKWID_EVENTS in the code or is it read at startup?

LIKWID_EVENTS is read at code generation (sdfg.compile in the python code), so you can actually set it in python before calling .compile. We're currently converting the maps in our intermediate representation to loops with OMP pragmas, but we can only set the number of threads globally with OMP_NUM_THREADS. We're working on supporting more dynamic schemes in the future.

Whenever you run a DaCe program that calls .compile, you can inspect the generated code in the .dacecache/src/*.cpp file of your current working directory.

TomTheBear commented 2 years ago

Are multiple groups possible in LIKWID_EVENTS env variable? (just for curiosity, no need to support that!)

We're currently just passing the variable on to Likwid (as in the internal-markerAPI example). According to my tests, it doesn't support this right now

The internal-markerAPI example contains '|' between groups/eventsets. So it's already possible to specify them but DaCe has to switch between them.
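To illustrate the '|' separator, here is a minimal sketch of how a caller could split a LIKWID_EVENTS-style string into its individual groups/eventsets before switching between them. The helpers `count_eventsets` and `get_eventset` are hypothetical names made up for this illustration; they are not part of LIKWID or DaCe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a LIKWID function): counts the '|'-separated
 * groups/eventsets in a LIKWID_EVENTS-style string. */
int count_eventsets(const char *events)
{
    assert(events != NULL);
    if (events[0] == '\0')
        return 0;
    int count = 1;
    for (const char *p = strchr(events, '|'); p != NULL; p = strchr(p + 1, '|'))
        count++;
    return count;
}

/* Hypothetical helper (not a LIKWID function): copies the idx-th eventset
 * into out. Returns 0 on success, -1 if idx is out of range or out is
 * too small to hold the eventset plus the terminating NUL. */
int get_eventset(const char *events, int idx, char *out, size_t out_size)
{
    assert(events != NULL && out != NULL);
    if (idx < 0)
        return -1;
    const char *start = events;
    for (int i = 0; i < idx; i++) {
        start = strchr(start, '|');
        if (start == NULL)
            return -1;
        start++; /* skip the separator itself */
    }
    size_t len = strcspn(start, "|");
    if (len + 1 > out_size)
        return -1;
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

With a string such as "L2|L3|FLOPS_DP" (group names chosen only as an example), `count_eventsets` reports three entries and `get_eventset(..., 1, ...)` yields "L3". Which groups actually exist depends on the platform.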

Since you work with groups (the default case), I would show how to get the list of metrics out of the report. First a list of available metrics and secondly how to access one metric for a single core and all cores.

This is indeed the most critical feature that we want to support. I guess, since we only support a single group right now, it is not too complex (the currently active group is the only one measured). Would be cool to see how we can get this through perfmon calls.

After you read in the MarkerAPI file, you can use the common functions:

#include <stdlib.h>  /* getenv */
#include <likwid.h>

/* NUM_THREADS is the number of measured threads, defined by the application. */
int err = perfmon_readMarkerFile(getenv("LIKWID_FILEPATH"));
for (int t = 0; t < NUM_THREADS; t++) {
  for (int i = 0; i < perfmon_getNumberOfRegions(); i++) {
    int gid = perfmon_getGroupOfRegion(i);  /* group measured in region i */
    for (int k = 0; k < perfmon_getNumberOfMetrics(gid); k++) {
      char* metric_name = perfmon_getMetricName(gid, k);
      double metric_value = perfmon_getMetricOfRegionThread(i, k, t);
    }
  }
}

see https://github.com/RRZE-HPC/likwid/blob/master/examples/C-internalMarkerAPI.c#L436
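For the "one metric for a single core and all cores" case, a minimal sketch of the aggregation step could look like the following. It is written against a function pointer so it compiles without LIKWID; `metric_getter`, `sum_metric_over_threads` and `fake_metric` are hypothetical names for this illustration, and the assumption is that a function with the shape of perfmon_getMetricOfRegionThread(region, metric, thread) is passed in as the getter.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape of the per-thread accessor: (region, metric, thread) -> value. */
typedef double (*metric_getter)(int region, int metric, int thread);

/* Hypothetical helper: sums one metric of one region over all threads.
 * The value for a single core is simply get(region, metric, thread). */
double sum_metric_over_threads(metric_getter get, int region, int metric,
                               int num_threads)
{
    assert(get != NULL && num_threads >= 0);
    double sum = 0.0;
    for (int t = 0; t < num_threads; t++)
        sum += get(region, metric, t);
    return sum;
}

/* Stand-in getter with made-up values, so the sketch can be tried
 * without a MarkerAPI result file: thread t reports t + 1. */
double fake_metric(int region, int metric, int thread)
{
    (void)region;
    (void)metric;
    return (double)(thread + 1);
}
```

Summing only makes sense for additive metrics such as a FLOP rate or bandwidth; for metrics like runtime or clock frequency, a maximum or mean is probably the more meaningful aggregate.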