Closed ntippman closed 10 months ago
Thanks for reporting.
I'm surprised by your workaround because `i == 0`, so you switch to the same group (`gid[i]`) after setting up & starting it. The switch group function has an early return if `new_group == groupSet->activeGroup`, so `perfmon_switchActiveGroup(gids[0]);` should do nothing.
Have you tried using the logic from `examples/monitoring.c`?
    perfmon_setupCounters(gid);
    perfmon_startCounters();
    sleep(sleeptime);
    perfmon_stopCounters();
    // save results, here just printing
    for (c = 0; c < numCpus; c++)
    {
        for (i = 0; i < perfmon_getNumberOfMetrics(gid); i++)
        {
            printf("%s,cpu=%d %f\n", perfmon_getMetricName(gid, i), cpus[c], perfmon_getLastMetric(gid, i, c));
        }
    }
Since you are using the `direct` access mode, many counter accesses will be performed through the `rdpmc` instruction. Can you test how it behaves when not using `rdpmc` (comment out the line at https://github.com/RRZE-HPC/likwid/blob/master/src/access_x86_msr.c#L147)?
Your hint that `perfmon_switchActiveGroup(gids[0])` should not do anything in my case helped me find the issue. I logged the `activeGroup` and whether a switch was performed or skipped.
It turns out that `perfmon_switchActiveGroup(gids[0])` actually does perform the switch after the first interval. But my own code is at fault for that, not LIKWID.
Let's assume that `gids = [0,1]`. After the first iteration of the while-loop the `activeGroup` is `1`. When `perfmon_startCounters` is called in the next iteration, the `activeGroup` is still `1`, and therefore gid `0` is never measured again after the first iteration. This also explains why the workaround with `perfmon_switchActiveGroup(gids[0])` actually works. And it also becomes clear why gid `0` reports values if there was a load on the system when starting the measurement: it keeps reporting the same values from the one initial measurement...
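The sequence described above can be replayed with a small toy model. This is only a sketch, not LIKWID code: the `Model` struct and all `model_*` functions are hypothetical stand-ins, and the loop shape (one setup of group 0 up front, a plain start at `i == 0`, a switch otherwise) is an assumption about the reporter's code, which is not shown in full in this thread. The model only counts how often each group gets (re)started.

```c
#include <stdio.h>

/* Toy model of the group state discussed in this thread.
 * NOT LIKWID code: every name below is hypothetical. */
#define MODEL_NUM_GROUPS 2

typedef struct {
    int active_group;             /* group the counters are set up for */
    int running;                  /* 1 while the counters are counting */
    int starts[MODEL_NUM_GROUPS]; /* how often each group was (re)started */
} Model;

void model_setup(Model *m, int gid) { m->active_group = gid; }

/* Starts whatever group is currently active; it takes no group id. */
void model_start(Model *m)
{
    m->running = 1;
    m->starts[m->active_group]++;
}

void model_stop(Model *m) { m->running = 0; }

/* Switch as described in the thread: early return for the active group,
 * otherwise (stop if running), setup(new), start(new). */
void model_switch(Model *m, int gid)
{
    if (gid == m->active_group)
        return;
    if (m->running)
        model_stop(m);
    model_setup(m, gid);
    model_start(m);
}

/* Assumed loop shape; workaround != 0 adds the extra switch to group 0. */
void model_run(Model *m, int iterations, int workaround)
{
    model_setup(m, 0);
    for (int it = 0; it < iterations; it++) {
        for (int i = 0; i < MODEL_NUM_GROUPS; i++) {
            if (i == 0) {
                model_start(m);          /* restarts the active group */
                if (workaround)
                    model_switch(m, 0);  /* no-op only while group 0 is active */
            } else {
                model_switch(m, i);
            }
        }
        model_stop(m);
    }
}

void model_print(const Model *m)
{
    for (int g = 0; g < MODEL_NUM_GROUPS; g++)
        printf("group %d started %d time(s)\n", g, m->starts[g]);
}
```

Calling `model_run` on a zero-initialized `Model` with three iterations and no workaround starts group 0 once and group 1 three times, which matches the stale values seen for the first group. With the workaround flag set, group 0 is started again in every iteration.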
I guess there are three solutions for this.
Either never call `perfmon_startCounters` and let `perfmon_switchActiveGroup` handle everything, which works but may not be the cleanest solution...
    while (1) {
        for (size_t i = 0; i < gids.size(); i++) {
            perfmon_switchActiveGroup(gids[i]);
            usleep(setTime);
        }
        perfmon_stopCounters();
    }
Or always perform a proper `perfmon_setupCounters(gids[0])`:
    while (1) {
        for (size_t i = 0; i < gids.size(); i++) {
            if (i == 0) {
                // re-program group 0 before starting it
                perfmon_setupCounters(gids[0]);
                perfmon_startCounters();
            } else {
                perfmon_switchActiveGroup(gids[i]);
            }
            usleep(setTime);
        }
        perfmon_stopCounters();
    }
Or as recommended by `examples/monitoring.c`:
    while (1) {
        for (size_t i = 0; i < gids.size(); i++) {
            // full setup/start/stop cycle for every group
            perfmon_setupCounters(gids[i]);
            perfmon_startCounters();
            usleep(setTime);
            perfmon_stopCounters();
        }
    }
Are there any benefits to calling `setup/start/stop` for each `gid` instead of using `perfmon_switchActiveGroup`?
Thank you for your help!
The `perfmon_switchActiveGroup` code itself is basically `(if running -> stop(current)), setup(new), start(new), current = new`, so in the end it does not matter whether you use `perfmon_switchActiveGroup` or any of the other approaches. IMHO the if-else in the second approach looks complicated; the third is the cleanest.
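To make that equivalence concrete, here is a minimal sketch that records the order of operations for the switch-only loop and for the setup/start/stop loop. It is a toy model under stated assumptions, not LIKWID code: `Tracer`, the `toy_*` functions and the two `pass_*` helpers are hypothetical, and the initial state (no active group, modeled as `-1`, counters not running) is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Toy tracer, NOT LIKWID code: all names here are hypothetical. */
typedef struct {
    int active_group;  /* -1 models "no group set up yet" (assumption) */
    int running;       /* 1 while the counters are counting */
    char trace[256];   /* space-separated log of operations */
} Tracer;

void trace_op(Tracer *t, const char *op, int gid)
{
    char buf[32];
    snprintf(buf, sizeof buf, "%s(%d) ", op, gid);
    strncat(t->trace, buf, sizeof t->trace - strlen(t->trace) - 1);
}

void toy_setup(Tracer *t, int gid)
{
    t->active_group = gid;
    trace_op(t, "setup", gid);
}

void toy_start(Tracer *t)
{
    t->running = 1;
    trace_op(t, "start", t->active_group);
}

void toy_stop(Tracer *t)
{
    t->running = 0;
    trace_op(t, "stop", t->active_group);
}

/* (if running -> stop(current)), setup(new), start(new), current = new,
 * with the early return for the already active group. */
void toy_switch(Tracer *t, int gid)
{
    if (gid == t->active_group)
        return;
    if (t->running)
        toy_stop(t);
    toy_setup(t, gid);
    toy_start(t);
}

/* First approach: switch only, one stop after the last group. */
void pass_with_switch(Tracer *t, int ngroups)
{
    for (int i = 0; i < ngroups; i++)
        toy_switch(t, i);
    toy_stop(t);
}

/* Third approach: explicit setup/start/stop for every group. */
void pass_with_setup_start_stop(Tracer *t, int ngroups)
{
    for (int i = 0; i < ngroups; i++) {
        toy_setup(t, i);
        toy_start(t);
        toy_stop(t);
    }
}
```

Starting both tracers from `{ .active_group = -1 }` and running one pass over two groups yields the same trace in this model. Note that the early return makes the switch-only variant depend on which group was active before the pass, which is exactly the kind of hidden state that caused the original confusion.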
Describe the bug
I use the C-API to continuously monitor multiple event groups and thus switch between them. I noticed that some of the values were completely off, and it seems to affect only the first event group, regardless of the group itself. But only if there was no initial load on the system at the start of measurement...
Tests were conducted on 2x Intel 6252 and 2x Intel 8360Y.
For my reproducer I am measuring two groups, `CLOCK` and `FLOPS_SP`. I will focus on the metric `Clock [MHz]`, which is present in both groups. However, the bug affects all metrics of the group. This bug will only occur when at least 2 groups are measured.
When there is no load on the server, LIKWID mainly reports `nan` values, which is understandable and not an issue (the values are for each thread, output shortened):
When applying a load (`FIRESTARTER`), I would expect that both groups now report proper values. Unfortunately, the first group will never report reasonable values while the second group now does:
This is regardless of how long the load was applied. The measurement interval is 5 seconds, so roughly 2.5 seconds for each group.
If I start a load before starting the measurements, everything is fine and all metrics of all groups report correct values. This only happens when there was no load at the start of measurement and only with at least one switch between groups!
To Reproduce
The C-API of the current `master` is used on AlmaLinux 9.2.
General code (simplified):
Workaround
There seems to be something wrong with `perfmon_startCounters` or `perfmon_switchActiveGroup` on idle machines. One workaround I found was to call `perfmon_switchActiveGroup` right after `perfmon_startCounters` but on the same event group.
With this workaround all reported values of all groups are correct, regardless of the node being idle upon measurement start.
I am not sure if my assumptions are correct, but at least the bug is reproducible on my setup. Please find attached the output of my reproducer (with verbosity 1 and 3).
likwid_output_verbosity1.txt likwid_output_verbosity3.txt