Closed kadircs closed 5 months ago
The fixed counters on AMD sometimes do not work. This is not limited to Cray EX systems; it is a more general issue. I have not found a way to check whether the fixed counters are usable, and I do not recommend using them because, in my experience, they are not really accurate.
The remaining counters, i.e. the general-purpose counters (PMC*), work and should give you the information you want.
I just want to get the rooflines. Is there a tutorial to obtain the rooflines using the PMC* counters?
I am trying to get my application's L1, L2, and LLC data traffic. The tutorial you shared showcases only DRAM traffic.
Measuring L1 data traffic is difficult or impossible: there are no suitable events on almost all platforms. L2 and LLC traffic should work.
The tutorial showcases only DRAM traffic, that's right, but the other levels are comparable. You need the bandwidth_limit_of_level_X for the roofline and the operational intensity (FP_rate / measured_bandwidth_for_level_X) for the application dot. You can derive the bandwidth limit from the data sheet or measure it with likwid-bench. For private caches, you should use -W N:<numThreads * half_size_of_level_X_for_single_thread>:<numThreads>.
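To make the recipe above concrete, here is a minimal Python sketch of how the application dot and the roof for one memory level could be computed. The function names, the `peak_flop_rate` argument (the usual horizontal compute roof, not mentioned in the reply), the example numbers, and the `kB` size suffix in the workgroup string are assumptions for illustration, not LIKWID output.

```python
def roofline_point(flop_rate, measured_bandwidth, bandwidth_limit, peak_flop_rate):
    """Return (operational_intensity, attainable_performance) for one memory level.

    flop_rate          -- measured FP rate of the application (e.g. GFLOP/s)
    measured_bandwidth -- measured bandwidth of the application at level X (e.g. GB/s)
    bandwidth_limit    -- bandwidth limit of level X from data sheet or likwid-bench
    peak_flop_rate     -- assumed compute peak of the machine (hypothetical input)
    """
    if measured_bandwidth <= 0:
        raise ValueError("measured_bandwidth must be positive")
    # Operational intensity = FP_rate / measured_bandwidth_for_level_X
    intensity = flop_rate / measured_bandwidth
    # The roof at this intensity is limited by bandwidth or by compute peak
    attainable = min(peak_flop_rate, intensity * bandwidth_limit)
    return intensity, attainable


def private_cache_workgroup(num_threads, level_size_kb_per_thread, domain="N"):
    """Build a workgroup string using half of the per-thread cache size.

    The kB suffix is an assumption about the accepted size format.
    """
    total_kb = num_threads * (level_size_kb_per_thread // 2)
    return f"{domain}:{total_kb}kB:{num_threads}"


if __name__ == "__main__":
    # Made-up numbers: 20 GFLOP/s, 80 GB/s measured, 400 GB/s limit, 500 GFLOP/s peak
    print(roofline_point(20.0, 80.0, 400.0, 500.0))
    print(private_cache_workgroup(4, 512))
```

The dot is plotted at (operational intensity, measured FP rate); the second return value is the height of the roof at that intensity, so the gap between the two shows how far the application is from the limit of that level.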
@kadircs, how did you manage to install it with ACCESSMODE=accessdaemon without root privileges on that HPC machine?
I am getting this error when I run make install:
$ make install
===> INSTALL access daemon to /likwid_folder/likwid_install_dir/sbin/likwid-accessD
install: cannot change ownership of '/likwid_folder/likwid_install_dir/sbin/likwid-accessD': Operation not permitted
make: *** [Makefile:394: install_daemon] Error 1
Thank you a lot!
PS : @TomTheBear, if you have any comments on this, I would greatly appreciate them.
@iustinouatu you need to specify an installation directory that you have access to.
Hi! Sorry to be reviving this! I am having the same issue, so asking just to clarify. I do not have root privileges anywhere in the HPC cluster.
I understand that @kadircs refers to a folder where you have root rights?
Please correct me if I am wrong.
Best, George
As far as I know, @kadircs got an administrator to install with ACCESSMODE=accessdaemon on some selected nodes. He reported back through other channels, which led to the fixes in https://github.com/RRZE-HPC/likwid/pull/618
There are only two ways to get memory traffic:
- amd_df unit exists -> LIKWID with ACCESSMODE=perf_event (installation with user permissions)
- amd_df unit does not exist -> LIKWID with ACCESSMODE=accessdaemon (installation only with root permissions) OR a more recent kernel version

Of course, there are more complicated setups, but all require interaction with the sysadmins.
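As a rough illustration of the two cases, the following Python sketch checks whether the kernel exposes an amd_df unit and suggests the matching access mode. It assumes the unit shows up as a directory under `/sys/bus/event_source/devices`, which is where Linux perf lists its PMUs; the function name and the `sysfs_root` parameter are hypothetical.

```python
import os


def suggest_access_mode(sysfs_root="/sys/bus/event_source/devices"):
    """Suggest a LIKWID ACCESSMODE based on the presence of the amd_df unit.

    Assumption: the unit is visible as <sysfs_root>/amd_df on the compute node.
    """
    has_amd_df = os.path.isdir(os.path.join(sysfs_root, "amd_df"))
    if has_amd_df:
        # User-level installation is enough in this case
        return "perf_event"
    # Otherwise a root installation (or a more recent kernel) is needed
    return "accessdaemon"


if __name__ == "__main__":
    print("Suggested ACCESSMODE:", suggest_access_mode())
```

Run this on a compute node rather than a login node, since the two may run different kernels or hardware.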
I see then, thanks.
Just to be clear. I am not focusing specifically on AMD hardware. So, my understanding is that some sort of admin rights on a node is definitely needed, right?
No, not in all cases. If you choose ACCESSMODE=perf_event, you can install as a user. Then it depends on how restrictively your system is configured (/proc/sys/kernel/perf_event_paranoid; lower -> more possibilities). For the Roofline Model, you need memory traffic measurements, which require 0 or -1 (not recommended by me).
Some computing centers provide special job submission options that allow measurements by reducing the paranoid value. I documented here how we do it in our center. I know other centers have something similar.
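To check the setting described above from inside a job, a small Python sketch like the following could be used. It only reads `/proc/sys/kernel/perf_event_paranoid` and applies the threshold stated in the reply (0 or -1 for memory traffic); the function names are made up for illustration.

```python
from pathlib import Path

PARANOID_FILE = Path("/proc/sys/kernel/perf_event_paranoid")


def allows_memory_traffic(paranoid_level):
    """Memory traffic measurements need a paranoid level of 0 or -1."""
    return paranoid_level <= 0


def read_paranoid_level(path=PARANOID_FILE):
    """Read the current paranoid level as an integer."""
    return int(path.read_text().strip())


if __name__ == "__main__":
    if PARANOID_FILE.exists():
        level = read_paranoid_level()
        verdict = "possible" if allows_memory_traffic(level) else "not possible"
        print(f"perf_event_paranoid = {level}: memory traffic measurement {verdict}")
    else:
        print("perf_event_paranoid not found (not a Linux system?)")
```

If the value is too high on a normal job, that is the point at which to look for a special job submission option at your center.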
Many thanks for this helpful response! I will have a look at our computing center as soon as possible! Thanks again!
I am trying to get a roofline model using LIKWID on a Cray EX system. I tried a user-space installation without root privileges in the PrgEnv-cray environment. I tried both of the following methods:
I am getting the error "Setup of event ACTUAL_CPU_CLOCK on CPU 0 failed: Permission denied", as seen below:
Would you please help?