Open jscarretero opened 8 years ago
Greetings folks,
One needs to be very careful when using multiplexing in libraries such as IPM. In PAPI, multiplexing can be done in user space with signal-based alarms, or in kernel space using kernel timers. We default to the latter, as it's a built-in part of the perf events subsystem.
The only time multiplexing really makes sense is when the granularity of the measurements is on the order of one second or greater. In other words, if you're instrumenting long-running sections of code.
If one takes measurements in very small time quanta, one needs to be aware that, depending on the number of counters in use, some of them may not have been scheduled at all during that interval. I believe the default switching interval in perf events may be 100 Hz, but one would need to check the kernel to be sure.
Phil
Apologies for brevity and errors as this was sent from my mobile device.
On May 26, 2016, at 14:19, jscarretero notifications@github.com wrote:
Dear Admin,
I guess that PAPI event multiplexing is not a widely used IPM feature and that it might be experimental. I have been toying with it: I enabled it by uncommenting /* #define USE_PAPI_MULTIPLEXING */ and by setting MAXNUM_PAPI_EVENTS to 48, MAXNUM_PAPI_COUNTERS to 32, MAXSIZE_PAPI_EVTNAME to 45, and MAXSIZE_ENVKEY to 2048.
When I try to profile a simple test MPI matrix multiplication program (https://github.com/mperlet/matrix_multiplication) with IPM_HPM=UOPS_EXECUTED_PORT:PORT_0,PAPI_TOT_CYC,DTLB_LOAD_MISSES:MISS_CAUSES_A_WALK,AVX:ALL,MEM_UOPS_RETIRED:ALL_LOADS,UOPS_EXECUTED:CORE,PAGE-FAULTS:u=0,PAPI_L3_TCM (some randomly selected events), I was getting negative numbers for MEM_UOPS_RETIRED:ALL_LOADS and UOPS_EXECUTED:CORE.
I traced the problem to rv = PAPI_set_multiplex(papi_evtset[comp].evtset); inside ipm_papi_start in mod_papi.c. I then modified the code guarded by the USE_PAPI_MULTIPLEXING define (in the ipm_papi_start function) to look like this:
```c
#ifdef USE_PAPI_MULTIPLEXING
  rv = PAPI_assign_eventset_component(papi_evtset[comp].evtset, comp);
  if (rv != PAPI_OK) {
    IPMDBG("PAPI: [comp %d] Error calling assign_eventset_component\n", comp);
  }
  rv = PAPI_set_multiplex(papi_evtset[comp].evtset);
  if (rv != PAPI_OK) {
    IPMDBG("PAPI: [comp %d] Error calling set_multiplex\n", comp);
  }
#endif
```
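For context, the ordering this fix implies is that an event set must be bound to a component before PAPI_set_multiplex() will accept it. A standalone sketch of that sequence against the PAPI C API (not IPM's actual code; error checking elided, and the bare CPU component index 0 is an assumption here):

```c
#include <papi.h>

void start_multiplexed(void)
{
    int evtset = PAPI_NULL;

    PAPI_library_init(PAPI_VER_CURRENT);
    PAPI_multiplex_init();                      /* enable multiplex support */

    PAPI_create_eventset(&evtset);
    /* Bind the event set to a component first; on recent PAPI versions
     * PAPI_set_multiplex() can fail on an unbound event set, which is
     * exactly what the fix above works around. */
    PAPI_assign_eventset_component(evtset, 0);  /* 0 = CPU component */
    PAPI_set_multiplex(evtset);

    PAPI_add_named_event(evtset, "PAPI_TOT_CYC");
    PAPI_start(evtset);
}
```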
And it seems to be working and returning presumably "correct" values. Does it look good to you? Have you faced similar results previously?
Thank you very much for your help and for this great tool.
Javi Carretero
Hey Phil, I was actually profiling a run of several seconds and I think PAPI multiplexes by default every 0.1 seconds (100000 microseconds). By the way, I have seen that you guys at minimalmetrics are contributing to PAPIEx, great! Actually, right now I am taking a look at it, trying to install it :)
Hi there,
Believe it or not, I'm actually the original author of papiex and PAPI from many, many years ago. I was hoping that by now some smart graduate students would have long since rewritten everything, but alas that hasn't happened yet. :-)
One can in fact program the multiplexing interval inside of PAPI, and it will either use it internally (for user-space multiplexing) or pass it along to perf events.
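For instance, the interval can be programmed through PAPI_set_opt(). A hedged sketch, assuming the PAPI 5.x option names and the multiplex option fields as I read them from the headers (PAPI_DEF_MPX_NS sets the default interval for event sets created afterwards):

```c
#include <string.h>
#include <papi.h>

/* Set the default multiplex switching interval, in nanoseconds.
 * PAPI either honors this internally (user-space multiplexing)
 * or passes it on to perf events when the kernel does the switching. */
static void set_mpx_interval(long long ns)
{
    PAPI_option_t opt;
    memset(&opt, 0, sizeof(opt));
    opt.multiplex.ns = ns;
    PAPI_set_opt(PAPI_DEF_MPX_NS, &opt);
}
```

Call this after PAPI_library_init() and PAPI_multiplex_init(), before creating the event sets it should affect.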
I'm glad you're interested in papiex. Note that we have a private tree on Bitbucket with a number of fixes that have not been pushed up to GitHub. We are in the process of merging the two, so please send along any experiences you have.
To the IPM authors, sorry about the off-topic response and keep up the good work!
Apologies for brevity and errors as this was sent from my mobile device.
Hi Phil,
I echo Javi's support. Multiplexing is hard but likely a nut we need to crack.
For IPM generally we're interested in job level metrics from long runs on many nodes. There is an opportunity to leverage sampling both across time and potentially across cores.
-David