Open brandongc opened 3 years ago
@tom95858 From what I understand you already talked with Eric about this? I didn't see a public issue with the request so apologies if there is already one somewhere.
@brandongc, I don't think I have. We would need to extend the namespace that we use to map the event names used to configure. Right now they are limited to the ones supported by PAPI.
The papi native equivalents for those perf events are:
KNL
knl_unc_imc[0-5]::UNC_M_CAS_COUNT:[RD|WR]
knl_unc_edc_eclk[0-7]::UNC_E_[RPQ|WPQ]_INSERTS
Haswell
hswep_unc_imc[0-7]::UNC_M_CAS_COUNT:[RD|WR]
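The bracketed patterns above are shorthand for one event per PMU instance and qualifier. As a minimal sketch, assuming a sampler needs the concrete names spelled out one by one, the expansion could look like the following. The helper names `uncore_event_name` and `expand_uncore_events` and the `NAME_LEN` buffer size are hypothetical and are not part of syspapi or PAPI.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 128 /* hypothetical buffer size for one event name */

/* Build "<pmu><idx>::<event>:<qual>" into out.
 * Returns 0 on success, -1 if the name did not fit. */
static int uncore_event_name(char *out, size_t len, const char *pmu, int idx,
                             const char *event, const char *qual)
{
    assert(out != NULL && len > 0);
    int n = snprintf(out, len, "%s%d::%s:%s", pmu, idx, event, qual);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Fill names[] with every PMU-instance/qualifier combination.
 * Returns the number of names written, or -1 on overflow. */
static int expand_uncore_events(char names[][NAME_LEN], int max_names,
                                const char *pmu, int first, int last,
                                const char *event,
                                const char *const quals[], int nquals)
{
    int count = 0;
    for (int idx = first; idx <= last; idx++) {
        for (int q = 0; q < nquals; q++) {
            if (count >= max_names)
                return -1;
            if (uncore_event_name(names[count], NAME_LEN, pmu, idx,
                                  event, quals[q]) != 0)
                return -1;
            count++;
        }
    }
    return count;
}
```

Called with `"hswep_unc_imc"`, the range 0 to 7, `"UNC_M_CAS_COUNT"` and the qualifiers `RD` and `WR`, this produces 16 names for the Haswell pattern above.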
Hi Brandon,
Then I am confused. The syspapi sampler uses this:
rc = PAPI_event_name_to_code((char*)papi_name, &papi_code);
to convert PAPI names to PFM event codes. Can you please give me specific information about what is not working? If there is a PAPI event name, then this should work.
Thanks, Tom
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ovis-hpc/ovis/issues/587#issuecomment-758843736, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABVTPXH2IHVKICS7LJWMZETSZSGSLANCNFSM4V25OEZQ .
-- Thomas Tucker, President, Open Grid Computing, Inc.
@tom95858 It seems like all the logic around events in syspapi (after the name lookup) is looped over cores, for (i = 0; i < NCPU; i++), so an uncore-related name should make things fall apart in some way, right?
Sorry, I don't have any logs or errors from when we tried this at the moment (it was some time ago).
I think the issue is with the assumption that all events are per core.
Hi Brandon,
Ok, thanks for root causing this. I'll take a look.
Tom
There are at least three issues with this:
The 1st issue is easily fixable by simply changing the component check to both perf_event and perf_event_uncore.
The 2nd issue requires checking the event component when it is sampled and interpreting the data appropriately.
The 3rd issue requires changes to both the syspapi sampler as well as the syspapi_store.
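To illustrate the first two points, here is a minimal sketch of a component check that accepts both names and tells them apart. It assumes the component name is already available as a string (in PAPI this would presumably come from the component info for the event's component index). The function names are hypothetical and this is not the actual syspapi code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Component names taken from the discussion above. */
static const char *const supported_components[] = {
    "perf_event",
    "perf_event_uncore",
};

/* True if events from this component should be accepted at all. */
static bool component_is_supported(const char *name)
{
    assert(name != NULL);
    size_t n = sizeof(supported_components) / sizeof(supported_components[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, supported_components[i]) == 0)
            return true;
    }
    return false;
}

/* True if the event must not be treated as a per-core counter. */
static bool component_is_uncore(const char *name)
{
    assert(name != NULL);
    return strcmp(name, "perf_event_uncore") == 0;
}
```

A sampler could call `component_is_uncore` at sample time to decide whether a value belongs to a core or to an uncore unit before storing it.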
Is uncore support required for the PAPI sampler as well?
Thanks, Tom
It would be great to have this in the PAPI sampler as well, but initially we could proceed with just syspapi.
Hi Tom, No pressure, but for our own planning is there any time frame for this? Or something else we can provide to help?
Thanks, Brandon
Hi Brandon,
It's on my list. Pressure lets me know what's important to users so no worries 👍
@brandongc could you please provide me with an example uncore counter you would like to handle with its associated PAPI name?
@tom95858 The most immediately useful would be the memory controller events. In particular on a Haswell I would like to get the UNC_M_CAS_COUNT for each memory controller in order to calculate the memory bandwidth:
cookbg@nid00388:~> papi_native_avail | grep UNC_M_CAS_COUNT
| hswep_unc_imc0::UNC_M_CAS_COUNT |
| hswep_unc_imc1::UNC_M_CAS_COUNT |
| hswep_unc_imc2::UNC_M_CAS_COUNT |
| hswep_unc_imc3::UNC_M_CAS_COUNT |
| hswep_unc_imc4::UNC_M_CAS_COUNT |
| hswep_unc_imc5::UNC_M_CAS_COUNT |
| hswep_unc_imc6::UNC_M_CAS_COUNT |
| hswep_unc_imc7::UNC_M_CAS_COUNT |
cookbg@nid00388:~> papi_native_avail -e hswep_unc_imc0::UNC_M_CAS_COUNT
Available native events and hardware information.
--------------------------------------------------------------------------------
PAPI version : 5.7.0.2
Operating system : Linux 4.12.14-150.17_5.0.93-cray_ari_c
Vendor string and code : GenuineIntel (1, 0x1)
Model string and code : Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz (63, 0x3f)
CPU revision : 2.000000
CPUID : Family/Model/Stepping 6/63/2, 0x06/0x3f/0x02
CPU Max MHz : 2301
CPU Min MHz : 1200
Total cores : 64
SMT threads per core : 2
Cores per socket : 16
Sockets : 2
Cores per NUMA region : 32
NUMA regions : 2
Running in a VM : no
Number Hardware Counters : 11
Max Multiplex Counters : 384
Fast counter read (rdpmc): no
--------------------------------------------------------------------------------
Event name: hswep_unc_imc0::UNC_M_CAS_COUNT
Description: DRAM RD_CAS and WR_CAS Commands.
Qualifiers: Name -- Description
Info: :ALL -- Counts total number of DRAM CAS commands issued on this channel
Info: :RD -- Counts all DRAM reads on this channel, incl. underfills
Info: :RD_REG -- Counts number of DRAM read CAS commands issued on this channel, incl. regular read CAS and those with implicit precharge
Info: :RD_UNDERFILL -- Counts number of underfill reads issued by the memory controller
Info: :WR -- Counts number of DRAM write CAS commands on this channel
Info: :WR_RMM -- Counts Number of opportunistic DRAM write CAS commands issued on this channel
Info: :WR_WMM -- Counts number of DRAM write CAS commands issued on this channel while in Write-Major mode
Info: :RD_RMM -- Counts Number of opportunistic DRAM read CAS commands issued on this channel
Info: :RD_WMM -- Counts number of DRAM read CAS commands issued on this channel while in Write-Major mode
Info: :e=0 -- edge detect
Info: :i=0 -- invert
Info: :t=0 -- threshold in range [0-255]
Info: :period=0 -- sampling period
Info: :freq=0 -- sampling frequency (Hz)
Info: :excl=0 -- exclusive access
Info: :cpu=0 -- CPU to program
Info: :pinned=0 -- pin event to counters
On KNL, for the same analysis (also including the on-package MCDRAM), I would like to get:
knl_unc_imc[0-5]::UNC_M_CAS_COUNT:[RD|WR]
knl_unc_edc_eclk[0-7]::UNC_E_[RPQ|WPQ]_INSERTS
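For reference, the bandwidth calculation from the DDR CAS counts can be sketched as below. It assumes each CAS command transfers one 64-byte cache line and that the RD and WR counts of every memory controller have been read into flat arrays at two sample times. The function name, array layout and `CAS_BYTES` constant are illustrative assumptions, not part of syspapi.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAS_BYTES 64 /* assumption: one CAS moves one 64-byte cache line */

/* prev[] and curr[] hold the raw UNC_M_CAS_COUNT:RD and :WR values of all
 * memory controllers at two consecutive samples (n entries each).
 * Returns the combined read+write bandwidth in bytes per second. */
static double mem_bandwidth_bytes_per_sec(const uint64_t *prev,
                                          const uint64_t *curr,
                                          size_t n, double interval_sec)
{
    assert(prev != NULL && curr != NULL);
    assert(interval_sec > 0.0);
    uint64_t cas = 0;
    for (size_t i = 0; i < n; i++)
        cas += curr[i] - prev[i]; /* delta since the previous sample */
    return (double)cas * CAS_BYTES / interval_sec;
}
```

Passing only the RD entries or only the WR entries gives the read or write bandwidth separately.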
@tom95858 Is there anything more you need from my side to help with this?
Hi Brandon,
We just need time to get to it. We've been buried with other priorities.
Tom
It would be great if the syspapi sampler supported collection of UNCORE counters.
Access to UNCORE counters on Haswell and KNL will allow for collection of memory bandwidth.
These counters are available via perf.
On KNL:
On Haswell: