ovis-hpc / ldms

OVIS/LDMS High Performance Computing monitoring, analysis, and visualization project.
https://github.com/ovis-hpc/ovis-wiki/wiki

LDMS support for UNCORE counters #587

Open brandongc opened 3 years ago

brandongc commented 3 years ago

It would be great if the syspapi sampler supported collection of UNCORE counters.

Access to UNCORE counters on Haswell and KNL would allow memory bandwidth to be collected.

These counters are available via perf.

On KNL:

$ perf list
uncore memory:
  unc_e_rpq_inserts
       [mcdram bandwidth read (CPU traffic only) (MB/sec). Unit: uncore_edc_eclk]
  unc_e_wpq_inserts
       [mcdram bandwidth write (CPU traffic only) (MB/sec). Unit: uncore_edc_eclk]
  unc_m_cas_count.rd
       [ddr bandwidth read (CPU traffic only) (MB/sec). Unit: uncore_imc]
  unc_m_cas_count.wr
       [ddr bandwidth write (CPU traffic only) (MB/sec). Unit: uncore_imc]
$ perf stat -e unc_e_rpq_inserts,unc_e_wpq_inserts,unc_m_cas_count.rd,unc_m_cas_count.wr -- dd if=/dev/zero of=/dev/null count=100000
100000+0 records in
100000+0 records out
51200000 bytes (51 MB, 49 MiB) copied, 0.381818 s, 134 MB/s

 Performance counter stats for 'system wide':

             38.97 MiB  unc_e_rpq_inserts
              5.03 MiB  unc_e_wpq_inserts
             28.47 MiB  unc_m_cas_count.rd
              0.03 MiB  unc_m_cas_count.wr

       0.392743527 seconds time elapsed

On Haswell:

$ perf list
uncore memory:
  llc_misses.mem_read
       [read requests to memory controller. Derived from unc_m_cas_count.rd. Unit: uncore_imc]
  llc_misses.mem_write
       [write requests to memory controller. Derived from unc_m_cas_count.wr. Unit: uncore_imc]
  unc_m_clockticks
       [Memory controller clock ticks. Unit: uncore_imc]
  unc_m_power_channel_ppd
       [Cycles where DRAM ranks are in power down (CKE) mode. Unit: uncore_imc]
  unc_m_power_critical_throttle_cycles
       [Cycles all ranks are in critical thermal throttle. Unit: uncore_imc]
  unc_m_power_self_refresh
       [Cycles Memory is in self refresh power mode. Unit: uncore_imc]
  unc_m_pre_count.page_miss
       [Pre-charges due to page misses. Unit: uncore_imc]
  unc_m_pre_count.rd
       [Pre-charge for reads. Unit: uncore_imc]
  unc_m_pre_count.wr
       [Pre-charge for writes. Unit: uncore_imc]
$ perf stat -e llc_misses.mem_read,llc_misses.mem_write -- dd if=/dev/zero of=/dev/null count=100000
100000+0 records in
100000+0 records out
51200000 bytes (51 MB, 49 MiB) copied, 0.0988937 s, 518 MB/s

 Performance counter stats for 'system wide':

           1515968 Bytes llc_misses.mem_read
           1445312 Bytes llc_misses.mem_write

       0.100530858 seconds time elapsed
brandongc commented 3 years ago

@tom95858 From what I understand, you have already talked with Eric about this? I didn't see a public issue for the request, so apologies if one already exists somewhere.

tom95858 commented 3 years ago

@brandongc, I don't think I have. We would need to extend the namespace we use to map the event names used in the configuration. Right now they are limited to the ones supported by PAPI.

brandongc commented 3 years ago

The PAPI native equivalents for those perf events are:

KNL

knl_unc_imc[0-5]::UNC_M_CAS_COUNT:[RD|WR]
knl_unc_edc_eclk[0-7]::UNC_E_[RPQ|WPQ]_INSERTS

Haswell

hswep_unc_imc[0-7]::UNC_M_CAS_COUNT:[RD|WR]
tom95858 commented 3 years ago

Hi Brandon,

Then I am confused. The syspapi sampler uses this:

rc = PAPI_event_name_to_code((char*)papi_name, &papi_code);

to convert PAPI names to PFM event codes. Can you please give me specific information about what is not working? If there is a PAPI event name, then this should work.

Thanks, Tom
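For reference, a minimal sketch (not the syspapi code itself) showing that the name-to-code lookup above also resolves an uncore native event, and how the PAPI component it lands in can be inspected; the event name is one of the Haswell examples from this thread:

/* Minimal sketch, not the syspapi code: confirm that an uncore native
 * event name resolves with PAPI_event_name_to_code(), and report which
 * PAPI component it belongs to. */
#include <stdio.h>
#include <papi.h>

int main(void)
{
    const char *name = "hswep_unc_imc0::UNC_M_CAS_COUNT:RD";
    int code, rc;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) {
        fprintf(stderr, "PAPI_library_init failed\n");
        return 1;
    }
    rc = PAPI_event_name_to_code((char *)name, &code);
    if (rc != PAPI_OK) {
        fprintf(stderr, "name -> code failed: %s\n", PAPI_strerror(rc));
        return 1;
    }
    /* The component distinguishes core ("perf_event") from uncore
     * ("perf_event_uncore") events. */
    const PAPI_component_info_t *ci =
        PAPI_get_component_info(PAPI_get_event_component(code));
    printf("%s -> %#x, component: %s\n", name, code, ci ? ci->name : "?");
    return 0;
}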


baallan commented 3 years ago

@tom95858 Seems like all the logic around events in syspapi (after the name lookup) loops over cores with for (i = 0; i < NCPU; i++), so an uncore-related name should make things fall apart in some way, right?
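A rough illustration of that per-CPU pattern and why a socket-wide uncore event does not fit it (simplified, not the actual syspapi code; the perf_event_open details are only a sketch):

/* Rough illustration of the per-CPU pattern described above; not the
 * actual syspapi code. */
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static int perf_open(struct perf_event_attr *attr, int cpu)
{
    /* pid = -1, cpu = <cpu>: count system-wide on one logical CPU */
    return syscall(__NR_perf_event_open, attr, -1, cpu, -1, 0);
}

static void open_core_event(struct perf_event_attr *attr, int *fd, int ncpu)
{
    /* Core-PMU events: one fd per logical CPU, so a sample naturally
     * produces an NCPU-deep array of values. */
    for (int i = 0; i < ncpu; i++)
        fd[i] = perf_open(attr, i);
}

/* An uncore event (e.g. an IMC counter) is instead programmed once per
 * PMU unit, with the cpu argument naming a single CPU in the owning
 * package; opening it on every logical CPU is redundant or rejected,
 * and the result is one value per unit, not one per CPU. */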

brandongc commented 3 years ago

Sorry, I don't have any logs or errors at the moment from when we tried this (it was some time ago).

I think the issue is with the assumption that all events are per core.

tom95858 commented 3 years ago

Hi Brandon,

Ok, thanks for root causing this. I'll take a look.

Tom


tom95858 commented 3 years ago

There are at least three issues with this:

  1. syspapi checks the component and if it is not "perf_event", it will issue an error for the event
  2. syspapi assumes that all events are per-core, i.e. the data returned when the event is sampled is an NCPU array of values.
  3. syspapi creates an NCPU deep array in the metric set for each event

The 1st issue is easily fixable by changing the component check to accept both perf_event and perf_event_uncore. The 2nd issue requires checking the event component when it is sampled and interpreting the data appropriately. The 3rd issue requires changes to both the syspapi sampler and the syspapi_store.
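A rough sketch of what the relaxed component check for the 1st issue could look like (illustrative only, not the actual syspapi code; it assumes the PAPI component names perf_event and perf_event_uncore):

/* Sketch of the 1st fix only: accept events from either the core or
 * the uncore PAPI component instead of rejecting everything that is
 * not "perf_event". */
#include <string.h>
#include <papi.h>

static int component_ok(int papi_code)
{
    const PAPI_component_info_t *ci =
        PAPI_get_component_info(PAPI_get_event_component(papi_code));
    if (!ci)
        return 0;
    return strcmp(ci->name, "perf_event") == 0 ||
           strcmp(ci->name, "perf_event_uncore") == 0;
}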

Is uncore support required for the PAPI sampler as well?

Thanks, Tom


brandongc commented 3 years ago

It would be great to have this in the PAPI sampler as well, but initially we could proceed with just syspapi.

brandongc commented 3 years ago

Hi Tom, no pressure, but for our own planning, is there a time frame for this? Or is there something else we can provide to help?

Thanks, Brandon

tom95858 commented 3 years ago

Hi Brandon,

It's on my list. Pressure lets me know what's important to users, so no worries 👍


tom95858 commented 3 years ago

@brandongc could you please provide me with an example uncore counter you would like to handle with its associated PAPI name?

brandongc commented 3 years ago

@tom95858 The most immediately useful would be the memory controller events. In particular, on Haswell I would like to get UNC_M_CAS_COUNT for each memory controller in order to calculate the memory bandwidth:

cookbg@nid00388:~> papi_native_avail | grep UNC_M_CAS_COUNT
| hswep_unc_imc0::UNC_M_CAS_COUNT                                              |
| hswep_unc_imc1::UNC_M_CAS_COUNT                                              |
| hswep_unc_imc2::UNC_M_CAS_COUNT                                              |
| hswep_unc_imc3::UNC_M_CAS_COUNT                                              |
| hswep_unc_imc4::UNC_M_CAS_COUNT                                              |
| hswep_unc_imc5::UNC_M_CAS_COUNT                                              |
| hswep_unc_imc6::UNC_M_CAS_COUNT                                              |
| hswep_unc_imc7::UNC_M_CAS_COUNT                                              |
cookbg@nid00388:~> papi_native_avail -e hswep_unc_imc0::UNC_M_CAS_COUNT
Available native events and hardware information.
--------------------------------------------------------------------------------
PAPI version             : 5.7.0.2
Operating system         : Linux 4.12.14-150.17_5.0.93-cray_ari_c
Vendor string and code   : GenuineIntel (1, 0x1)
Model string and code    : Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz (63, 0x3f)
CPU revision             : 2.000000
CPUID                    : Family/Model/Stepping 6/63/2, 0x06/0x3f/0x02
CPU Max MHz              : 2301
CPU Min MHz              : 1200
Total cores              : 64
SMT threads per core     : 2
Cores per socket         : 16
Sockets                  : 2
Cores per NUMA region    : 32
NUMA regions             : 2
Running in a VM          : no
Number Hardware Counters : 11
Max Multiplex Counters   : 384
Fast counter read (rdpmc): no
--------------------------------------------------------------------------------

Event name:     hswep_unc_imc0::UNC_M_CAS_COUNT
Description:    DRAM RD_CAS and WR_CAS Commands.

Qualifiers:         Name -- Description
      Info:         :ALL -- Counts total number of DRAM CAS commands issued on this channel
      Info:          :RD -- Counts all DRAM reads on this channel, incl. underfills
      Info:      :RD_REG -- Counts number of DRAM read CAS commands issued on this channel, incl. regular read CAS and those with implicit precharge
      Info:   :RD_UNDERFILL -- Counts number of underfill reads issued by the memory controller
      Info:          :WR -- Counts number of DRAM write CAS commands on this channel
      Info:      :WR_RMM -- Counts Number of opportunistic DRAM write CAS commands issued on this channel
      Info:      :WR_WMM -- Counts number of DRAM write CAS commands issued on this channel while in Write-Major mode
      Info:      :RD_RMM -- Counts Number of opportunistic DRAM read CAS commands issued on this channel
      Info:      :RD_WMM -- Counts number of DRAM read CAS commands issued on this channel while in Write-Major mode
      Info:         :e=0 -- edge detect
      Info:         :i=0 -- invert
      Info:         :t=0 -- threshold in range [0-255]
      Info:    :period=0 -- sampling period
      Info:      :freq=0 -- sampling frequency (Hz)
      Info:      :excl=0 -- exclusive access
      Info:       :cpu=0 -- CPU to program
      Info:    :pinned=0 -- pin event to counters

On KNL, for the same analysis (also including the on-package MCDRAM) I would like to get:

knl_unc_imc[0-5]::UNC_M_CAS_COUNT:[RD|WR]
knl_unc_edc_eclk[0-7]::UNC_E_[RPQ|WPQ]_INSERTS
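For reference, the derived bandwidth metric would be computed roughly as follows, assuming the usual 64 bytes transferred per DRAM (or MCDRAM) CAS command (a sketch, not sampler code); delta_rd/delta_wr are hypothetical names for the per-channel counter deltas over one sampling interval:

/* Sketch of the derived metric: per-socket memory bandwidth from the
 * per-channel CAS counters, sampled at two points in time.
 * delta_rd[i] / delta_wr[i] are the changes in the :RD / :WR counter
 * on channel i over interval_sec. */
static double mem_bw_bytes_per_sec(const unsigned long long *delta_rd,
                                   const unsigned long long *delta_wr,
                                   int nchannels, double interval_sec)
{
    unsigned long long cas = 0;
    for (int i = 0; i < nchannels; i++)
        cas += delta_rd[i] + delta_wr[i];
    return 64.0 * (double)cas / interval_sec;   /* 64 bytes per CAS */
}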
brandongc commented 3 years ago

@tom95858 Is there anything more you need from my side to help with this?

tom95858 commented 3 years ago

Hi Brandon,

We just need time to get to it. We've been buried with other priorities.

Tom
