intel / pcm

Intel® Performance Counter Monitor (Intel® PCM)
BSD 3-Clause "New" or "Revised" License

Local and Remote Memory Bandwidth (LMB and RMB) metrics are 0 #805

Closed by bellalzohir 3 months ago

bellalzohir commented 3 months ago
Hello,

I'm experiencing an issue where the Local and Remote Memory Bandwidth (LMB and RMB) metrics are not displayed in Prometheus, even though PCM is properly configured and I have tried the troubleshooting steps below.

Steps and observations:

  1. Based on a suggestion, I set export PCM_USE_RESCTRL=0 so that PCM does not go through the Linux resctrl (RDT) driver, and I unmounted resctrl to allow PCM direct access to RDT. This did not resolve the issue.
  2. I attempted to build a custom monitoring solution that collects only LMB and RMB from the output of pqos-msr and integrates this data with the PCM data in Prometheus. Unfortunately, I could not run the PCM server and pqos monitoring simultaneously: the error message indicates that monitoring on core 0 is already started.
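
For reference, the custom collector from step 2 could look roughly like the sketch below. It is only an illustration under stated assumptions: it assumes the pqos text output has a header row with CORE, MBL[MB/s] and MBR[MB/s] columns, and the metric names and helper functions are hypothetical, not part of PCM or pqos. It does not address the "monitoring on core 0 is already started" conflict.

```python
# Sketch: turn one sample of pqos text output into Prometheus text format.
# Assumption: the table has a header row starting with CORE and containing
# MBL[MB/s] (local) and MBR[MB/s] (remote) columns.
# The metric names below are hypothetical, not PCM's or pqos's own naming.
METRICS = {
    "MBL[MB/s]": "pqos_local_memory_bandwidth_mbps",
    "MBR[MB/s]": "pqos_remote_memory_bandwidth_mbps",
}


def parse_pqos_table(text):
    """Return {core: {column: value}} for the MBL/MBR columns of one sample."""
    header = None
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "CORE":
            header = fields
            continue
        # Skip TIME lines and anything that does not match the header width.
        if header is None or len(fields) != len(header):
            continue
        row = dict(zip(header, fields))
        try:
            rows[row["CORE"]] = {c: float(row[c]) for c in METRICS if c in row}
        except ValueError:
            continue  # non-numeric bandwidth field, ignore this row
    return rows


def to_prometheus(rows):
    """Render the parsed rows as Prometheus text exposition lines."""
    lines = []
    for column, name in METRICS.items():
        lines.append(f"# TYPE {name} gauge")
        for core, values in rows.items():
            if column in values:
                lines.append(f'{name}{{core="{core}"}} {values[column]}')
    return "\n".join(lines) + "\n"


# Made-up sample values, for illustration only.
SAMPLE = """TIME 2024-01-01 00:00:00
    CORE   IPC   MISSES     LLC[KB]   MBL[MB/s]   MBR[MB/s]
       0   0.85      12k       512.0      1024.5        12.0
       1   1.10       3k       256.0       300.0         0.0
"""

if __name__ == "__main__":
    print(to_prometheus(parse_pqos_table(SAMPLE)), end="")
```

The rendered text could then be served or written to a file by whatever exporter mechanism is already in use; that part is left out here.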

I would appreciate any insights or potential solutions.

Thank you for your assistance.

Originally posted by @bellalzohir in https://github.com/intel/pcm/issues/761#issuecomment-2254615723

rdementi commented 3 months ago

Please see my responses in the original issue.