I'm experiencing an issue where Local and Remote Memory Bandwidth (LMB and RMB) metrics are not displayed in Prometheus, despite proper configuration and troubleshooting steps taken.
Steps and observations:
Based on a suggestion, I told PCM not to use the Linux RDT driver (resctrl) by running `export PCM_USE_RESCTRL=0`, and I unmounted resctrl so that PCM could access RDT directly. However, this did not resolve the issue.
I attempted to build a custom monitoring solution that collects only LMB and RMB from the output of `pqos-msr` and integrates this data with the PCM data in Prometheus. Unfortunately, I could not run the PCM server and pqos monitoring simultaneously: the error message indicates that monitoring on core 0 is already started.
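For illustration, here is a minimal sketch of what such a collector could look like. It assumes pqos is run with CSV output and that the header columns are named as in the sample below, which may differ between intel-cmt-cat versions. The metric names are hypothetical and are not the names PCM exports. It does not address the core 0 conflict itself.

```python
import csv
import io

# Hypothetical sample of pqos CSV output. The column names are an assumption
# and should be checked against the actual output of the installed version.
SAMPLE = """Time,Core,IPC,LLC Misses,LLC[KB],MBL[MB/s],MBR[MB/s]
2024-07-28 10:00:00,0,0.52,1200,512.0,150.3,12.1
2024-07-28 10:00:00,1,0.48,900,256.0,80.0,0.0
"""


def pqos_csv_to_prometheus(text):
    """Convert pqos-style CSV rows into Prometheus text exposition lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        core = row["Core"].strip()
        local_bw = float(row["MBL[MB/s]"])   # local memory bandwidth (LMB)
        remote_bw = float(row["MBR[MB/s]"])  # remote memory bandwidth (RMB)
        # Metric names are made up for this sketch.
        lines.append(f'pqos_local_memory_bandwidth_mbps{{core="{core}"}} {local_bw}')
        lines.append(f'pqos_remote_memory_bandwidth_mbps{{core="{core}"}} {remote_bw}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(pqos_csv_to_prometheus(SAMPLE), end="")
```

The resulting text could, for example, be written to a file picked up by node_exporter's textfile collector, so that Prometheus scrapes it alongside the PCM endpoint.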
I would appreciate any insights or potential solutions.
Thank you for your assistance.
Originally posted by @bellalzohir in https://github.com/intel/pcm/issues/761#issuecomment-2254615723