performancecopilot / xsos-pcp

Performance Co-Pilot xsos-inspired script for fast sosreport diagnostics

pcp-xsos does not show the hugepages value correctly if a system is configured with both 2 MB and 1 GB hugepages. #8

Open rajucheerla opened 3 months ago

rajucheerla commented 3 months ago

Hello Team,

I have a system configured with both 2 MB and 1 GB hugepages.

# grep -H . /sys/devices/system/node/node*/hugepages/hugepages-*/nr_hugepages
/sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages:10      <<<--- 10 GB
/sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages:1024       <<<--- 2 GB
/sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages:20      <<<--- 20 GB
/sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages:1024       <<<--- 2 GB
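The per-node counts above can be added up to get the full pool size. The following is a minimal sketch, not part of pcp-xsos; the function name `hugepage_pool_kb` and the regular expression are illustrative assumptions, and the input is simply the `grep -H` output pasted as text.

```python
import re

# Matches one line of the `grep -H . .../nr_hugepages` output shown above.
# The page size (in kB) is taken from the directory name, the count from the value.
SYSFS_LINE = re.compile(
    r"node(?P<node>\d+)/hugepages/hugepages-(?P<size_kb>\d+)kB/nr_hugepages:(?P<count>\d+)"
)


def hugepage_pool_kb(grep_output):
    """Sum page_size_kB * nr_hugepages over every node and every page size."""
    total_kb = 0
    for line in grep_output.splitlines():
        match = SYSFS_LINE.search(line)
        if match:
            total_kb += int(match["size_kb"]) * int(match["count"])
    return total_kb


sample = """\
/sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages:10
/sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages:1024
/sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages:20
/sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages:1024
"""

pool_kb = hugepage_pool_kb(sample)
print(pool_kb, "kB =", pool_kb // (1024 * 1024), "GiB")  # 35651584 kB = 34 GiB
```

For the values in this report the sum is 35651584 kB, which is the same figure the kernel reports as Hugetlb in /proc/meminfo.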

# grep ^Huge /proc/meminfo
HugePages_Total:    2048        /* 4 GB */
HugePages_Free:     2048
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB     /* Default page size */
Hugetlb:        35651584 kB     /* 34 GB */

pcp-xsos shows an incorrect pool size.

./pcp-xsos -m

  HugePages:
    4 GiB pre-allocated to HugePages (6% of total ram)
    0 GiB of HugePages (0%) in-use by applications

This is because pcp-xsos uses the mem.util.hugepagesTotalBytes metric to calculate the value. HugePages_Total counts only pages of the default hugepage size, so on a system with multiple hugepage sizes it would be better to track Hugetlb, which reports the total pool size across all sizes.
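To make the gap concrete, here is a small sketch that compares the two figures using the /proc/meminfo values from this report. It is illustrative only and does not reflect how pcp-xsos is written; `parse_meminfo` is a hypothetical helper, and falling back to the default-size figure when Hugetlb is absent is an assumption.

```python
def parse_meminfo(text):
    """Return {field: integer value} for 'Name:  value [kB]' lines."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


sample = """\
HugePages_Total:    2048
HugePages_Free:     2048
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        35651584 kB
"""

info = parse_meminfo(sample)

# Pool of default-size pages only: this is what a HugePages_Total-based figure covers.
default_pool_kb = info["HugePages_Total"] * info["Hugepagesize"]

# Pool across all page sizes; assume the default-size figure if Hugetlb is missing.
all_sizes_pool_kb = info.get("Hugetlb", default_pool_kb)

print(default_pool_kb // (1024 * 1024), "GiB in default-size pages")  # 4 GiB
print(all_sizes_pool_kb // (1024 * 1024), "GiB in all hugepages")     # 34 GiB
```

The 30 GiB difference is exactly the 1 GB pages (10 on node0 plus 20 on node1) that the default-size figure leaves out.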

Further, I looked through the PCP metrics and could not find one that exposes the Hugetlb value, but I may be wrong. Is there any way to fetch the Hugetlb value?

Regards, Raju

natoscott commented 2 months ago

@rajucheerla the metrics you seek are below mem.hugepages in the PCP metric namespace. The other PCP metrics we're using come from /proc/meminfo, which only covers the default hugepage size AIUI. Probably the code should be converted to use the newer metrics if they exist, and fall back to the older metrics if not?
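One possible shape for that prefer-newer, fall-back-to-older lookup is sketched below. This is an assumption-laden illustration, not the pcp-xsos code: `fetch` stands in for whatever mechanism returns a metric value (or `None` when the metric is unavailable), and `mem.hugepages.EXAMPLE` is a placeholder, since the thread does not name the specific metric below mem.hugepages.

```python
def total_hugepage_bytes(fetch, preferred, legacy):
    """Return (metric_name, value) from the first metric that has a value.

    fetch     -- callable taking a metric name, returning a number or None
    preferred -- newer metric name to try first
    legacy    -- older metric name to fall back to
    """
    for name in (preferred, legacy):
        value = fetch(name)
        if value is not None:
            return name, value
    return None, None


# Placeholder name: the actual metric below mem.hugepages is not given in this thread.
NEWER = "mem.hugepages.EXAMPLE"
OLDER = "mem.util.hugepagesTotalBytes"

# A newer PCP exposing both metrics (values in bytes, taken from this report).
new_pcp = {NEWER: 35651584 * 1024, OLDER: 4194304 * 1024}
# An older PCP that only has the /proc/meminfo-derived metric.
old_pcp = {OLDER: 4194304 * 1024}

print(total_hugepage_bytes(new_pcp.get, NEWER, OLDER))
print(total_hugepage_bytes(old_pcp.get, NEWER, OLDER))
```

Returning the metric name alongside the value lets the caller report which source was used, which may matter when the fallback under-reports on systems with multiple hugepage sizes.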