Open mjtrangoni opened 6 years ago
As I wrote in #873, an option/fix would be iterating over the /sys/devices/system/cpu/present
file. See,
grep . /sys/devices/system/cpu/{online,offline,possible,present}
/sys/devices/system/cpu/online:0-1,8-9,16-17,24-25,32-33,40-41,48-49,56-57,64-65,72-73,80-81,88-89,96-97,104-105,112-113,120-121,128-129,136-137,144-145,152-153
/sys/devices/system/cpu/offline:2-7,10-15,18-23,26-31,34-39,42-47,50-55,58-63,66-71,74-79,82-87,90-95,98-103,106-111,114-119,122-127,130-135,138-143,146-151,154-159
/sys/devices/system/cpu/possible:0-159
/sys/devices/system/cpu/present:0-159
Thanks for the information @mjtrangoni. I took the liberty to rename the issue a bit. There is no support for offline CPUs availble in this procfs library at the moment. Furthermore, the node_exporter doesn't currently use this procfs library for CPU metrics, but this is something we should change. Your /proc/stat
file will be of great help to write a test.
Hi @grobie,
I found this looking at node_exporter's collector/cpu_linux.go
, at line 199, when it calls the updateStat
function.
Iterating over the present
file will give you the total amount of present CPUs. At the /proc/stat
file you see only the online ones. See also this
Thanks for the clarification @mjtrangoni, my bad, I had missed that the node_exporter uses procfs by now.
In either case, I don't think that the procfs library does anything wrong here, it's meant to be a library to access procfs information from go. It would be the node_exporter's responsibility to combine the information from /sys/devices/system/cpu/present
and /proc/stat
. We should support parsing the /sys/devices/system/cpu/*
information in the sysfs package though.
What I think is wrong or not 100% right is that, in my case, all intermediate metrics of offline CPUs are initialized to 0, while the ones from cpu154 onwards are not. I have to double-check that!
And I really like the idea of exposing the /sys/devices/system/cpu/*
information, but I am not sure how much of them. I can make a PR exporting the /sys/devices/system/cpu/{online,offline,possible,present}
information as a first-implementation.
You're right, we're currently exposing an array with zero value CPUStat
types for offline CPUs. It was an oversight from my side (I wasn't thinking of offline CPUs). I don't believe this library should make up any information which are not actually exposed by the procfs. So we should change the returned data type. Either a map[uint]CPUStat
would work (but unordered in golang, not so great) or we add a Processor uint
attribute to the CPUStat.
Hi @rtreffer @SuperQ , This issue is related to #873.
When parsing the
/proc/stat
file, I am missing the latest offline CPUs bunch of metrics (from cpu154 to cpu159). As @brian-brazil said, the CPU metrics should always be there.See:
Summarizing, every CPU metrics of an offline CPU until cpu151 are zero, while the last bunch cpu{154..159} are missing completely .