Open wabouhamad opened 6 years ago
Actually, the csv data also show a (different) large value: line 582, column 3144: 1844674407370955264.00 for pid 54288, and line 588, column 3898: 1844674407370955264.00 for pid 67070.
And I see that last value in the pidstat stdout file:
629584: 1522193470 0 54288 0.00 1844674407370955264.00 0.00 1844674407370955264.00 0 1844674407370955264.00 0.00 114380 3552 0.02 0.00 0.00 0.00 0 1844674407370955264.00 1844674407370955264.00 /opt/cni/bin/openshift-sdn
635438: 1522193530 0 67070 0.00 1844674407370955264.00 0.00 1844674407370955264.00 1 1844674407370955264.00 0.00 188368 3560 0.02 0.00 0.00 0.00 0 1844674407370955264.00 1844674407370955264.00 /opt/cni/bin/openshift-sdn
which likely makes it a pidstat bug (@sjug wins the prize!)
What do I win?
@sjug: you get to fix pidstat.
I'm on it! Time to brush up on my C.
So the issue is that pidstat is coming up with these massive values? Why are they also different from the values in the resulting graphical data?
Yes, that is the issue. Not sure what you mean in the second question?
There is something systemically wrong; we've seen this in tools like iostat, too. It could even be a kernel issue.
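One observation on the reported number, offered as a hypothesis rather than a diagnosis: 1844674407370955264.00 is exactly what a double prints for 2^64 / 10. That is consistent with an unsigned 64-bit counter delta wrapping around when a later sample is smaller than the earlier one, and the result then being scaled. A minimal sketch in Python follows; the function names and the divisor of 10 are hypothetical and are not taken from pidstat's source.

```python
# Hypothetical reproduction: names and the divisor are illustrative assumptions,
# not code or constants taken from pidstat.
U64_MODULUS = 2 ** 64


def wrapped_delta(prev: int, curr: int) -> int:
    """Difference of two counters as unsigned 64-bit arithmetic would compute it."""
    return (curr - prev) % U64_MODULUS


def scaled(prev: int, curr: int, divisor: float = 10.0) -> float:
    """Convert the wrapped delta to floating point and scale it (divisor is assumed)."""
    return float(wrapped_delta(prev, curr)) / divisor


# A counter that steps backward by one tick between two samples.
value = scaled(prev=1000, curr=999)
print(f"{value:.2f}")  # prints 1844674407370955264.00
```

If this is what is happening, it would not by itself say whether the counter went backward in the kernel or was misread by the tool.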
While running pbench-ansible from a pbench-controller host on an OCP 3.9 cluster, pidstat cpu_usage on one of the compute nodes "svt_node_4:ip-172-31-55-116" shows two processes /opt/cni/bin/openshift-sdn with a huge value for average: 2,937,379,629,571,585.00.
The csv data show several processes for /opt/cni/bin/openshift-sdn, and they all seem to have zero values.
This appears to be a pidstat postprocessing issue.
The data are on the pbench server in the following directory: