giampaolo / psutil

Cross-platform lib for process and system monitoring in Python
BSD 3-Clause "New" or "Revised" License
10.25k stars 1.38k forks source link

[Ubuntu 16.04.6 LTS] cpu_times_percent wrong idle values when percpu=True #1586

Open cyril23 opened 5 years ago

cyril23 commented 5 years ago

Platform

Bug description

psutil.cpu_times_percent(interval=1, percpu=True) retrieves the wrong idle value on my machine

Examples

Example 1 (BUG):

import psutil
psutil.cpu_times_percent(interval=1, percpu=True)

Results - Example 1

[scputimes(user=2.0, nice=0.0, system=1.0, idle=64.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=1.0, nice=0.0, system=1.0, idle=64.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=9.0, nice=0.0, system=0.0, idle=56.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=67.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=3.0, nice=0.0, system=2.0, idle=62.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=66.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

Example 2 (BUG):

import psutil
psutil.cpu_times_percent(interval=0.5, percpu=True)

Results - Example 2

[scputimes(user=1.0, nice=0.0, system=0.0, idle=33.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=34.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=33.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=33.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=33.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=33.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

Example 3 (BUG):

import psutil
psutil.cpu_times_percent(interval=0.25, percpu=True)

Results - Example 3

[scputimes(user=0.0, nice=0.0, system=0.0, idle=17.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=17.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=17.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=17.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=17.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=17.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

Results - Example 4

[scputimes(user=0.0, nice=0.0, system=0.0, idle=8.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=8.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=8.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=8.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=8.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=8.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

Example 5 (BUG):

import psutil
psutil.cpu_times_percent(interval=1.1, percpu=True)

Results - Example 5

[scputimes(user=5.0, nice=0.0, system=1.0, idle=69.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=73.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=73.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=73.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=73.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=73.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

Results - Example 6

[scputimes(user=0.0, nice=0.0, system=0.0, idle=100.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=100.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=100.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.8, nice=0.0, system=0.0, idle=99.2, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=100.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0), 
 scputimes(user=0.0, nice=0.0, system=0.0, idle=100.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

Other Examples and observations:

intervals

giampaolo commented 5 years ago

Why do you think it's the wrong idle value?

cyril23 commented 5 years ago

Please read my description again.

giampaolo commented 5 years ago

Ah, you're right, I see the discrepancy now. It seems your namedtuple does not return guest and guest_nice metrics for some reason. Do you have a virtual machine running? (virtualbox, vmware, ...) Can you paste the output of cat /proc/stat?

cyril23 commented 5 years ago

Yes, it is a VM. I don't know what's the hypervisor though. I can ask for it.

cat /proc/stat
cpu  2011087 1368 1149652 328633032 151302 0 0 0
cpu0 699181 602 281740 54261103 68903 0 0 0
cpu1 382053 102 196442 54723566 20582 0 0 0
cpu2 277955 63 198209 54836257 14951 0 0 0
cpu3 261998 212 180058 54865613 18765 0 0 0
cpu4 207652 9 153617 54954250 13111 0 0 0
cpu5 182245 377 139583 54992241 14986 0 0 0
intr 0
swap 0 0

ctxt 332233131
btime 1567824132
processes 2370944
procs_running 2
procs_blocked 0
cyril23 commented 5 years ago

Some information I found on https://www.ostechnix.com/check-linux-system-physical-virtual-machine/

sudo lshw -class system
vserver
    description: Computer
    width: 64 bits
    capabilities: vsyscall64 vsyscall32
systemd-detect-virt
openvz
sudo virt-what
openvz
giampaolo commented 5 years ago

OK, as I thought /proc/stat is missing 2 extra columns at the end: guest and guest_nice. Those 2 represent the CPU time spent running "guest" / virtualized systems or containers like openvz.

As per this code section: https://github.com/giampaolo/psutil/blob/c10df5aa04e1ced58d19501fa42f08c1b909b83d/psutil/_pslinux.py#L281-L286 ...they should be available only on Linux > 2.6.24 and Linux > 3.2.0 respectively. What Linux kernel are you on?

As per this comment: https://github.com/giampaolo/psutil/blob/c10df5aa04e1ced58d19501fa42f08c1b909b83d/psutil/__init__.py#L1756-L1765 ...guest and guest_nice are accounted also in user and nice times. But since they are not available and the idle time is off, my best guess is that the kernel does not keep track of them at all, including in user and nice counters, which are lower than they should be (and that's why idle is off, it's missing some "busy time"). In summary, to me it appears like a kernel bug, and you'll probably see the same discrepancy by using top.

cyril23 commented 5 years ago

.they should be available only on Linux > 2.6.24 and Linux > 3.2.0 respectively. What Linux kernel are you on?

Kernel:

uname -r
   2.6.32-042stab140.1

uname -a
   Linux vserver 2.6.32-042stab140.1 #1 SMP Thu Aug 15 13:32:22 MSK 2019 x86_64 x86_64 x86_64 GNU/Linux

OS:

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.6 LTS
Release:        16.04
Codename:       xenial

Using top: top

Looks like top works, so I don't understand why psutil.cpu_times_percent can't get it right.

giampaolo commented 5 years ago

Looks like top works

Can you run a CPU intensive task while you do top? Something like this:

while 1:
    pass
cyril23 commented 5 years ago

while 1: pass didn't seem to work. It just showed me a command line, but doesn't show any load on the CPU.

Executing this however gave me a big CPU utilization: openssl speed -multi $(grep -ci processor /proc/cpuinfo)

last

giampaolo commented 5 years ago

OK, while you run that (CPU instensive task) also run this:

import psutil
from pprint import pprint as pp
for x in range(5):
    pp(psutil.cpu_times(percpu=True))
    print()
    pp(psutil.cpu_times_percent(interval=1, percpu=True))
    print("\n-----\n")

What does it print?

cyril23 commented 5 years ago

It printed this:

python
Python 2.7.12 (default, Aug 22 2019, 16:36:40)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import psutil
>>> from pprint import pprint as pp
>>> for x in range(5):
...   pp(psutil.cpu_times(percpu=True))
...   print()
...   pp(psutil.cpu_times_percent(interval=1, percpu=True))
...   print("\n-----\n")
...
[scputimes(user=7097.23, nice=6.02, system=2848.79, idle=550372.05, iowait=689.2, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=3916.15, nice=1.02, system=1988.13, idle=555015.14, iowait=205.84, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2856.91, nice=0.63, system=2001.38, idle=556165.27, iowait=149.54, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2706.13, nice=2.12, system=1819.65, idle=556450.01, iowait=187.65, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2148.57, nice=0.09, system=1556.22, idle=557349.86, iowait=131.11, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=1892.58, nice=3.77, system=1410.94, idle=557736.44, iowait=149.97, irq=0.0, softirq=0.0, steal=0.0)]
()
[scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=66.0, nice=0.0, system=0.0, idle=1.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

-----

[scputimes(user=7097.9, nice=6.02, system=2848.79, idle=550372.05, iowait=689.2, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=3916.82, nice=1.02, system=1988.13, idle=555015.14, iowait=205.84, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2857.58, nice=0.63, system=2001.38, idle=556165.27, iowait=149.54, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2706.8, nice=2.12, system=1819.65, idle=556450.01, iowait=187.65, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2149.23, nice=0.09, system=1556.22, idle=557349.87, iowait=131.11, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=1893.25, nice=3.77, system=1410.94, idle=557736.44, iowait=149.97, irq=0.0, softirq=0.0, steal=0.0)]
()
[scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=68.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=1.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

-----

[scputimes(user=7098.57, nice=6.02, system=2848.79, idle=550372.05, iowait=689.2, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=3917.5, nice=1.02, system=1988.13, idle=555015.14, iowait=205.84, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2858.25, nice=0.63, system=2001.38, idle=556165.27, iowait=149.54, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2707.47, nice=2.12, system=1819.65, idle=556450.01, iowait=187.65, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2149.9, nice=0.09, system=1556.22, idle=557349.88, iowait=131.11, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=1893.92, nice=3.77, system=1410.94, idle=557736.44, iowait=149.97, irq=0.0, softirq=0.0, steal=0.0)]
()
[scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=66.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=66.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

-----

[scputimes(user=7099.24, nice=6.02, system=2848.79, idle=550372.05, iowait=689.2, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=3918.16, nice=1.02, system=1988.13, idle=555015.14, iowait=205.84, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2858.92, nice=0.63, system=2001.38, idle=556165.27, iowait=149.54, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2708.14, nice=2.12, system=1819.65, idle=556450.01, iowait=187.65, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2150.56, nice=0.09, system=1556.22, idle=557349.88, iowait=131.11, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=1894.59, nice=3.77, system=1410.94, idle=557736.44, iowait=149.97, irq=0.0, softirq=0.0, steal=0.0)]
()
[scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=66.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

-----

[scputimes(user=7099.91, nice=6.02, system=2848.79, idle=550372.05, iowait=689.2, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=3918.83, nice=1.02, system=1988.13, idle=555015.14, iowait=205.84, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2859.59, nice=0.63, system=2001.38, idle=556165.27, iowait=149.54, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2708.81, nice=2.12, system=1819.65, idle=556450.01, iowait=187.65, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2151.23, nice=0.09, system=1556.22, idle=557349.88, iowait=131.11, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=1895.25, nice=3.77, system=1410.94, idle=557736.44, iowait=149.97, irq=0.0, softirq=0.0, steal=0.0)]
()
[scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=66.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=66.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=66.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=67.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

-----

>>>

edit: fixed output (forgot the first 4 outputs) edit2: fixed formatting (deleted spaces)

cyril23 commented 5 years ago

Here is the output of the same test, using an interval=2:

python
Python 2.7.12 (default, Aug 22 2019, 16:36:40)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import psutil
>>> from pprint import pprint as pp
>>> for x in range(5):
...   pp(psutil.cpu_times(percpu=True))
...   print()
...   pp(psutil.cpu_times_percent(interval=2, percpu=True))
...   print("\n-----\n")
...
[scputimes(user=7225.65, nice=6.02, system=2903.09, idle=562268.14, iowait=714.89, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=4027.61, nice=1.02, system=2032.54, idle=566956.38, iowait=216.05, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2949.96, nice=0.63, system=2052.25, idle=568115.16, iowait=163.39, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2796.31, nice=2.12, system=1867.77, idle=568401.19, iowait=205.45, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2227.1, nice=0.09, system=1596.62, idle=569321.05, iowait=148.37, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=1968.06, nice=3.77, system=1448.18, idle=569714.85, iowait=166.52, irq=0.0, softirq=0.0, steal=0.0)]
()
[scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=40.3, nice=0.0, system=0.0, idle=59.7, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

-----

[scputimes(user=7226.98, nice=6.02, system=2903.09, idle=562268.14, iowait=714.89, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=4028.94, nice=1.02, system=2032.54, idle=566956.38, iowait=216.05, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2951.29, nice=0.63, system=2052.25, idle=568115.16, iowait=163.39, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2796.85, nice=2.12, system=1867.77, idle=568401.99, iowait=205.45, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2228.43, nice=0.09, system=1596.62, idle=569321.05, iowait=148.37, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=1969.39, nice=3.77, system=1448.18, idle=569714.85, iowait=166.52, irq=0.0, softirq=0.0, steal=0.0)]
()
[scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=97.8, nice=0.0, system=0.0, idle=2.2, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

-----

[scputimes(user=7228.31, nice=6.02, system=2903.09, idle=562268.14, iowait=714.89, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=4030.28, nice=1.02, system=2032.54, idle=566956.38, iowait=216.05, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2952.62, nice=0.63, system=2052.25, idle=568115.16, iowait=163.39, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2798.18, nice=2.12, system=1867.77, idle=568401.99, iowait=205.45, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2229.74, nice=0.09, system=1596.62, idle=569321.08, iowait=148.37, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=1970.74, nice=3.77, system=1448.18, idle=569714.85, iowait=166.52, irq=0.0, softirq=0.0, steal=0.0)]
()
[scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=98.5, nice=0.0, system=0.0, idle=1.5, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=99.3, nice=0.0, system=0.7, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

-----

[scputimes(user=7229.65, nice=6.02, system=2903.09, idle=562268.14, iowait=714.89, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=4031.62, nice=1.02, system=2032.54, idle=566956.38, iowait=216.05, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2953.96, nice=0.63, system=2052.25, idle=568115.16, iowait=163.39, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2799.52, nice=2.12, system=1867.77, idle=568401.99, iowait=205.45, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2231.05, nice=0.09, system=1596.62, idle=569321.1, iowait=148.37, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=1972.1, nice=3.77, system=1448.19, idle=569714.85, iowait=166.52, irq=0.0, softirq=0.0, steal=0.0)]
()
[scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=99.3, nice=0.0, system=0.0, idle=0.7, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

-----

[scputimes(user=7230.99, nice=6.02, system=2903.09, idle=562268.14, iowait=714.89, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=4032.95, nice=1.02, system=2032.54, idle=566956.38, iowait=216.05, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2955.3, nice=0.63, system=2052.25, idle=568115.16, iowait=163.39, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2800.86, nice=2.12, system=1867.77, idle=568401.99, iowait=205.45, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=2232.38, nice=0.09, system=1596.62, idle=569321.11, iowait=148.37, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=1973.44, nice=3.77, system=1448.19, idle=569714.85, iowait=166.52, irq=0.0, softirq=0.0, steal=0.0)]
()
[scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0),
 scputimes(user=100.0, nice=0.0, system=0.0, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0)]

-----

>>>
ulxasc commented 3 years ago

I have run into this problem as well, it looks like it is an issue with intervals that are smaller than 1 second. https://github.com/giampaolo/psutil/blob/5a76cfadcabf37f65f0d70e4c5b7493478b53c2c/psutil/__init__.py#L1800-L1804 What is the reason we cannot do fractions here? Calling cpu_times_percent with an interval less than about 1 second tends to return percentages that don't add up to 100% for each cpu. If you do:

scale = 100.0 / all_delta if all_delta > 0 else 100.0

wouldn't it solve that problem?

frankkusters commented 5 months ago

I have run into this problem as well, it looks like it is an issue with intervals that are smaller than 1 second.

https://github.com/giampaolo/psutil/blob/5a76cfadcabf37f65f0d70e4c5b7493478b53c2c/psutil/__init__.py#L1800-L1804

What is the reason we cannot do fractions here? Calling cpu_times_percent with an interval less than about 1 second tends to return percentages that don't add up to 100% for each cpu.

I ran into this problem, as I'm trying to monitor at sub-second intervals. I modified the code to print all_delta just before line 1804.

With this script:

import psutil
import time
psutil.cpu_times_percent(percpu=True)
time.sleep(1)
psutil.cpu_times_percent(percpu=True)

I get this output on my 8 core machine:

<skip first 8 lines>
1.0099999999995646
1.010000000000332
1.0000000000004547
0.9899999999995543
0.9800000000007092
0.9900000000000091
1.0000000000000568
1.0

So even at 1 second intervals, a nontrivial error is introduced (0.98 will be modified to 1.0).

I was wondering how the originally reported problem doesn't show with percpu=False. It does, but the interval has to be small enough (less than 1/cpu_count()):

psutil.cpu_times_percent()
psutil.cpu_percent()
interval = 1 / (psutil.cpu_count() * 2)  # set interval to value significantly below 1/cpu_count
time.sleep(interval)
print(psutil.cpu_times_percent())
print(f'cpu_percent={psutil.cpu_percent()}')

This is the output:

85.1400000000057  # should be ignored
0.499999999996362  # all_delta
scputimes(user=1.0, nice=6.0, system=6.0, idle=37.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0, guest=0.0, guest_nice=0.0)
cpu_percent=26.0

The output of cpu_times_percent is completely off here as well. I added cpu_percent as a reference, as scputimes.idle should be roughly 100 - cpu_percent. It's half that, which makes sense with the all_delta value being changed from ~0.5 to 1.0.

I submitted a pull request (#2413) with the fix by @ulxasc.