giampaolo / psutil

Cross-platform lib for process and system monitoring in Python
BSD 3-Clause "New" or "Revised" License
10.32k stars 1.39k forks source link

[OS] cpu_percent() is independent of number of CPUs #1700

Open nbigaouette opened 4 years ago

nbigaouette commented 4 years ago

Platform

Bug description

I think there might be a problem with the definition of cpu_percent() which result in the percentage being calculated does not contain a number of CPUs.

There is a timer() function defined inside the cpu_percent() function that multiplies the time with the number of cpus: https://github.com/giampaolo/psutil/blob/release-5.7.0/psutil/__init__.py#L998-L999

This function is then used to get a time difference between st2 and st1 or self._last_sys_cpu_times: https://github.com/giampaolo/psutil/blob/release-5.7.0/psutil/__init__.py#L1002-L1005

This is used to calculate delta_time: https://github.com/giampaolo/psutil/blob/release-5.7.0/psutil/__init__.py#L1018

So at this point, delta_time contains the CPU count.

Then, delta_time is used in the denominator of the overall_cpus_percent calculation: https://github.com/giampaolo/psutil/blob/release-5.7.0/psutil/__init__.py#L1027

So overall_cpus_percent contains a 1/num_cpus contribution.

Finally, single_cpu_percent is being scaled back by num_cpus, cancelling the previous 1/num_cpus contribution: https://github.com/giampaolo/psutil/blob/release-5.7.0/psutil/__init__.py#L1047

This means that single_cpu_percent is independant of num_cpus!

So I have to conclude that either:

  1. Using num_cpus is redundant and can be removed;
  2. The result of cpu_percent() is not properly scaled by the number of CPUs.

For reference here's the definition of the function: https://github.com/giampaolo/psutil/blob/release-5.7.0/psutil/__init__.py#L958-L1048

giampaolo commented 4 years ago

Can you provide a diff or PR? That way I can have a better understanding of the problem. cpu_percent() functions are delicate (and also the most used ones), so they should be handled with care.

nbigaouette commented 4 years ago

Here's a diff of what I believe might be the exact thing the code does:

diff --git a/psutil/__init__.py b/psutil/__init__.py
index 22bb46f3..d3e20535 100644
--- a/psutil/__init__.py
+++ b/psutil/__init__.py
@@ -993,10 +993,9 @@ class Process(object):
         blocking = interval is not None and interval > 0.0
         if interval is not None and interval < 0:
             raise ValueError("interval is not positive (got %r)" % interval)
-        num_cpus = cpu_count() or 1

         def timer():
-            return _timer() * num_cpus
+            return _timer()

         if blocking:
             st1 = timer()
@@ -1024,28 +1023,10 @@ class Process(object):
             # This is the utilization split evenly between all CPUs.
             # E.g. a busy loop process on a 2-CPU-cores system at this
             # point is reported as 50% instead of 100%.
-            overall_cpus_percent = ((delta_proc / delta_time) * 100)
+            return round(((delta_proc / delta_time) * 100), 1)
         except ZeroDivisionError:
             # interval was too low
             return 0.0
-        else:
-            # Note 1:
-            # in order to emulate "top" we multiply the value for the num
-            # of CPU cores. This way the busy process will be reported as
-            # having 100% (or more) usage.
-            #
-            # Note 2:
-            # taskmgr.exe on Windows differs in that it will show 50%
-            # instead.
-            #
-            # Note 3:
-            # a percentage > 100 is legitimate as it can result from a
-            # process with multiple threads running on different CPU
-            # cores (top does the same), see:
-            # http://stackoverflow.com/questions/1032357
-            # https://github.com/giampaolo/psutil/issues/474
-            single_cpu_percent = overall_cpus_percent * num_cpus
-            return round(single_cpu_percent, 1)

     @memoize_when_activated
     def cpu_times(self):

I do not use psutil but found this when investigating a problem in another project that was inspired by it (see https://github.com/heim-rs/heim/issues/199).

I'll try to come up with an example.

nbigaouette commented 4 years ago

Here's a code snippet that can be used for measurement:

#!/urs/bin/env python

import os
import time
from multiprocessing import Process

# psutil 5.7.0
import psutil

NB_PROCESSES = 8
BUSY_LOOP_COUNT = 300000000

def busy_loop():
    for _i in range(BUSY_LOOP_COUNT):
        pass

print("cpu_count:         ", psutil.cpu_count())
print("cpu_count_logical: ", psutil._psplatform.cpu_count_logical())
print("cpu_count_physical:", psutil._psplatform.cpu_count_physical())

p = psutil.Process(os.getpid())
psutil.cpu_percent(interval=None)

print("Starting busy loops in threads...")

p.cpu_percent(interval=None)

t0 = time.time()

processes = [
    Process(target=busy_loop)
    for _n in range(NB_PROCESSES)
]

print("Starting processes...")
for process in processes:
    process.start()

print("Joining processes...")
for process in processes:
    process.join()

t1 = time.time()

print("Processes done")

process_percent = p.cpu_percent(interval=None)
host_percent = psutil.cpu_percent(interval=None)
print("process_percent:", process_percent)
print("host_percent:   ", host_percent)
print("delta t: ", t1-t0, "seconds")

It spawns a number of processes and busy loop them. This should occupy the cores for 100% during the busy loop. If NB_PROCESSES matches the number of cores on your machine, I expect cpu_percent() to return roughly 100*NB_PROCESSES.

Running the code on my laptop (macOS, quad core + hyperthreading == 8 virtual cores) I get the following output:

❯ poetry run python ./issue-1700.py
cpu_count:          8
cpu_count_logical:  8
cpu_count_physical: 4
Starting busy loops in threads...
Starting processes...
Joining processes...
Processes done
process_percent: 0.1
host_percent:    97.5
delta t:  16.766708850860596 seconds

htop shows my 8 cores at close to 100% with each python processes taking around 80% each: Screen Shot 2020-02-19 at 11 35 13 AM

I will try to run the same code on a fork of psutil where num_cpu is not used in cpu_percent().

NOTE: Surprisingly, the way I call cpu_percent() returns ~100%, not the 800 I expected...

nbigaouette commented 4 years ago

Sorry I got confused with the two cpu_percent() function/method. Here's an updated version that measure the usage of the process:

#!/urs/bin/env python

import os
import time
from multiprocessing import Process

# psutil 5.7.0
import psutil

BUSY_LOOP_COUNT = 300000000

def busy_loop():
    for _i in range(BUSY_LOOP_COUNT):
        pass

print("cpu_count:         ", psutil.cpu_count())
print("cpu_count_logical: ", psutil._psplatform.cpu_count_logical())
print("cpu_count_physical:", psutil._psplatform.cpu_count_physical())

p = psutil.Process(os.getpid())
psutil.cpu_percent(interval=None)

print("Starting busy loop...")

p.cpu_percent(interval=None)

t0 = time.time()

busy_loop()

t1 = time.time()

print("Processes done")

process_percent = p.cpu_percent(interval=None)
host_percent = psutil.cpu_percent(interval=None)
print("process_percent:", process_percent)
print("host_percent:   ", host_percent)
print("delta t: ", t1-t0, "seconds")

with output:

❯ poetry run python ./issue-1700-process.py
cpu_count:          8
cpu_count_logical:  8
cpu_count_physical: 4
####################################################
This is psutil 5.7.0 forked
####################################################
Starting busy loop...
Processes done
####################################################
This is psutil 5.7.0 forked
####################################################
process_percent: 98.28840902680246
host_percent:    23.9
delta t:  6.271931886672974 seconds

Here's the output of release-5.7.0 (f2e0c98e):

❯ poetry run python ./issue-1700-process.py
cpu_count:          8
cpu_count_logical:  8
cpu_count_physical: 4
Starting busy loop...
Processes done
process_percent: 98.5
host_percent:    23.5
delta t:  6.294642925262451 seconds

Note how the process_percent are both close to 100%: they are independent of the number of CPUs.

That's the diff of the fork (release-5.7.0, f2e0c98e):

diff --git a/psutil/__init__.py b/psutil/__init__.py
index 22bb46f3..efe3fdf2 100644
--- a/psutil/__init__.py
+++ b/psutil/__init__.py
@@ -993,10 +993,9 @@ class Process(object):
         blocking = interval is not None and interval > 0.0
         if interval is not None and interval < 0:
             raise ValueError("interval is not positive (got %r)" % interval)
-        num_cpus = cpu_count() or 1

         def timer():
-            return _timer() * num_cpus
+            return _timer()

         if blocking:
             st1 = timer()
@@ -1024,28 +1023,10 @@ class Process(object):
             # This is the utilization split evenly between all CPUs.
             # E.g. a busy loop process on a 2-CPU-cores system at this
             # point is reported as 50% instead of 100%.
-            overall_cpus_percent = ((delta_proc / delta_time) * 100)
+            return ((delta_proc / delta_time) * 100)
         except ZeroDivisionError:
             # interval was too low
             return 0.0
-        else:
-            # Note 1:
-            # in order to emulate "top" we multiply the value for the num
-            # of CPU cores. This way the busy process will be reported as
-            # having 100% (or more) usage.
-            #
-            # Note 2:
-            # taskmgr.exe on Windows differs in that it will show 50%
-            # instead.
-            #
-            # Note 3:
-            # a percentage > 100 is legitimate as it can result from a
-            # process with multiple threads running on different CPU
-            # cores (top does the same), see:
-            # http://stackoverflow.com/questions/1032357
-            # https://github.com/giampaolo/psutil/issues/474
-            single_cpu_percent = overall_cpus_percent * num_cpus
-            return round(single_cpu_percent, 1)

     @memoize_when_activated
     def cpu_times(self):
@@ -1737,6 +1718,10 @@ def cpu_percent(interval=None, percpu=False):
     if interval is not None and interval < 0:
         raise ValueError("interval is not positive (got %r)" % interval)

+    print("####################################################")
+    print("This is psutil 5.7.0 forked")
+    print("####################################################")
+
     def calculate(t1, t2):
         times_delta = _cpu_times_deltas(t1, t2)