shanecode / psutil

Automatically exported from code.google.com/p/psutil

incorrect per-process CPU usage reported [Windows] #474

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create a psutil.Process object for an existing PID
2. Call get_cpu_percent() on the Process object
3. The value returned is capped at 100

What is the expected output?
The value returned fits the per-process CPU % from Windows Performance monitor 
(perfmon.exe) "Process/% Processor Time" counter, which is not capped at 100 
(the max is 100 * NB_CORES)

What do you see instead?
It is capped at 100

What version of psutil are you using? What Python version?
psutil 1.2.1
python 2.7.6

On what operating system? Is it 32bit or 64bit version?
Windows 8 Enterprise, 64-bit

Please provide any additional information below.

I already saw bug 194 that has been closed where it is seen as an error to 
report values higher than 100 on Unix platforms. I would not have any 
particular issue with that on Windows either, as long as the reported values 
are scaled accurately, which they are not.

I am currently running performance tests on a multithreaded program of mine, 
and I was surprised to see that the values reported by your tool did not 
correspond at all to the values I read in the task manager. I then investigated 
the issue and found that they instead correspond to the values reported by the 
aforementioned counter in Performance Monitor (perfmon.exe). The problem is 
that, as soon as the value rises above 100 (which happens frequently on a 
multicore system), it is clipped to 100, which gives wrong values. 

In its current state, I cannot use this library for the intent I planned for it.

Thank you for looking into this. I can help solve the bug if it is approved.

Original issue reported on code.google.com by francois...@gmail.com on 10 Feb 2014 at 11:42

GoogleCodeExporter commented 9 years ago
According to issue 194 the reason why I decided to limit CPU percentage to 100% 
on Windows was because taskmgr.exe does the same thing.
Now it appears you're saying the opposite.
Maybe this is true on Windows 8 only? I don't know (I test psutil on Windows 7).

Question: if you manually remove the 100% limit I set in psutil code and 
compare psutil values with those of taskmgr.exe and perfmon.exe, do they look 
similar?

Personally I don't have any particular reason NOT to remove the limit, as long 
as the return value is correct across different Windows versions, as it is on 
UNIX.
I'd say one way to investigate this would be to run a program spawning 2 
threads, each thread running a "while 1: pass" loop.
On a 10-core system I would expect to see one process reporting 200% CPU usage.

Side note: such a test cannot be done in CPython because of the GIL, so we'd 
have to cook that up by using another language or interpreter (IronPython or 
Jython).
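
To illustrate the side note, here is a minimal sketch that uses only the standard library (not psutil). It compares the CPU time consumed by the current process with wall-clock time while two threads spin. The helper name `busy_threads_cpu_percent` and the 0.5 second duration are assumptions made for illustration. On a standard CPython build the result would be expected to stay near 100 rather than 200, because only one thread runs Python bytecode at a time.

```python
import threading
import time


def busy_threads_cpu_percent(n_threads=2, duration=0.5):
    """Spin n_threads busy loops for `duration` seconds and return this
    process's CPU usage as a percentage of one core (not normalized)."""
    stop = threading.Event()

    def spin():
        # Equivalent of "while 1: pass", but with a way to stop it.
        while not stop.is_set():
            pass

    threads = [threading.Thread(target=spin) for _ in range(n_threads)]
    cpu_start = time.process_time()    # CPU time of the whole process
    wall_start = time.perf_counter()   # wall-clock reference
    for t in threads:
        t.start()
    time.sleep(duration)
    stop.set()
    for t in threads:
        t.join()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return 100.0 * cpu_used / wall_used


if __name__ == "__main__":
    print("cpu: %.1f" % busy_threads_cpu_percent())
```

A runtime without a GIL, or a native program, would be needed to push the same measurement towards 200.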

Original comment by g.rodola on 14 Feb 2014 at 12:52

GoogleCodeExporter commented 9 years ago
Hi,

The task manager indeed shows values normalized to a 0-100% scale (not 
capped). However, the value that psutil gets from the system is not normalized 
(not divided by the number of cores in the system), so it makes no sense to 
cap it. 
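
The difference between capping and normalizing can be sketched as follows. This is illustrative arithmetic only, not psutil code: the function names, the sample readings, and the core count of 4 are all assumptions.

```python
def cap_at_100(summed_percent):
    """The behaviour being reported: clip the summed value at 100."""
    return min(summed_percent, 100.0)


def normalize(summed_percent, n_cores):
    """Task Manager style: rescale the summed value to a 0-100 base."""
    return summed_percent / n_cores


if __name__ == "__main__":
    n_cores = 4  # assumed core count, for illustration only
    for summed in (60.0, 120.0, 400.0):  # hypothetical readings
        print(summed, cap_at_100(summed), normalize(summed, n_cores))
```

Capping maps both 120 and 400 to 100, so the two loads become indistinguishable, whereas dividing by the core count keeps them apart (30 and 100 on the assumed 4-core machine).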

I built a local version where I just removed the cap, and it seems to work 
perfectly fine. My own workload does not go much higher than 120%, so I can 
perform the test you asked for in the coming days to confirm that it behaves as 
expected on more CPU-intensive workloads. However, for political/legal reasons, 
the company where I work won't let me contribute directly to the project (I 
know it sucks...). Since the fix is only to remove the cap, though, I'm sure it 
won't be too much of a hassle to do it yourself.

So I'll get back to you with test results to confirm my theories in the coming 
days. 

Original comment by francois...@gmail.com on 14 Feb 2014 at 2:32

GoogleCodeExporter commented 9 years ago
> the company where I work won't let me contribute directly to the
> project (I know it sucks...)

I'm sorry. That's just stupid.
I wish there were a license not allowing such companies to use open source 
stuff.

Original comment by g.rodola on 14 Feb 2014 at 4:56

GoogleCodeExporter commented 9 years ago
I agree, but I cannot change that. I am currently bending the rules by opening 
this issue and performing tests to get this fixed instead of just forking it 
locally on our server...

As for the results:
1) I coded a tiny C++11 application that busy loops on 4 threads:

#include <atomic>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

int main()
{
   // Shared stop flag; atomic so that reading it from the worker threads
   // while main() writes it is not a data race.
   std::atomic<bool> bStop(false);
   std::vector<std::thread> aThreads;

   for(int i = 0; i < 4; i++)
   {
      aThreads.emplace_back([&bStop]()
      {
         // busy loop until asked to stop
         while(!bStop.load())
         {
         }
      });
   }

   // Press Enter to stop the threads.
   std::string sLine;
   std::getline(std::cin, sLine);
   bStop = true;

   for(auto& thread : aThreads)
      thread.join();

   return 0;
}
--------------

2) I then used a Python script to launch the process and grab its cpu_percent 
values with the 100 percent cap removed:

#!/usr/bin/python

import psutil
import subprocess

def go():
   # launch the busy-loop application and wrap it in a psutil.Process
   process = subprocess.Popen(["400percentbusy.exe"])

   p = psutil.Process(process.pid)
   if p.is_running():
      cpu_vals = []
      print "Grabbing results for the next 10 seconds..."
      for i in range(10):
         try:
            # blocks for 1 second and returns the CPU % over that interval
            cpu_vals.append(p.get_cpu_percent(interval=1))
            print 'cpu: {}'.format(cpu_vals[i])
         except psutil.NoSuchProcess:
            print "Process ended while still taking measures! Accuracy might be affected."
            break
      if cpu_vals:
         # only summarize if at least one sample was collected
         print ''
         print 'SUMMARY:'
         print 'CPU     - MIN: {} MAX: {} AVG: {}'.format(min(cpu_vals), max(cpu_vals), sum(cpu_vals) / len(cpu_vals))
   else:
      print 'invalid PID'

-----------------

3) I get the following output:

>>> test400percent.go()
Grabbing results for the next 1 minute...
cpu: 398.9
cpu: 402.0
cpu: 398.1
cpu: 396.5
cpu: 398.2
cpu: 410.1
cpu: 398.9
cpu: 398.2
cpu: 401.2
cpu: 398.9

SUMMARY:
CPU     - MIN: 396.5 MAX: 410.1 AVG: 400.1

---------------

This confirms that the value grabbed by psutil is the sum of the percentages 
from all the cores in the system, not normalized, and as such it should not be 
capped at 100.
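
As a rough sanity check on those readings, the value one would expect from a summed, un-normalized counter can be sketched like this. The function name is hypothetical, and the core count of the test machine is not stated above, so the 8 used below is only an assumption.

```python
def expected_summed_percent(busy_threads, n_cores):
    """Approximate upper bound on the un-normalized CPU % of a process
    whose busy_threads threads each spin flat out on n_cores cores."""
    return 100.0 * min(busy_threads, n_cores)


if __name__ == "__main__":
    # 4 spinning threads, assuming a machine with at least 4 cores
    print(expected_summed_percent(4, 8))
    # the 2-thread, 10-core scenario discussed earlier in the thread
    print(expected_summed_percent(2, 10))
```

Four spinning threads give a bound of 400, which matches the measured average of 400.1; a single reading slightly above it, such as 410.1, is presumably timing jitter in the one-second sampling interval.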

I hope you can overlook your frustration and perform the fix, as I think I did 
the best I could considering the circumstances and I think your project can 
benefit from it. If not, well I wish you good luck in your project(s), as you 
made a really useful piece of software.

Thanks for your help

Original comment by francois...@gmail.com on 18 Feb 2014 at 4:44

GoogleCodeExporter commented 9 years ago
Fixed in revision fac74cc73582.
Thanks for the detailed insights.

Original comment by g.rodola on 24 Feb 2014 at 1:34

GoogleCodeExporter commented 9 years ago
Closing out as fixed, as version 2.0.0 is finally out.

Original comment by g.rodola on 10 Mar 2014 at 11:36