Mmm... this is debatable.
Some considerations:
1 - If the 'interval' parameter is not specified or is > 0, you *do* get a
meaningful value.
2 - I expect that someone passing 0.0 or None knows what he's doing and also
knows that the first call will return a meaningless value.
3 - None is also a meaningless value, but it is kind of inconsistent as well,
since you're changing the return type.
> I advocate to return None otherwise it is impossible to
> tell whether the returned value is valid or just made up
> you can check process._last_sys_cpu_times, which I currently
> do, but this is pretty ugly.
I think you shouldn't check anything at all but instead just call
get_cpu_percent() right after you create the Process object.
From then on, all the future calls to get_cpu_percent() will return a
meaningful value, assuming you will pass a decent interval:
p = psutil.Process(pid)
p.get_cpu_percent(interval=None)  # no-op; just initialize CPU times internally and return immediately
Your proposal would look like this, which looks unnecessarily verbose to me
by comparison:
p = psutil.Process(pid)
percent = p.get_cpu_percent()
if percent is None:
    ...
else:
    ...
Original comment by g.rodola
on 12 Dec 2012 at 3:34
> 1 - If 'interval' parameter is not specified or > 0 you *do* get a meaningful
value.
That is actually the main issue. If I get 0.0 I simply cannot decide whether it
is a valid value or whether I got it because it is the first call to
get_cpu_percent().
Your approach to just call get_cpu_percent works great if I create the process
object myself. However, I use psutil.process_iter(). Thus, I simply do not know
whether the call to get_cpu_percent is the first one.
I could work around this by caching all known objects to figure out whether I
am touching an object for the first time, or I could simply look at
_last_sys_cpu_times. I don't like either of the two options.
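For reference, the first of those two workarounds can be sketched as follows. This is a hypothetical helper, not psutil code: it keeps a set of PIDs already seen and discards the first reading for any new PID, since that first value is meaningless.

```python
# Sketch of the "cache known processes" workaround (hypothetical helper,
# not part of psutil). readings: (pid, cpu_percent) pairs from one pass.

def filter_first_sightings(readings, seen_pids):
    """Yield (pid, percent) only for PIDs observed before; record new ones."""
    for pid, percent in readings:
        if pid in seen_pids:
            yield pid, percent   # second or later sample: trustworthy
        else:
            seen_pids.add(pid)   # first sample: value is meaningless, drop it

seen = set()
first_pass = list(filter_first_sightings([(1, 0.0), (2, 0.0)], seen))
second_pass = list(filter_first_sightings([(1, 12.5), (2, 3.0), (3, 0.0)], seen))
# first_pass is empty; second_pass keeps PIDs 1 and 2 but skips new PID 3
```

The same idea works with real Process objects by keying the set on p.pid inside the process_iter() loop.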
I understand that you hesitate to return None. So what about an exception?
Original comment by m...@bruenink.de
on 13 Dec 2012 at 5:48
Please help me understand what your use case is exactly.
First of all, why do you care about the meaningfulness of the first call's
value?
As for process_iter(), it already uses an internal cache, so this code will
produce meaningful values, except for the very first iteration:
while 1:
    for p in psutil.process_iter():
        print p.get_cpu_percent(interval=None)
    time.sleep(1)
Original comment by g.rodola
on 13 Dec 2012 at 3:17
I have a program which observes resource consumption of processes. Basically I
use exactly the loop you mentioned above. Based on the observations I do some
system adaptations.
If the system creates a new process (that is, I don't create the process
myself, so I don't know it is a new one), I'll get a wrong value in the first
iteration. Assume this new process eats up 100% CPU: get_cpu_percent() will
just return 0.0, so my monitor will think the system is not used at all even
though it is.
I know it is pretty much impossible to get a 100% precise measurement. However,
I would like to know how imprecise my measurement is. I would like to be able
to make a statement like this: "I think the CPU utilisation is XX%. However,
there are X processes I do not know anything about, so be careful how you
interpret this measurement."
There are ways around it, but I think the cleanest one would be to
differentiate explicitly between a valid and an invalid return value in
get_cpu_percent()
Original comment by m...@bruenink.de
on 14 Dec 2012 at 2:36
Ok, but why is it so important for you to figure out whether the value is
meaningful or not?
What action do you want to take when it's not? Is it just a matter of taste
(e.g. display "N/A" rather than "0.0") or do you actually need to know that up
front?
I'm thinking that maybe we can provide a new 'default=0.0' argument for
get_cpu_percent() as in:
>>> p.get_cpu_percent(interval=None, default='none yet')
'none yet'
>>> p.get_cpu_percent(interval=None, default='none yet')
1.3
Would that be helpful for your use case?
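For illustration, here is a self-contained sketch of how such a 'default' argument could behave. The class and its internals are hypothetical stand-ins (the real get_cpu_percent() has no such parameter; this only mimics the proposed first-call semantics):

```python
# Hypothetical sampler mimicking the proposed 'default' argument;
# not actual psutil API.

class CpuSampler(object):
    def __init__(self):
        self._primed = False  # stands in for psutil's cached CPU times

    def cpu_percent(self, default=0.0):
        if not self._primed:
            self._primed = True
            return default    # first call: no baseline delta exists yet
        return 1.3            # placeholder for a real delta-based value

s = CpuSampler()
first = s.cpu_percent(default='none yet')   # first call returns the default
second = s.cpu_percent(default='none yet')  # later calls return real values
```

The point of the design is that callers who don't care keep getting 0.0 (the default's default), while callers who do care can pass a sentinel of their choosing.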
I'm not convinced about introducing None because then you'll have to check the
return type, and it would also introduce backward compatibility issues.
The first examples that come to mind are:
>>> if p.get_cpu_percent(interval=None) > 50: ...
...which on Python 3 might raise TypeError, and:
>>> "%s%%" % p.get_cpu_percent()
...which would suddenly print 'None%'.
Original comment by g.rodola
on 14 Dec 2012 at 4:01
Is it feasible for you to pass a short interval to get_cpu_percent() in your
script? To me, any time you're calling a CPU percent function you essentially
have to expect to introduce a delay. Even if you look at something like WMI
counters for CPU on Windows, it just builds the delay into the method, so that
any time you call it, it waits a set period of time to calculate the delta.
In our case we're just making it optional using the argument to
get_cpu_percent(), except you're deliberately disabling that by calling it with
get_cpu_percent(0) so we have very little recourse other than returning None as
an alternative (Giampaolo has already explained why we didn't go that route
with the code initially). Personally I'd be more inclined to either:
1) change your calling script accordingly by setting a small interval, or
possibly even use something like a dictionary to ensure that the first time a
process appears you ignore the CPU value (if the dictionary has the key, you
know the process has already been seen);
2) adopt something like Giampaolo's suggestion with a parameter to set a
default return value, so you can force it to return None instead of 0.0;
3) actually, another idea that comes to mind as I type this: maybe we return a
negative value (-1?) for CPU percent on the first call for a process? That
would at least prevent the return type from changing, though it might break
other code I'm not thinking of offhand. Giampaolo - any thoughts?
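Option 3 would push a check to the caller. A minimal sketch of what caller-side handling of a hypothetical -1 sentinel would look like (this is not actual psutil behavior, just an illustration of the proposal):

```python
# Caller-side handling of a hypothetical -1 "first call" sentinel
# (option 3 above; not actual psutil behavior).

def interpret(percent):
    """Render a CPU reading, treating negative values as 'not yet known'."""
    if percent < 0:
        return "N/A"              # first call for this process: no baseline
    return "%.1f%%" % percent     # normal delta-based reading

labels = [interpret(v) for v in (-1, 0.0, 37.5)]
# labels -> ["N/A", "0.0%", "37.5%"]
```

Note that this keeps the return type a float, but every call site still has to remember the check, which is the verbosity concern raised below.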
Original comment by jlo...@gmail.com
on 14 Dec 2012 at 4:16
@Jay: setting a small interval doesn't really solve the issue: you're just
introducing a slowdown and the returned value keeps being meaningless. Using
interval=0 is the expected usage if you want to use get_cpu_percent() in
non-blocking mode, and non-blocking mode is exactly what you want in
applications such as a task manager.
Using -1 is less disruptive than None, but IMO it's kind of inconsistent and
still requires checking the return value, which leads to more verbose code.
But let's first wait to see what OP's exact use case is.
I'm reluctant to change anything because *personally* I do not care about 0.0
being meaningless the first time, and can't imagine a reason why I'd want to
check that except for, say, printing "N/A" instead of 0.0, but that's me.
Original comment by g.rodola
on 14 Dec 2012 at 12:45
Basically I do adaptive system monitoring. I monitor the resource consumption
of the system and adapt my monitoring overhead accordingly. That's why it is
important to differentiate between 0.0 and N/A.
However, I am ok with checking _last_sys_cpu_times. So far it does work as it
should.
It is just that it is a bit hackish and does not feel right. There should be a
difference between 0.0 and N/A.
Both of you are right when you talk about backward compatibility and the like,
so you might just leave it as it is. I just think the interface is not very
clean if you do not distinguish between an invalid and a valid return value.
But in the end we are talking about a corner case (which should still be fixed
:).
Original comment by m...@bruenink.de
on 15 Dec 2012 at 5:48
In 2.0.0 we enforced the fact that the first call gives you 0.0 and made that
very clear in the doc, therefore I'm just closing this out.
Original comment by g.rodola
on 21 Apr 2014 at 4:06
Original issue reported on code.google.com by
m...@bruenink.de
on 26 Sep 2012 at 8:57