Closed GoogleCodeExporter closed 9 years ago
Did you have anything in mind on how to speed this up? Have you profiled this
at all to see what takes the most time? My guess (but I could be wrong) is that
it's the creation of the new Process object in is_running(), and not the code
in __eq__, that takes the most time. The __eq__() code is just using built-in
functions to read some attributes and compare them, so it shouldn't be all that
slow.

If it turns out that __eq__() is the culprit after all, then one thing that
comes to mind is selecting a specific subset of items to compare instead of
checking all of them. For example, just look at PID, ppid, name, command line,
path. That would cut down the number of items being checked and also eliminate
several function calls currently made in the body of __eq__, including the
startswith() string operation and the callable() check.
Original comment by jlo...@gmail.com
on 14 Jul 2009 at 4:22
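One way to answer the profiling question above is with the standard-library cProfile module. The following is only a sketch: FakeProcess and its fetch_info() method are hypothetical stand-ins for the real Process class and its calls into C code, and they are not part of psutil.

```python
import cProfile
import io
import pstats


class FakeProcess:
    """Hypothetical stand-in for psutil.Process; not the real class."""

    def __init__(self, pid):
        self.pid = pid

    def fetch_info(self):
        # Pretend this is the expensive call into the C extension.
        return sum(range(20000))

    def __eq__(self, other):
        # Cheap attribute comparison.
        return self.pid == other.pid

    def is_running(self):
        new_proc = FakeProcess(self.pid)
        new_proc.fetch_info()
        return self == new_proc


def profile_is_running(calls=200):
    """Profile repeated is_running() calls and return the text report."""
    proc = FakeProcess(1234)
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(calls):
        proc.is_running()
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(10)
    return out.getvalue()


report = profile_is_running()
print(report)
```

In the report, comparing the cumulative time of fetch_info() with that of __eq__() shows which of the two dominates. With the real class, the same approach would apply to is_running() directly.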
One problem is that we are comparing too many properties: currently
*everything* the Process class has to offer except callables and private
methods.
It is true that __eq__ uses fast builtin functions for comparison, but every
time it asks for a property, time is spent calling the underlying C code, and
we should avoid that whenever possible.
What I had in mind was to determine a reliable and *limited* subset of
properties to use as a "signature" that identifies a Process object uniquely.
Since the kernel is unlikely to reuse the same PID within a short period of
time, combining PID and process creation time already gives us a reasonable
degree of uniqueness:
    def __eq__(self, other):
        h1 = (self.pid, self.create_time)
        h2 = (other.pid, other.create_time)
        return h1 == h2
Since we're not sure how the kernel assigns new PIDs across platforms, we might
need to strengthen that uniqueness by adding some other properties, but I'm not
sure which ones exactly.
I'd be in favor of using cmdline, but the underlying C call that determines it
also determines ppid, name and path in one shot, so it might not be the best
choice.
Thoughts?
Original comment by billiej...@gmail.com
on 14 Jul 2009 at 5:45
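The (pid, create_time) signature proposed above can be tried out in isolation. This is a minimal sketch, assuming a hypothetical ProcessStub class in place of the real Process class; the PIDs and timestamps are made up, and the __hash__ method is an extra that the thread does not discuss.

```python
class ProcessStub:
    """Hypothetical stand-in for psutil.Process, for illustration only."""

    def __init__(self, pid, create_time, name=""):
        self.pid = pid
        self.create_time = create_time
        self.name = name

    def _signature(self):
        # Only these two values take part in the comparison.
        return (self.pid, self.create_time)

    def __eq__(self, other):
        if not isinstance(other, ProcessStub):
            return NotImplemented
        return self._signature() == other._signature()

    def __hash__(self):
        # Not part of the thread: keeps equal objects hashing equally,
        # since Python 3 drops the default __hash__ when __eq__ is defined.
        return hash(self._signature())


original = ProcessStub(4242, 1247586000.0, "python")
same = ProcessStub(4242, 1247586000.0, "python")
reused_pid = ProcessStub(4242, 1247589999.0, "bash")

print(original == same)        # same PID and creation time
print(original == reused_pid)  # same PID, later creation time
```

The first comparison is true and the second is false, because a process that later receives the same PID has a different creation time.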
I think PID + create time is good enough, since a process can't have both a
reused PID and the same create time in any normal circumstance I can come up
with. That should speed things up a bunch.
I'm not sure why is_running got coded this way:
    def is_running(self):
        """Return whether the current process is running in the current
        process list."""
        try:
            new_proc = Process(self.pid)
            # calls get_process_info() which may in turn trigger NSP exception
            str(new_proc)
        except NoSuchProcess:
            return False
        return self == new_proc
That's going to be much slower because the call to str() forces the new_proc
Process object to fill in all of its attributes by calling the C code before we
check for equality. Whatever the reason, if we change it as shown below it
should work fine and be much faster once the changes are made to __eq__():
    def is_running(self):
        """Return whether the current process is running in the current
        process list."""
        try:
            new_proc = Process(self.pid)
            return self == new_proc
        except NoSuchProcess:
            return False
Original comment by jlo...@gmail.com
on 14 Jul 2009 at 9:02
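The simplified is_running() above can be exercised against a fake process table. This is a sketch under stated assumptions: PROCESS_TABLE, the lazy create_time property, and this cut-down Process class are hypothetical stand-ins, and only the names Process, NoSuchProcess, is_running and create_time come from the thread.

```python
class NoSuchProcess(Exception):
    """Stand-in for the exception of the same name."""


# Hypothetical process table: pid -> creation time.
PROCESS_TABLE = {100: 1000.0, 200: 2000.0}


class Process:
    """Simplified stand-in, not psutil's real Process class."""

    def __init__(self, pid):
        self.pid = pid
        self._create_time = None

    @property
    def create_time(self):
        # Lazy lookup, cached after the first successful read.
        if self._create_time is None:
            try:
                self._create_time = PROCESS_TABLE[self.pid]
            except KeyError:
                raise NoSuchProcess(self.pid)
        return self._create_time

    def __eq__(self, other):
        return (self.pid, self.create_time) == (other.pid, other.create_time)

    def is_running(self):
        try:
            # The comparison reads create_time on the fresh object,
            # which is where NoSuchProcess can be raised.
            return self == Process(self.pid)
        except NoSuchProcess:
            return False


proc = Process(100)
proc.create_time                  # cache the signature while it is alive
while_alive = proc.is_running()

PROCESS_TABLE[100] = 1500.0       # same PID now belongs to a newer process
after_pid_reuse = proc.is_running()

del PROCESS_TABLE[100]            # PID no longer exists at all
after_exit = proc.is_running()

print(while_alive, after_pid_reuse, after_exit)
```

Keeping the comparison inside the try block matters in this sketch, because the lookup that can fail happens during the equality check rather than in the constructor.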
Committed as r416.
Before the patch:
$ python -m timeit -s "import os, psutil; p = psutil.Process(os.getpid())" "p.is_running()"
1000 loops, best of 3: 1.29 msec per loop
After the patch:
$ python -m timeit -s "import os, psutil; p = psutil.Process(os.getpid())" "p.is_running()"
10000 loops, best of 3: 135 usec per loop
That's about 10 times faster.
Original comment by billiej...@gmail.com
on 15 Jul 2009 at 9:35
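The "about 10 times faster" figure can be checked from the two timeit results quoted above. The variable names here are only illustrative:

```python
before_seconds = 1.29e-3  # 1.29 msec per loop, before the patch
after_seconds = 135e-6    # 135 usec per loop, after the patch

speedup = before_seconds / after_seconds
print(round(speedup, 1))
```

The ratio comes out a little under 10, which is consistent with the rounded figure.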
Original comment by billiej...@gmail.com
on 3 Sep 2009 at 7:48
Original comment by billiej...@gmail.com
on 17 Sep 2009 at 8:57
[deleted comment]
Updated csets after the SVN -> Mercurial migration:
r416 == revision 498c34a2245c
Original comment by g.rodola
on 2 Mar 2013 at 11:50
Original issue reported on code.google.com by
billiej...@gmail.com
on 14 Jul 2009 at 3:59