The other way to handle this that comes to mind is similar to what we did in
the first incarnation of psutil: since multiple pieces of information are
retrieved with the same query, populate all those parts of the process
information structure simultaneously, and potentially also cache the values
when possible. Something like the deproxy lazy initialization we were using
back then.
I realize caching might not be feasible if we're looking at volatile
information like memory or CPU, so that would require some thinking. But if we
can fetch multiple pieces of data in one call to NtQuerySystemInformation()
then we're saving a lot of overhead in cases where someone is enumerating
multiple process properties.
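
To make the "one query, many fields" idea concrete, here is a minimal sketch.
It is not psutil code: ProcessSnapshot and fake_query are hypothetical names,
and fake_query merely stands in for a single expensive call such as
NtQuerySystemInformation() that returns several fields at once.

```python
class ProcessSnapshot:
    """Fetch several process fields with one query and cache them lazily."""

    def __init__(self, pid, query):
        self.pid = pid
        self._query = query   # hypothetical callable: pid -> dict of fields
        self._fields = None   # populated on first attribute access

    def _load(self):
        # Run the expensive query only once per snapshot.
        if self._fields is None:
            self._fields = self._query(self.pid)
        return self._fields

    def __getattr__(self, name):
        # Only reached for attributes that are not set on the instance.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._load()[name]
        except KeyError:
            raise AttributeError(name) from None


calls = []


def fake_query(pid):
    # Stand-in for one system call that returns many fields at once.
    calls.append(pid)
    return {"num_handles": 42, "create_time": 1342180000.0, "io_read_count": 7}


snap = ProcessSnapshot(4, fake_query)
handles = snap.num_handles   # triggers the single query
created = snap.create_time   # served from the already populated fields
```

Reading two properties costs one query here. A snapshot like this is only
valid for the moment it was taken, so volatile values would still need a fresh
snapshot.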
My main question/issue with using NtQuerySystemInformation() so heavily is
reliability. Undocumented APIs can change at any time and may not be consistent
from release to release. I'd be concerned that we might run into
incompatibilities in structures across versions, but I don't know if that's the
case. If it's remained consistent across at least the last few Windows releases
then it's probably not a major concern.
Original comment by jlo...@gmail.com
on 13 Jul 2012 at 12:43
I don't think you have to worry about NtQuerySystemInformation being
"undocumented". The chance of it changing is pretty low, since there are a lot
of developers who use it. To address your concerns about performance - you have
to choose between these two:
* NtQuerySystemInformation - a bit slower and requires memory allocation, but
bypasses permissions
* Normal APIs (which all use NtQueryInformationProcess) - a bit faster, but
require a handle to the process
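
One possible way to get the best of both options is to try the handle-based
path first and fall back to the system-wide query only when access is denied.
The sketch below assumes exactly that; fast_query and slow_query are
hypothetical stand-ins, not real psutil or Windows functions.

```python
def get_num_handles(pid, fast_query, slow_query):
    """Try the cheap per-process query first, then the system-wide one."""
    try:
        return fast_query(pid)
    except PermissionError:
        # The fast path needs a process handle, which could not be opened.
        return slow_query(pid)


def fast_query(pid):
    # Hypothetical stand-in: pretend PID 4 cannot be opened.
    if pid == 4:
        raise PermissionError("access denied")
    return 10


def slow_query(pid):
    # Hypothetical stand-in for the slower lookup that bypasses permissions.
    return 99


result_normal = get_num_handles(1234, fast_query, slow_query)
result_denied = get_num_handles(4, fast_query, slow_query)
```

With this arrangement the slower call is only paid for processes that cannot
be opened.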
I like jloden's idea of caching properties for multiple processes, but of
course you have to decide when to update your cached data. That could be a bit
of a problem.
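
As an illustration of the "when to update" question, one simple policy is a
time-to-live: reuse the cached data until it is older than some threshold.
This is only a sketch under that assumption; TTLCache and fetch are
hypothetical names, and the clock is injectable so the behaviour can be shown
without sleeping.

```python
import time


class TTLCache:
    """Cache the result of fetch() and refetch it once it is older than ttl."""

    def __init__(self, fetch, ttl, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value = None
        self._stamp = None   # time of the last fetch, None if never fetched

    def get(self):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._ttl:
            self._value = self._fetch()
            self._stamp = now
        return self._value


fake_now = [0.0]
fetches = []


def fetch():
    # Stand-in for an expensive query; records when it was called.
    fetches.append(fake_now[0])
    return {"rss": 1000 + len(fetches)}


cache = TTLCache(fetch, ttl=1.0, clock=lambda: fake_now[0])
first = cache.get()    # no cached value yet, so this fetches
second = cache.get()   # still within ttl, the cached value is reused
fake_now[0] = 1.5
third = cache.get()    # older than ttl, so this fetches again
```

The right ttl depends on how volatile the data is, which is exactly the hard
part mentioned above.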
Original comment by wj32...@gmail.com
on 13 Jul 2012 at 12:53
Since we're talking about volatile info, I can't see any consistent way to
cache it other than introducing a brand new CachedProcess class providing a
refresh() method. Please note that we now have an as_dict() method though,
which can be used to implement caching pretty flexibly by hand, as in:
>>> import psutil
>>> p = psutil.Process(pid)  # pid: ID of the process to inspect
>>> p._info = p.as_dict()    # save all current process info
>>> p._info['cpu_percent']   # access cached info
...later on:
>>> p._info = p.as_dict()    # refresh the cache
Also, note that I've already introduced caching where possible in the latest
release (ppid, name, exe, cmdline and create_time, process_iter() - issue 281
and issue 301).
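
For reference, a CachedProcess class with a refresh() method could look
roughly like the sketch below. It is hypothetical and not part of psutil: it
only assumes the wrapped object has an as_dict() method, and FakeProcess is a
stand-in so the example runs without psutil installed.

```python
class CachedProcess:
    """Hold a snapshot of a process's info until refresh() is called."""

    def __init__(self, proc):
        self._proc = proc
        self._info = proc.as_dict()   # initial snapshot

    def refresh(self):
        # Replace the snapshot with current values.
        self._info = self._proc.as_dict()

    def __getitem__(self, key):
        return self._info[key]


class FakeProcess:
    """Stand-in for psutil.Process exposing only as_dict()."""

    def __init__(self):
        self.cpu = 1.0

    def as_dict(self):
        return {"pid": 4, "cpu_percent": self.cpu}


fake = FakeProcess()
cached = CachedProcess(fake)
before = cached["cpu_percent"]
fake.cpu = 7.5                  # the underlying value changes
stale = cached["cpu_percent"]   # still the old snapshot
cached.refresh()
after = cached["cpu_percent"]   # the updated value
```

The caller decides when the data is stale, which avoids guessing a refresh
policy inside the library.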
As for populating the struct simultaneously, please note that using
NtQuerySystemInformation() is *a lot* slower than using documented APIs (about
6x), so I'm not sure how much grouping would help.
Original comment by g.rodola
on 13 Jul 2012 at 2:26
Ok, this is now fixed.
The Process methods affected by this change are:
- create_time r1449
- get_cpu_times() r1448
- get_cpu_percent() r1448
- get_memory_info() r1452
- get_memory_percent() r1452
- get_num_handles() r1450
- get_io_counters() r1451
Note that we're now able to determine meaningful info even for PID 4, whose
return values were historically hard-coded in the Python layer.
Original comment by g.rodola
on 13 Jul 2012 at 8:25
Fixed in version 0.6.0, released just now.
Original comment by g.rodola
on 13 Aug 2012 at 4:25
Updated csets after the SVN -> Mercurial migration:
r1448 == revision 5c5b03f3643f
r1449 == revision 2dcc29d215a7
r1450 == revision 6960659dac04
r1451 == revision 48a6b054f238
r1452 == revision 85677a2caf85
Original comment by g.rodola
on 2 Mar 2013 at 12:11
Original issue reported on code.google.com by
g.rodola
on 13 Jul 2012 at 11:31