giampaolo / psutil

Cross-platform lib for process and system monitoring in Python
BSD 3-Clause "New" or "Revised" License
[Windows 11] psutil.process_iter() is 10x slower when running from non-admin account than when running from ADMIN/elevated account #2366

Closed smihaila closed 4 months ago

smihaila commented 5 months ago

Summary

Description

Running the following Python code from a non-admin Windows user account takes about 400 ms. Running the same code from the same Windows user account, but through an ADMIN/elevated cmd.exe command prompt, takes about 38-40 ms, which is 10x faster.

The total execution time does not seem to depend on whether the process being searched for is currently running. Also, when that process is running, there is always only one instance of it, so there are never multiple processes with the same name.

import psutil

p: psutil.Process | None = next(
    (p for p in psutil.process_iter(attrs=["name"]) if p.name() == "dbengprx.exe"),
    None)

The code above is invoked indirectly, via Uvicorn ASGI web application server (configured with only 1 worker) and FastAPI web api framework.

Same machine, same virtual memory usage, same list of processes in both cases: about 220 processes showing in TaskMgr / tasklist.

There is something that makes psutil's Process item generator go 10x slower in non-admin mode, than in admin mode.
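To compare the two cases on equal terms, a small timing harness along these lines may help. It is only a sketch: `time_name_scan` is a hypothetical helper name, not part of psutil, and it reads the name from `proc.info`, which `process_iter()` fills in when `attrs` is passed.

```python
import time
from typing import Optional, Tuple

import psutil


def time_name_scan(target: str) -> Tuple[Optional[int], int, float]:
    """Hypothetical helper: scan processes for `target` and time the scan.

    Returns (pid or None, number of processes visited, elapsed seconds).
    """
    start = time.perf_counter()
    visited = 0
    found_pid = None
    for proc in psutil.process_iter(attrs=["name"]):
        visited += 1
        # proc.info["name"] was pre-fetched by process_iter(attrs=...);
        # it is None if the name could not be read.
        if proc.info["name"] == target:
            found_pid = proc.pid
            break
    return found_pid, visited, time.perf_counter() - start


if __name__ == "__main__":
    pid, visited, elapsed = time_name_scan("dbengprx.exe")
    print(f"pid={pid} visited={visited} elapsed={elapsed:.3f}s")
```

Running the same script once from a normal prompt and once from an elevated prompt should show whether the gap comes from the scan itself rather than from the web framework around it.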

Thank you.

tduarte-dspc commented 4 months ago

On my Windows 11 machine, it takes 5 to 8 seconds to iterate over 427 processes. I'm running Python 3.11.3. It's a company laptop, but I'm the administrator. The CPU is a 12th Gen Intel i7-12800H.

import time
from typing import Tuple

import psutil

def proc_by_name(process_name: str) -> Tuple[bool, int]:
    start = time.time()
    for proc in psutil.process_iter(attrs=["name"]):
        if process_name in proc.name():
            print("On success took:", time.time() - start, "seconds")
            return True, proc.pid

    print("On failure took:", time.time() - start, "seconds")
    return False, 0

if __name__ == "__main__":
    running, pid = proc_by_name("unknown.exe")
    print(f"unknown.exe: {running=}, {pid=}")

    running, pid = proc_by_name("chrome.exe")
    print(f"chrome.exe: {running=}, {pid=}")

    print(len(list(psutil.process_iter(attrs=["name"]))))

An iteration on the terminal:

On failure took: 5.50777530670166 seconds
unknown.exe: running=False, pid=0
On success took: 0.7929043769836426 seconds
chrome.exe: running=True, pid=3024
427

giampaolo commented 4 months ago

This is known. Certain APIs have 2 implementations, a fast one and a slow one.

The fast one is attempted first, but it requires more privileges, and hence often fails with AccessDenied for processes not owned by the current user or with low PIDs. The slow implementation is used as a fallback: it's slower, but it manages to return info instead of raising AccessDenied. This is the reason why running a benchmark as a super user vs. a normal user produces different results. This is the best psutil can do in this regard, and there's nothing we can do about it (well, by installing psutil as a service / driver perhaps that'd be possible, but that's another story).
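The fast-then-fallback pattern described above can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical (`fast_query`, `slow_query`, `query`, and the stand-in `AccessDenied` class); it does not reproduce psutil's internals, only the control flow.

```python
class AccessDenied(Exception):
    """Stand-in for psutil.AccessDenied, used only in this sketch."""


def fast_query(pid: int) -> str:
    # Hypothetical privileged query. As an assumption for the example,
    # it is denied for low PIDs, mimicking system-owned processes.
    if pid < 100:
        raise AccessDenied(f"pid {pid}")
    return f"fast:{pid}"


def slow_query(pid: int) -> str:
    # Hypothetical fallback that needs fewer privileges but costs more time.
    return f"slow:{pid}"


def query(pid: int) -> str:
    """Try the fast path first, fall back to the slow one on AccessDenied."""
    try:
        return fast_query(pid)
    except AccessDenied:
        return slow_query(pid)


if __name__ == "__main__":
    print(query(4))     # takes the fallback path
    print(query(1234))  # takes the fast path
```

A non-admin user hits the fallback path for many more processes than an elevated user does, which is consistent with the timing difference reported in this issue.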

Some examples of "dual" APIs are:

smihaila commented 4 months ago

Thank you, @giampaolo. I wasn't aware of a dual implementation driving parts of the psutil package. Now that you've explained it, it makes perfect sense.

Now, assume that the process whose existence I wish to test (and from which I want additional info, such as virtual memory usage metrics or open IPv4 TCP sockets) is always owned by the same user account that invokes psutil. There are two sub-cases: that user account is either LOCAL SERVICE or a normal non-system user account. I have a rather stupid question:

Is there a way to check solely for such a process, and get info about it faster than with the psutil.process_iter() generator plus filtering logic? We know the faster Win32 API is always used in this case, but can it be made even faster by querying only for a specific process name? Or would the performance gain of a "more focused" query be minimal? It's like the difference, at the Win32 API / C++ level, between finding a process by name and enumerating all processes, which is non-negligible.

As @tduarte-dspc just showed very concisely (even when running under an admin account, which presumably engages the faster API even if that account is not LOCAL SERVICE or NT AUTHORITY\SYSTEM), enumerating all running processes to arrive at a negative / not found result is always noticeably slower than the positive / process found case. So, can the response time be made roughly the same in both the "not found" and "found" cases?

Thank you.

giampaolo commented 4 months ago

It depends on what criteria you use to identify the process you're looking for. Is it based on cmdline() (which has dual implementation)?

E.g. cmdline() has dual implementation, but username() doesn't. Assuming username() never fails with AccessDenied (which I don't know), and assuming you pre-emptively know that the process you're looking for is owned by your user, perhaps you can do something like (not tested):

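import psutil

# username() returns "DOMAIN\\user" on Windows, so take the current user's
# name from psutil itself rather than from os.getlogin().
myuser = psutil.Process().username()
mycmdline = ["C:\\python310\\python.exe", "foo.py"]

for p in psutil.process_iter():
    try:
        # username() is checked first so that cmdline() is only called
        # for processes owned by the current user.
        if p.username() == myuser and p.cmdline() == mycmdline:
            print(f"found {p}")
    except psutil.Error:
        # The process is gone or not accessible; skip it.
        pass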

With that said (mostly note to self): it would make sense to debug-log APIs which use the dual implementation, so one can identify performance bottlenecks by running psutil in PSUTIL_DEBUG mode: https://psutil.readthedocs.io/en/latest/#debug-mode
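For reference, debug mode can also be switched on from inside a script. This is a sketch under one assumption: that psutil reads PSUTIL_DEBUG when it is imported, so the variable has to be set before the import. The documented route is to set the variable in the shell before launching Python. `scan_names` is a hypothetical helper name, and which messages appear on stderr depends on the psutil version.

```python
import os

# Assumption: psutil checks PSUTIL_DEBUG at import time, so set it first.
os.environ["PSUTIL_DEBUG"] = "1"

import psutil  # noqa: E402  (imported after setting the variable on purpose)


def scan_names() -> int:
    """Hypothetical helper: walk all processes and return how many were seen.

    Any debug messages psutil emits during the scan go to stderr.
    """
    count = 0
    for _proc in psutil.process_iter(attrs=["name", "cmdline"]):
        count += 1
    return count


if __name__ == "__main__":
    print("processes visited:", scan_names())
```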

smihaila commented 4 months ago

Well, my question was mostly about finding a way to avoid iterating through the list of all processes, i.e. how to avoid for p in psutil.process_iter(): [...], via some hypothetical psutil.get_process_info(processName).

Probably it's not supported in the current psutil implementation. That's fine @giampaolo , and thank you for what you are doing, and for everyone's contribution to this project.
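One possible workaround for repeated lookups is to pay for the full scan once and then re-check the remembered Process object on later calls. The sketch below assumes the target name is unique, as described earlier in this issue; `find_process_cached` and `_cache` are hypothetical names, not psutil APIs.

```python
from typing import Dict, Optional

import psutil

# Hypothetical module-level cache: process name -> last Process found.
_cache: Dict[str, psutil.Process] = {}


def find_process_cached(name: str) -> Optional[psutil.Process]:
    """Return a running process called `name`, or None if there is none."""
    proc = _cache.get(name)
    if proc is not None:
        try:
            # is_running() also protects against the PID having been reused.
            if proc.is_running() and proc.name() == name:
                return proc
        except psutil.Error:
            pass
        # The cached process is gone; forget it and rescan.
        _cache.pop(name, None)

    # Slow path: one full enumeration.
    for proc in psutil.process_iter(attrs=["name"]):
        if proc.info["name"] == name:
            _cache[name] = proc
            return proc
    return None
```

This only speeds up the "found" case after the first hit. When the process is not running, every call still enumerates all processes, so it does not make the two cases take the same time.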

With everyone's agreement, I'll close this issue, since it has been shown to work as designed and is not a defect.

Thanks again for everybody's time, and all the best.