[Windows/Linux/Mac] fetch processor / process-scheduler queue metrics

YoshuaNava commented 4 years ago

It would be nice to fetch metrics about the number of processes in the queue, to know when the scheduler is likely to make context switches.

This is a feature that collectl has, and I think it's quite relevant for testing soft real-time systems.

giampaolo commented 4 years ago

> It would be nice to fetch metrics about the number of processes in the queue

Mmmm. What column of collectl is that? According to this, in order to calculate the queue length on Linux you'd have to parse /proc/schedstat and then:

Observe field 8 for each CPU and record the value.
Wait for some interval.
Observe field 8 for each CPU again, and calculate how much the value has increased.
Divide that difference by the length of the time interval waited.
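The steps above can be sketched roughly as follows. This is only an illustration, not psutil code: the function names are made up, and it assumes that field 8 of each "cpuN" line is the cumulative time tasks spent waiting to run, expressed in nanoseconds (older kernels may use a different unit, hence the `units_per_second` parameter).

```python
import time
from typing import Dict


def parse_run_delay(schedstat_text: str) -> Dict[str, int]:
    """Return {cpu name: field 8} for every "cpuN" line of /proc/schedstat."""
    waits = {}
    for line in schedstat_text.splitlines():
        parts = line.split()
        # parts[0] is the CPU name, so field 8 is parts[8].
        if len(parts) > 8 and parts[0].startswith("cpu"):
            waits[parts[0]] = int(parts[8])
    return waits


def queue_lengths(before: Dict[str, int], after: Dict[str, int],
                  interval_s: float,
                  units_per_second: float = 1e9) -> Dict[str, float]:
    """Divide the increase of field 8 by the interval, per CPU.

    Assumption: field 8 is in nanoseconds, so the result reads as the
    average number of tasks waiting on that CPU during the interval.
    """
    return {
        cpu: (after[cpu] - before[cpu]) / (interval_s * units_per_second)
        for cpu in after
        if cpu in before
    }


def measure(interval_s: float = 1.0) -> Dict[str, float]:
    """Linux only: sample /proc/schedstat twice, interval_s seconds apart."""
    with open("/proc/schedstat") as f:
        before = parse_run_delay(f.read())
    time.sleep(interval_s)
    with open("/proc/schedstat") as f:
        after = parse_run_delay(f.read())
    return queue_lengths(before, after, interval_s)
```

Calling `measure(1.0)` on a Linux box would block for one second and return one value per CPU.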

This is annoying because of the "interval" requirement. It would mean that on the first call this new function would return a meaningless result (0), similarly to psutil.cpu_percent(), which introduces another headache related to global variables, see #1703. But if this can also be done on at least Windows or OSX, maybe it's worth it.
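To illustrate the first-call problem, here is a minimal sketch of one way to keep the previous sample in an object instead of a module-level global. `RateSampler` and `read_counter` are hypothetical names, not part of psutil; the first call returns None rather than a misleading 0.

```python
import time
from typing import Callable, Optional, Tuple


class RateSampler:
    """Turn an ever-increasing counter into a per-second rate.

    `read_counter` is a hypothetical callable returning the current counter
    value; it stands in for whatever platform-specific code does the reading.
    """

    def __init__(self, read_counter: Callable[[], float],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._read = read_counter
        self._clock = clock
        # Previous (timestamp, counter value); None until the first sample.
        self._last: Optional[Tuple[float, float]] = None

    def sample(self) -> Optional[float]:
        now, value = self._clock(), self._read()
        last, self._last = self._last, (now, value)
        if last is None or now <= last[0]:
            return None  # first call, or no time elapsed: nothing to compare
        return (value - last[1]) / (now - last[0])
```

Each caller owns its own sampler, so two parts of a program polling at different rates do not disturb each other's state.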

Why exactly do you think this would be useful? Do you have a use case? I don't know much about the scheduler. Also /proc/schedstat provides a lot of info and I wonder what other things can be useful to get from there: https://www.kernel.org/doc/html/latest/scheduler/sched-stats.html

YoshuaNava commented 4 years ago

Hi Giampaolo, thank you for the prompt and thorough answer.

I use psutil to test the computational resource usage of robotics software that runs in a distributed architecture within the same PC. It uses ROS (Robot Operating System) as middleware to communicate via a modified TCP/IP stack.

At any point in time we might be running N nodes that belong to different classes, including device drivers, applications and system monitors. These processes might spawn multiple threads to handle communication and events, and to meet their spec.

Thus, it is interesting for us to know how the set of processes interact with each other, how the number of threads waiting for CPU time affects other processes' access to resources, and so on.

Having a measure of process lifetime would be interesting, and the process queue seemed like a good start.

giampaolo commented 4 years ago

Unfortunately I don't know enough about the scheduler to understand how useful it would be to know "how many processes are waiting to be executed". It seems useful to me in principle, but it requires more research / knowledge, so I will leave this open in case somebody wants to chime in and provide info re. tools which do this and how they are being used in production.

BTW, I think this may sort of overlap with psutil.getloadavg(). If you just want to fire an alarm in case the machine is "too overloaded", you may do something like:

import time

import psutil

while True:
    load = psutil.getloadavg()[0]  # 1-minute load average
    # Normalize by the number of logical CPUs (cpu_count() can return None).
    load_percent = load / (psutil.cpu_count() or 1) * 100
    if load_percent > 70:
        print("warning")
    time.sleep(60)  # wait 1 min

YoshuaNava commented 4 years ago

I understand. Thanks.

I will look into the command you sent me and get back to you.

Btw, I have another question, but it is more about the theory of how memory measurements are done. I don't know if I should open an issue for this type of question, which isn't exactly a bug or a feature request. What are your thoughts about this?

giampaolo commented 4 years ago

That's sort of a grey area. =) Theoretically you should use the mailing list, even though it's very low traffic: https://groups.google.com/forum/#!forum/psutil

YoshuaNava commented 4 years ago

I just tried to write a question in the Google Group, but I can't access it.

Update: after choosing the "Classic Groups Interface", I was able to access it.

giampaolo / psutil

[Windows/Linux/Mac] fetch processor / process-scheduler queue metrics #1773