giampaolo / psutil

Cross-platform lib for process and system monitoring in Python
BSD 3-Clause "New" or "Revised" License

Extend cpu_count() API to get sockets and NUMA nodes count #1392

Open s-m-e opened 5 years ago

s-m-e commented 5 years ago

I could not find anything related in the documentation. I am currently using a "hack" working on some Intel-based systems: Messing around with temperature sensors, but it really is not clean ...

len([
    None for sensor in psutil.sensors_temperatures()['coretemp']
    if 'Physical id' in sensor.label
    ])

EDIT: The above is working on Linux x86_64 for 4.4, 4.10 and 4.15 kernels.
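For comparison, a minimal Linux-only sketch that avoids the temperature sensors and instead counts distinct values of the kernel's `topology/physical_package_id` sysfs files. The function name `count_sockets` and its `sysfs_cpu_dir` parameter are made up for illustration and are not part of psutil; the reported IDs may also not reflect real sockets on virtual machines or some non-x86 systems.

```python
import glob
import os


def count_sockets(sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Return the number of distinct physical packages, or None if unknown.

    Hypothetical helper (not a psutil API). Linux only.
    """
    pattern = os.path.join(
        sysfs_cpu_dir, "cpu[0-9]*", "topology", "physical_package_id"
    )
    package_ids = set()
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                package_ids.add(f.read().strip())
        except OSError:
            # The CPU may have gone offline or the file may be unreadable.
            continue
    # Mirror os.cpu_count(): None when nothing could be determined.
    return len(package_ids) or None
```

The directory is a parameter only so the function can be pointed at a fake tree in tests.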

giampaolo commented 5 years ago

Yes, check out psutil.cpu_count().

s-m-e commented 5 years ago

@giampaolo Maybe I did not make myself clear: psutil.cpu_count() gives the number of CPU cores. I am interested in the number of actual CPUs (not their cores), equivalent to the number of sockets. This is interesting when running on servers with for instance two (or more) CPUs (i.e. in two sockets). In this case, psutil.cpu_count() will return the combined core count of both, without any information on how many CPUs the system actually has and how many cores each individual CPU has.

giampaolo commented 5 years ago

Ah you mean the number of physical sockets. Generally speaking one wants cpu_count(logical=True), which is the same as os.cpu_count() and which includes hyper-threaded CPUs. That is generally useful in a multi processing app. psutil does a bit more and allows cpu_count(logical=False) which returns the number of cores. The number of physical sockets is not implemented, basically because it's a rare use case.

s-m-e commented 5 years ago

@giampaolo Sorry for the confusion and thanks for your reply. A lot of servers, cloud or otherwise, tend to have more than one socket - and the bandwidth between the sockets is usually significantly lower than between the cores of one individual CPU. This is why you want to avoid spreading certain parallel workloads across multiple sockets. Having this information from psutil would be incredibly helpful :)

giampaolo commented 4 years ago

Re-opening given the recent discussion at https://github.com/giampaolo/psutil/pull/1727. It turns out that knowing the number of CPU sockets is desirable after all. The point is how to expose this in terms of API. Alex @amanusk suggests: https://github.com/giampaolo/psutil/pull/1727#issuecomment-699450658 Also, we have another possible API addition re. the number of NUMA nodes (#1610) which should probably be taken into account in terms of API design.

giampaolo commented 4 years ago

OK, here's a bit of brainstorming. Currently psutil is able to return logical (hyper-threading) and physical cores. The goal is to provide the CPU sockets count (and possibly others). IMO, the ideal API if we were to start from scratch today would be a single function accepting a kind parameter, similar to psutil.net_connections(kind='all'). That would be simple and extensible in terms of backward compatibility. It would look like this:

# number of logical / hyper-threading CPU cores, same as os.cpu_count()
psutil.cpu_count(kind="logical")  

# number of physical cores (currently supported on all platforms except OpenBSD and NetBSD)
psutil.cpu_count(kind="cores")

# number of sockets
psutil.cpu_count(kind="sockets")  

# number of usable CPUs, aka len(os.sched_getaffinity(0)) on Linux
# or len(psutil.Process().cpu_affinity())
psutil.cpu_count(kind="usable")

# number of NUMA nodes
psutil.cpu_count(kind="numa")

(similarly to os.cpu_count(), if the value can't be determined we'll just return None)

The current function signature unfortunately is:

psutil.cpu_count(logical=True)

What we MAY do in order to keep supporting logical parameter and avoid code breakage is this:

psutil.cpu_count(kind='logical', logical=None)

If the function is invoked as such we will assume the user is asking for logical cores:

>>> psutil.cpu_count()
8
>>> psutil.cpu_count(True)
DeprecationWarning('use of boolean as first parameter is deprecated, use kind="logical"')
8
>>> psutil.cpu_count(logical=True)
DeprecationWarning('"logical" parameter is deprecated, use kind="logical"')
8

If the function is invoked as such we will assume the user is asking for physical cores:

>>> psutil.cpu_count(False)
DeprecationWarning('use of boolean as first parameter is deprecated; use kind="cores"')
4
>>> psutil.cpu_count(logical=False)
DeprecationWarning('"logical" parameter is deprecated; use kind="cores"')
4
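The argument handling described in this comment could be sketched roughly as follows. This is an illustration only, not psutil's implementation: `resolve_kind`, `_KINDS`, and the stub backend `_count` are hypothetical names, and only the "logical" kind is backed by a real call (`os.cpu_count()`). In this sketch the legacy `logical` keyword wins if both arguments are passed, which is an assumption the thread does not settle.

```python
import os
import warnings

# Kinds taken from the proposal above; the set itself is hypothetical.
_KINDS = {"logical", "cores", "sockets", "usable", "numa"}


def resolve_kind(kind="logical", logical=None):
    """Map legacy boolean arguments onto the proposed `kind` strings."""
    if isinstance(kind, bool):
        # Legacy positional call: cpu_count(True) / cpu_count(False).
        new_kind = "logical" if kind else "cores"
        warnings.warn(
            'use of boolean as first parameter is deprecated; '
            'use kind="%s"' % new_kind,
            DeprecationWarning,
            stacklevel=3,
        )
        return new_kind
    if logical is not None:
        # Legacy keyword call: cpu_count(logical=True / False).
        new_kind = "logical" if logical else "cores"
        warnings.warn(
            '"logical" parameter is deprecated; use kind="%s"' % new_kind,
            DeprecationWarning,
            stacklevel=3,
        )
        return new_kind
    if kind not in _KINDS:
        raise ValueError("invalid kind %r" % (kind,))
    return kind


def _count(kind):
    # Stub backend: a real implementation would dispatch per platform.
    return os.cpu_count() if kind == "logical" else None


def cpu_count(kind="logical", logical=None):
    """Return the requested count, or None if it cannot be determined."""
    return _count(resolve_kind(kind, logical))
```

Keeping the argument translation in its own function makes the deprecation paths easy to test independently of any platform code.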

santagada commented 3 years ago

I think calling all that cpu just makes the code more confusing, why not: numa_count(), group_count(), socket_count(), cpu_count(group=, numa=) ?

Then you can get the number of logical/physical cpus in a group or numa node and also the numa and group count. For windows you still need to know in which group is each numa node so something like numa_group(numa=) is also needed.

A NUMA node has a group (for CPUs with more than 64 logical cores in the same NUMA node, Windows actually creates virtual NUMA nodes) and an affinity mask within that group, because machines with fewer than 64 total logical CPUs but more than one NUMA node sometimes get their CPUs as different affinity masks within the same group.

E.g., a 2-socket machine with two 20-logical-core dies will get 2 NUMA nodes, 1 group, and an affinity mask covering the first 20 logical cores for NUMA node 1 and the rest for NUMA node 2.

giampaolo commented 3 years ago

@santagada

> I think calling all that cpu just makes the code more confusing, why not: numa_count(), group_count(), socket_count(), cpu_count(group=, numa=)?

Barring a few exceptions, all APIs start with cpu_*, disk_*, net_*, sensors_*, ... prefix, so keeping the cpu_* part is convenient in that sense (+ we won't have to deprecate cpu_count()).

> Then you can get the number of logical/physical cpus in a group or numa node and also the numa and group count. For windows you still need to know in which group is each numa node so something like numa_group(numa=) is also needed.

Mmm, that complicates things quite a bit. I'm not sure how to express that in terms of API. To my understanding, and judging from lscpu output on Linux, we have 2 kinds of info: the number of NUMA nodes and which CPUs are in each node (what you call "groups", I suppose), e.g.:

$ lscpu
....
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15

Is Windows different? Even on Linux, though, I'm not sure how to express that in terms of API because we're dealing with 2 different types (int and list). That suggests a separate function would perhaps be more appropriate. Maybe:

>>> psutil.cpu_count(kind="numa_nodes")  # maybe not necessary at all?
2
>>> psutil.cpu_numa_nodes()
{0: [0, 2, 4, 6, 8, 10, 12, 14], 1: [1, 3, 5, 7, 9, 11, 13, 15]}

CC-ing @amanusk just in case he wants to chime in.
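On Linux, the node-to-CPUs mapping shown above could be built from the `cpulist` files under `/sys/devices/system/node`. The following is a rough sketch under that assumption; `cpu_numa_nodes` here only borrows the name floated in this comment, and `parse_cpulist` and the `sysfs_node_dir` parameter are hypothetical. It does not address Windows.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a kernel CPU list such as '0-3,8,10-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a memory-only node
        if "-" in part:
            start, end = part.split("-")
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpu_numa_nodes(sysfs_node_dir="/sys/devices/system/node"):
    """Return {node_id: [cpu, ...]} read from sysfs; empty dict if unavailable."""
    nodes = {}
    pattern = os.path.join(sysfs_node_dir, "node[0-9]*", "cpulist")
    for path in glob.glob(pattern):
        match = re.search(r"node(\d+)$", os.path.dirname(path))
        if match is None:
            continue
        try:
            with open(path) as f:
                nodes[int(match.group(1))] = parse_cpulist(f.read())
        except OSError:
            continue
    return nodes
```

With a mapping like this, the NUMA node count is simply `len(cpu_numa_nodes())`, which is one argument for exposing the mapping and deriving the count from it.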

dbwiddis commented 3 years ago

> Is Windows different?

Windows has the additional complication of processor groups, which complicate the node numbering. You can have processor numbers 0-63 on group 0, and 0-63 on group 1, for example. When combining with NUMA nodes, the numbering for OS lookup in the counters, etc. is tied to the NUMA node, not the processor number, so each of NUMA nodes 0 thru 3 would have processors 0-31, for example.

See https://github.com/oshi/oshi/issues/1373 for some background.

The canonical (?) enumeration in Windows is GetLogicalProcessorInformationEx() which receives an array of SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX structures connecting the processors, processor groups, and NUMA nodes. The processor "numbering" is via 64-bit bitmask (per processor group) and is not guaranteed to be consecutive (e.g., a 96-core system would have 0-47 in group 0 and 0-47 in group 1), which may or may not match the numa node numbering.
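To illustrate the per-group numbering, here is a small sketch that expands one (group, 64-bit mask) pair into explicit (group, index) processor identifiers. It is pure Python and does not call any Windows API; `cpus_from_group_mask` is a hypothetical helper, and the assumption is that the mask and group number have already been obtained from the structures mentioned above.

```python
def cpus_from_group_mask(group, mask, group_size=64):
    """Return (group, index) pairs for every bit set in an affinity mask.

    Hypothetical helper: `group` is the processor group number and `mask`
    is that group's affinity bitmask as a Python int.
    """
    return [
        (group, bit)
        for bit in range(group_size)
        if (mask >> bit) & 1
    ]


# The 96-core example above: each group holds processors 0-47.
group0 = cpus_from_group_mask(0, (1 << 48) - 1)
group1 = cpus_from_group_mask(1, (1 << 48) - 1)
```

Because indices repeat across groups, a flat list of integers is ambiguous on such systems; the pair form keeps processors in different groups distinct.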

ReubenM commented 2 years ago

> Is Windows different? Even on Linux, though, I'm not sure how to express that in terms of API because we're dealing with 2 different types (int and list). That suggests a separate function would perhaps be more appropriate. Maybe:
>
> >>> psutil.cpu_count(kind="numa_nodes")  # maybe not necessary at all?
> 2
> >>> psutil.cpu_numa_nodes()
> {0: [0, 2, 4, 6, 8, 10, 12, 14], 1: [1, 3, 5, 7, 9, 11, 13, 15]}

I think it would be more useful to have a more generic psutil.cpu_attributes that would allow more than just NUMA attributes to be associated with each processing unit. You could use the same "kind" argument for it as well, to specify the structural level at which the processing unit exists, since the attributes that describe a physical socket will be different than those for a logical core, for example. This would allow exposing quite a bit of the useful info that one would find from lscpu.

At that point psutil.cpu_count(kind='foo') basically turns into len(psutil.cpu_attributes(kind='foo')).

I wanted to add that if NUMA support is added, please include attributes for network interfaces as well, to indicate which NUMA node they are attached to. In Linux this is found in /sys/class/net/${interface_name}/device/numa_node or /sys/devices/${pci_domain_bus_slot_path}/numa_node
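Reading the first of those paths could look roughly like this. It is a sketch only: `nic_numa_node` and its `sysfs_net_dir` parameter are hypothetical names, not psutil APIs, and the code assumes the Linux convention that a value of -1 means the device has no NUMA affinity.

```python
import os


def nic_numa_node(interface, sysfs_net_dir="/sys/class/net"):
    """Return the NUMA node a network interface is attached to, or None.

    Hypothetical helper (not a psutil API). Linux only.
    """
    path = os.path.join(sysfs_net_dir, interface, "device", "numa_node")
    try:
        with open(path) as f:
            node = int(f.read().strip())
    except (OSError, ValueError):
        # Virtual interfaces such as "lo" have no backing device entry.
        return None
    # A negative value means no NUMA affinity is reported for the device.
    return node if node >= 0 else None
```

Returning None for both the missing-file and the -1 case matches the "None if it cannot be determined" convention discussed earlier in the thread.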

IritKoll commented 1 year ago

Hi, has there been any progress on this in recent psutil releases?