zhengzheng / psutil

Automatically exported from code.google.com/p/psutil
Other
0 stars 0 forks source link

Can't set CPU affinity when having more than 63 cores #443

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
The current code does not support systems with more than 63 cores since this 
triggers the following exception when setting affinity masks:

  File "utils/osutils.py", line 199, in set_affinity
    p.set_cpu_affinity(tuple(cpus))
  File "/home/ansan501/.local/lib/python2.7/site-packages/psutil/__init__.py", line 546, in set_cpu_affinity
    return self._platform_impl.set_process_cpu_affinity(cpus)
  File "/home/ansan501/.local/lib/python2.7/site-packages/psutil/_pslinux.py", line 445, in wrapper
    return fun(self, *args, **kwargs)
  File "/home/ansan501/.local/lib/python2.7/site-packages/psutil/_pslinux.py", line 820, in set_process_cpu_affinity
    _psutil_linux.set_process_cpu_affinity(self.pid, bitmask)
OverflowError: Python int too large to convert to C long

The workaround I'm using at the moment changes the interface to the C code. 
Instead of generating a bit mask in Python, it takes a sequence of CPU IDs and 
builds the cpu_set_t structure using the CPU_SET macros.

I've attached a patch that updates the Linux implementation.

Original issue reported on code.google.com by andr...@sandberg.pp.se on 31 Oct 2013 at 6:24

Attachments:

GoogleCodeExporter commented 8 years ago
Please try this and tell me if it helps:

diff --git a/psutil/_psutil_linux.c b/psutil/_psutil_linux.c
--- a/psutil/_psutil_linux.c
+++ b/psutil/_psutil_linux.c
@@ -278,7 +278,7 @@
     unsigned int len = sizeof(mask);
     long pid;

-    if (!PyArg_ParseTuple(args, "ll", &pid, &mask)) {
+    if (!PyArg_ParseTuple(args, "lk", &pid, &mask)) {
         return NULL;
     }
     if (sched_setaffinity(pid, len, (cpu_set_t *)&mask)) {

I'm also attaching a modified version of your patch fixing a couple of issues 
but I would prefer to keep calculating the bitmask in Python (as opposed to 
using macros in C) as it's easier to read/maintain. 

Original comment by g.rodola on 8 Nov 2013 at 2:24

Attachments:

GoogleCodeExporter commented 8 years ago
Up?

Original comment by g.rodola on 13 Nov 2013 at 8:54

GoogleCodeExporter commented 8 years ago
I'll try to look into it end of next week. I'm pretty busy at the moment, but 
there should be some slack by then.

Original comment by andr...@sandberg.pp.se on 13 Nov 2013 at 12:29

GoogleCodeExporter commented 8 years ago
Thanks.

Original comment by g.rodola on 13 Nov 2013 at 12:48

GoogleCodeExporter commented 8 years ago
I committed this change as of revision af898681e7be. 
I'm going to consider this closed for now. Let me know if I'm mistaken.

Original comment by g.rodola on 22 Nov 2013 at 9:30

GoogleCodeExporter commented 8 years ago
Closing.

Original comment by g.rodola on 26 Nov 2013 at 8:52

GoogleCodeExporter commented 8 years ago

Original comment by g.rodola on 26 Nov 2013 at 8:52

GoogleCodeExporter commented 8 years ago
I just got around to test it and it seems to work fine on our machine. But, 
isn't your fix going to break as soon as someone runs this on a machine with 
more than 64 threads? Besides, CPU_SET(3) is pretty clear on how cpu_set_t 
structures should be constructed: "...the data structure treated as considered 
opaque: all manipulation of CPU sets should be done via the macros described in 
this page.".

Original comment by andr...@sandberg.pp.se on 26 Nov 2013 at 3:33

GoogleCodeExporter commented 8 years ago
> But, isn't your fix going to break as soon as someone runs this on a machine 
with more than 64 threads?

Mmm... Why do you think so?

Original comment by g.rodola on 26 Nov 2013 at 4:50

GoogleCodeExporter commented 8 years ago
If I understand things correctly, the bitmask from Python is going to be 
converted to a 64-bit unsigned long, hence the 64 threads limit.

IIRC, the by default the recent Linux implementations use a 1024-bit cpu_set_t, 
but this can be extended to arbitrarily sizes by using CPU_ALLOC.

Original comment by andr...@sandberg.pp.se on 26 Nov 2013 at 4:59

GoogleCodeExporter commented 8 years ago
I think you may be right. Committed the C version in revision e0df8ab562a6.

Original comment by g.rodola on 3 Dec 2013 at 8:51