kiranvizru / psutil

Automatically exported from code.google.com/p/psutil

task id / pthread_self() id #186

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
On Linux, and possibly other platforms with a /proc interface, 
psutil.get_threads() returns thread information keyed by the kernel's task id 
(TID). Python threads, however, use a different kind of identifier, the one 
returned by pthread_self(). 

I'm interested in getting per thread CPU utilization. psutil gives me that 
information but I can't map the data to Python threads because POSIX thread ids 
can't be mapped to TIDs. At least I haven't found a simple way yet.

The attached patch implements gettid() (a Linux specific syscall) for psutil. 
With gettid() a thread can get its own CPU utilization.

Original issue reported on code.google.com by tiran79 on 12 Jul 2011 at 10:28

GoogleCodeExporter commented 9 years ago
How exactly do you intend to map threads? 
The thread.get_ident [1] doc says:

> Return the ‘thread identifier’ of the current thread. This is a nonzero integer. 
> Its value has no direct meaning; it is intended as a magic cookie to be used e.g. 
> to index a dictionary of thread-specific data.

...hence I'm not sure you can use it to map Python threads.

[1] http://docs.python.org/library/thread.html#thread.get_ident

Original comment by g.rodola on 12 Jul 2011 at 11:38

GoogleCodeExporter commented 9 years ago
Python uses pthread_self() as the thread identifier on Linux and other pthread 
platforms. On Linux, pthreads are built on top of clone(). Cloned tasks 
share the same PID but each has a different TID (task id). The /proc kernel 
interface exposes only the low-level kernel tasks and their TIDs, not the 
pthread identifier. The __NR_gettid syscall returns the calling thread's TID.

On Windows, psutil's get_threads() and Python's threading API use the 
same thread identifiers. thread_nt.h implements PyThread_get_thread_ident() 
with GetCurrentThreadId().

>>> p = psutil.Process(os.getpid())
>>> p.get_threads()
[thread(id=548, user_time=0.6875, system_time=0.1875), thread(id=2656, user_time=0.0, system_time=0.0)]
>>> threading.enumerate()
[<Thread(Thread-1, started daemon 2656)>, <_MainThread(MainThread, started 548)>]

I can't test psutil on other platforms.

We have hooks in our application that run inside each thread whenever it 
starts and terminates. I'm working on a PEP with a similar interface for 
Python. In the meantime people could monkey patch 
threading.Thread.__bootstrap(), too. I'd like to have gettid (or similar) in 
psutil because I find it tedious to maintain a C extension module just for this 
one function. ctypes isn't a good option because the __NR_gettid syscall number 
is architecture specific: x86_64 and x86 have different numbers, as do ARM, PPC 
etc.

Original comment by tiran79 on 13 Jul 2011 at 12:06

GoogleCodeExporter commented 9 years ago
AFAICT both GetCurrentThreadId() and syscall( __NR_gettid ) refer to the 
current process (os.getpid()), hence cannot be used in Process class which is 
supposed to be used with *any* process/pid.

Also, it's not clear to me what you would achieve by exposing gettid() alone. 
Could you provide a practical example of how you would retrieve the CPU times of 
a thread by using gettid()/psutil?

Perhaps I'm misunderstanding you but I have a feeling this is not in the realm 
of problems which should be dealt with by base psutil.

Original comment by g.rodola on 13 Jul 2011 at 12:51

GoogleCodeExporter commented 9 years ago
Ah, you didn't get my initial use case. I'm sorry for the misunderstanding.

psutil 0.3 gives me detailed CPU usage information for each thread of a 
process. I'd like to map the data to Python threads for the current Python 
process in order to find CPU intensive threads. All threads in our application 
have meaningful names.

On Windows it's trivial to map the CPU usage information to Python threads 
because psutil's thread information and threading.enumerate() share the same 
thread identifiers. On Linux I need the value returned by gettid() for each 
thread.

Practical example:
On Linux I monkey patch the threading.Thread class so that every thread stores 
its TID in the threading.Thread instance. With the additional information I can 
now map Python's threading.Threads to psutil's thread infos.

Everything except gettid can be implemented easily in Python. Therefore I 
suggest that psutil implement a function that returns, on each platform, the 
thread id that get_threads() reports. On Linux that's gettid; on Windows it's 
thread.get_ident(). I don't have access to BSD to test it there.
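
The practical example above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: it relies on Thread.native_id (Python 3.8 and later, so not available when this thread was written) instead of a monkey patch, and ThreadInfo and name_cpu_times() are hypothetical names standing in for psutil's per-thread records and the mapping step.

```python
import threading
from collections import namedtuple

# Stand-in for a per-thread record: kernel thread id plus CPU times.
ThreadInfo = namedtuple("ThreadInfo", "id user_time system_time")


def name_cpu_times(thread_infos):
    """Map per-thread CPU times of the current process to Python thread names.

    Records whose id does not belong to a live Python thread are skipped.
    """
    # native_id is the kernel-level thread id (the TID on Linux).
    names = {t.native_id: t.name for t in threading.enumerate()}
    return {
        names[info.id]: (info.user_time, info.system_time)
        for info in thread_infos
        if info.id in names
    }
```

With a recent psutil, the input could presumably come from psutil.Process().threads(), whose records carry the same three fields. Threads not started by Python have no entry in threading.enumerate() and are dropped here.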

Original comment by tiran79 on 13 Jul 2011 at 1:38

GoogleCodeExporter commented 9 years ago
What about processes != os.getpid()?

Original comment by g.rodola on 13 Jul 2011 at 1:43

GoogleCodeExporter commented 9 years ago
My proposal and use case is restricted to introspection of the current process.

For other processes there isn't a (simple) way to access Python's thread 
metadata from threading.enumerate() either.

Original comment by tiran79 on 13 Jul 2011 at 1:51

GoogleCodeExporter commented 9 years ago
Then it's not something which is up to psutil, imo.

Original comment by g.rodola on 13 Jul 2011 at 2:15

GoogleCodeExporter commented 9 years ago

Original comment by g.rodola on 29 Jul 2011 at 2:08