bloomberg / memray

Memray is a memory profiler for Python
https://bloomberg.github.io/memray/
Apache License 2.0

Include thread name in Memray live tracking view #561

Closed gibsondan closed 4 months ago

gibsondan commented 6 months ago

Is there an existing proposal for this?

Is your feature request related to a problem?

memray attach is incredibly useful, but for multi-threaded applications there is a gap: once you've determined that a particular thread is the source of a memory issue, it is hard to work out which thread in the application corresponds to the "Thread X of Y" number displayed in the live view.

Describe the solution you'd like

Each thread has a string name. It would be very useful if that name were included alongside the "Thread X of Y" output in the live view.

Alternatives you considered

We can use py-spy to map thread numbers to thread names (I think?), but it would be much more convenient to have that information available directly in memray.

Thanks for building an amazing tool!

pablogsal commented 6 months ago

Thanks for opening an issue @gibsondan!

We already collect the thread name, so it should be fairly simple to pass it through the layers and display it in the TUI.

godlygeek commented 6 months ago

"Each thread has a string name"

Can you clarify which string name you're referring to, @gibsondan?

gibsondan commented 6 months ago

@godlygeek when you create a thread in Python, you can supply a name: https://docs.python.org/3/library/threading.html#thread-objects

(Or a ThreadPoolExecutor can specify a prefix, via thread_name_prefix, which is applied to all threads created in that pool.)
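For reference, a minimal sketch of both ways of naming threads. The pool prefix "pool_worker" and the helper current_name are made up for illustration:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def current_name():
    # Return the Python-level name of whichever thread runs this function.
    return threading.current_thread().name


# Explicit name on a single thread (the name itself is illustrative).
seen = []
worker = threading.Thread(
    target=lambda: seen.append(current_name()),
    name="schedule_daemon_worker_7",
)
worker.start()
worker.join()

# One shared prefix for every thread the pool creates; the executor
# appends its own numeric suffix to it.
with ThreadPoolExecutor(max_workers=1, thread_name_prefix="pool_worker") as pool:
    pool_name = pool.submit(current_name).result()

print(seen[0])
print(pool_name)
```

These are names held on the Python Thread objects. Whether they also reach the operating system is a separate question.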

I don't know offhand how that is translated into the operating system internals, but I suspect it's through something like this: https://man7.org/linux/man-pages/man3/pthread_setname_np.3.html

py-spy, for example, includes these names in its output via commands like py-spy dump or the flamegraph produced by py-spy record:

Thread 41 (idle): "schedule_daemon_worker_7"

godlygeek commented 6 months ago

We do have access to the name set through pthread_setname_np, but we don't have access to the name set using threading.Thread.name. Setting the name on a Python Thread object doesn't automatically cause pthread_setname_np to be called with the same name (or even a leading prefix of it).

That said, it would be interesting to try to collect the name attribute set on a threading.Thread... I wonder if we're able to...
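As a rough illustration of what collecting that attribute could look like from inside the process, the sketch below maps each live thread's OS-level thread id to its Python-level name using only the standard threading module. It is not how Memray works and says nothing about how a profiler attached from outside would obtain the same data. The helper python_thread_names is hypothetical, and Thread.native_id requires Python 3.8+ on a platform that supports it:

```python
import threading


def python_thread_names():
    """Map each live thread's native (OS) thread id to its Python-level name."""
    return {t.native_id: t.name for t in threading.enumerate()}


# Keep a named worker alive long enough to take the snapshot.
stop = threading.Event()
worker = threading.Thread(target=stop.wait, name="schedule_daemon_worker_7")
worker.start()

names = python_thread_names()

stop.set()
worker.join()

for native_id, name in names.items():
    print(native_id, name)
```

A table like this, keyed by native thread id, is the kind of lookup that would let a "Thread X of Y" entry be labelled with the name given to threading.Thread.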