This sounds nice, but it seems very hard, if not impossible, to implement in all
cases. Let me explain: we hold a global table of function statistics.
Every time we see an exit from a function, we simply accumulate the
timing information for the function and the thread. We get the current thread
from the thread id that the OS provides us. Here is the problem: a thread
with thread id X calls function A and finishes. A new thread with the
same thread id calls function B, and yappi has no way of differentiating
these threads, as they share the same thread id. BTW, this problem currently
exists when saving/loading yappi results from files: we save some stats and
then run yappi again and merge the stats with the saved ones, and as we still
rely on thread ids being unique, we will have problems. That is why I have not
implemented saving/loading of thread results: it would be impossible to
accurately merge two different threads' stats in yappi's current implementation.
Needless to say, I am open to new ideas.
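The ident-reuse hazard described above can be observed with a small stdlib-only sketch. Whether the OS actually recycles the id is platform-dependent, so the script only reports what happened rather than assuming either outcome:

```python
import threading

idents = []

def record():
    # remember the OS-level thread id this worker was given
    idents.append(threading.current_thread().ident)

# run two short-lived threads strictly one after the other
for _ in range(2):
    t = threading.Thread(target=record)
    t.start()
    t.join()

# nothing guarantees idents[0] != idents[1] once the first thread has
# exited -- which is exactly what makes ident unusable as a stable key
print("reused" if idents[0] == idents[1] else "distinct")
```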
Original comment by sum...@gmail.com
on 23 Oct 2014 at 10:05
Ok, I get that the problem is that thread.ident may be reused by another
thread... In the PyDev debugger, what I do to make sure things are unique
is to put some information on the threading.Thread object itself, so that I
really know which thread I'm dealing with.
I.e.:

    t = threading.currentThread()  # Or however you get the Thread (maybe threading._active?)
    try:
        yappi_thread_id = t.yappi_thread_id
    except AttributeError:
        yappi_thread_id = t.yappi_thread_id = next_thread_id()

As for accumulating: the point would be to accumulate into that same table,
but use t.yappi_thread_id as the key. Since you can already join statistics,
it should be possible to iterate over those to keep the same API when no
thread id is passed.
It would also fix the issue you currently have...
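A minimal self-contained version of that idea might look as follows. Note that next_thread_id is a hypothetical helper (it is not part of yappi or the snippet above), sketched here as a locked process-wide counter:

```python
import itertools
import threading

# hypothetical helper: a counter that never hands out the same value
# twice, unlike the OS-provided thread ident
_counter = itertools.count(1)
_counter_lock = threading.Lock()

def next_thread_id():
    with _counter_lock:
        return next(_counter)

def get_yappi_thread_id():
    t = threading.current_thread()
    try:
        return t.yappi_thread_id
    except AttributeError:  # first time we see this Thread object
        t.yappi_thread_id = next_thread_id()
        return t.yappi_thread_id
```

Repeated calls from the same thread return the same id, while every distinct Thread object gets a fresh one, even if the OS recycles its ident.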
I can also think of another option: you could monkey-patch
thread.start_new_thread to start a function of your own (instead of the
Thread.run function). Then you would *know* when a thread is created/stopped,
so you can build the mapping from Thread.ident to your own internal
yappi thread id, and when a thread is stopped -- when your function leaves --
you can make sure that your yappi thread id won't be reused even if the
thread ident is the same (and the first time one is stopped, since you're in
C, you can take the global lock, stop the Python world and build the initial
mapping for the already-active threads).
I do something like that monkey-patching in the PyDev debugger, but in my
case just to get the tracing facility going in cases where
sys.settrace/sys.setprofile wouldn't work (i.e.: if a thread is started
directly through the thread module, the hooks from sys aren't called):
https://github.com/fabioz/PyDev.Debugger/blob/development/pydev_monkey.py (in
patch_thread_module -- the only thing is that you would want to save the
original function under another name -- not _original_start_new_thread -- so
that we don't get a conflict between the PyDev debugger and yappi).
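As a rough pure-Python sketch of that option (the real thing would live in yappi's C layer; all names here are made up for illustration), one could wrap the low-level start_new_thread so that thread creation and termination become observable events:

```python
import _thread  # the low-level "thread" module (renamed in Python 3)
import time

_original_start_new_thread = _thread.start_new_thread
events = []  # (event, os_ident) records; a real profiler would map
             # os_ident to a never-reused internal id here

def _bookkeeping_start_new_thread(function, args=(), kwargs=None):
    # same signature as start_new_thread, but wraps the target so we
    # see both thread birth and death
    def wrapper():
        ident = _thread.get_ident()
        events.append(("created", ident))   # assign internal id here
        try:
            if kwargs is None:
                return function(*args)
            return function(*args, **kwargs)
        finally:
            events.append(("stopped", ident))  # retire internal id here
    return _original_start_new_thread(wrapper, ())

_thread.start_new_thread = _bookkeeping_start_new_thread
try:
    _thread.start_new_thread(lambda: None, ())
    # crude wait: threads started this way have no join(), so poll the log
    deadline = time.time() + 5
    while len(events) < 2 and time.time() < deadline:
        time.sleep(0.01)
finally:
    _thread.start_new_thread = _original_start_new_thread

print(events)
```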
What do you think?
Original comment by fabi...@gmail.com
on 23 Oct 2014 at 10:42
Hmm, I think creating custom thread ids might work. The good part is that we
can probably use set_context_id_callback() for exactly this purpose, so
hopefully we won't need to monkey-patch anything.
If the callback is set, it is called every time yappi needs to retrieve the
current thread id from Python, so it will be simple to put our custom
thread-id logic there. However, I need to think about it more, as it changes
the core a bit. Also, note that this solution only applies to
threading.Thread()-derived threads; yappi also supports other threading
libraries via these context_id_callback() hooks, so we need to make sure that
there is no hardcoded logic in the C API -- everything should be configurable
by the user.
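A sketch of what such a callback could look like (this is a guess at the wiring; the actual registration call needs yappi installed and is shown commented out so the snippet stays stdlib-only). Two threads that run one after the other may collide on their OS idents, but the context ids handed out never do:

```python
import itertools
import threading

_ids = itertools.count(1)

def context_id_callback():
    # what yappi would invoke (via set_context_id_callback) whenever it
    # needs the current context id: tag each Thread on first sight
    t = threading.current_thread()
    if not hasattr(t, "_yappi_ctx_id"):
        t._yappi_ctx_id = next(_ids)
    return t._yappi_ctx_id

# registration would be (requires yappi):
# import yappi
# yappi.set_context_id_callback(context_id_callback)

seen = []

def worker():
    seen.append(context_id_callback())

# run two threads strictly one after the other; their OS idents may be
# recycled, but the context ids stay distinct
for _ in range(2):
    t = threading.Thread(target=worker)
    t.start()
    t.join()

print(seen)  # prints [1, 2]
```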
The solution you suggested will probably work, and it will also obsolete
issue 50 that you opened, as all the retrieved stat information can be
separated by thread, which is, I must admit, pretty useful.
Again: I need some time to think about it, and unfortunately my full-time
day job saturates me completely these days :)
Original comment by sum...@gmail.com
on 23 Oct 2014 at 7:39
Sorry: I meant Issue 49 instead of Issue 50.
Original comment by sum...@gmail.com
on 23 Oct 2014 at 7:42
I agree with you (having this can obsolete issue 49)... and will be pretty
useful :)
As for the time, no worries, I definitely know what you're talking about :)
Original comment by fabi...@gmail.com
on 23 Oct 2014 at 11:00
I have been thinking about different scenarios, and it seems we have a
problematic one. Think about the following:

1) yappi.start()
2) do_stuff()
3) yappi.save()  # save the stats
4) open a new Python instance
5) start()
6) do_stuff()
7) add() the saved stats to the current ones. Thread._yappi_thread_id may
   match the previously saved thread ids, and the merge will fail to
   differentiate the saved and running thread stats.

In fact, we will always have a chance of failure whenever we try to merge
stat results from different Python processes, because we lose the uniqueness
of the thread id. This leaves us two options:

1) We always assume saved threads are "different" from each other: no merge
   operation takes place on thread stats *if* they have been saved.
2) We somehow identify a "profile session" -- use a pid/time/machine-name
   combination to generate a unique profile_session_id. This may not be 100%
   reliable, but it is at least something.

What do you think?
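Option 2 could be sketched like this. Note profile_session_id is a made-up name; the hostname/pid/time mix is the combination suggested above, hashed only to get a compact token (and, as noted, not 100% collision-proof):

```python
import hashlib
import os
import socket
import time

def profile_session_id():
    # combine machine name, pid and wall-clock time so that two profile
    # sessions -- even on the same machine -- get distinct ids
    raw = "%s:%d:%.6f" % (socket.gethostname(), os.getpid(), time.time())
    return hashlib.sha1(raw.encode("ascii", "replace")).hexdigest()[:16]

print(profile_session_id())
```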
Other than that, one more implementation detail bothers me: to set
current_thread._yappi_thread_id, we of course need to get the current thread
object first. To accomplish this, we could use threading.currentThread() or
threading._active, as you mentioned. However, reaching
threading.currentThread() from a bootstrapping thread deadlocks on some
occasions (with 3rd-party libs like pthreading; see issue 48 for that). If
this is called early enough, we may sometimes fail. We need to make sure we
retrieve currentThread() in a lock-free way; I have seen issues with this
kind of usage.
Any other suggestion for retrieving that value? Maybe from the C API? I have
not checked...
Original comment by sum...@gmail.com
on 24 Oct 2014 at 3:44
Well, I think I can use PyThreadState_GetDict() to store my own custom
thread_id in the actual ThreadState object. This way, I won't be calling
into Python at all.
So you can ignore the second part of the problem, related to the
currentThread() value.
Original comment by sum...@gmail.com
on 24 Oct 2014 at 4:18
We can use the obj argument of PyEval_SetProfile.
This object is passed back to the profiling callback, so it can hold the
thread id, or even the entire per-thread function stats, instead of the
current hash table.
See https://docs.python.org/2/c-api/init.html#c.PyEval_SetProfile
Original comment by nir...@gmail.com
on 26 Oct 2014 at 1:41
About merging stats from different runs: we could use thread names for
this, so if the user always runs the same threads (with the same names), we
can show per-thread stats.
Original comment by nir...@gmail.com
on 26 Oct 2014 at 1:50
PyEval_SetProfile seems to be a very good candidate for holding our
per-thread data.
About using thread names for merging stats: I would definitely not want
this, as the user might be interested in profiling a specific run/session of
a thread. If we merge all runs of some thread class together, we lose that
information. And there is, of course, always the possibility that identical
thread names conflict with each other.
What I think is: showing threads as "different" seems more useful/safe than
showing them "merged" when we cannot identify the difference 100%.
Original comment by sum...@gmail.com
on 26 Oct 2014 at 9:25
The best way would be to provide both the thread name (controlled by the
user) and the thread ident (controlled by the kernel). The user should be
able to choose how to merge the data; we should not guess.
Original comment by nir...@gmail.com
on 26 Oct 2014 at 11:17
The user wants to merge two thread/func stats? OK, no problem:
thread_stat1 + thread_stat2.
The user can simply reach any thread stat by its tid/name without problems,
and we already support the __add__ operation on every stat. However, what
he/she can't do is the following:
thread_stat1 - thread_stat2.
So this again supports the following point: we shall not merge any stats
unless we are 100% sure they belong to the same thread. We can let the user
do that, as you indicate. But for automatic thread-name or thread-id
merging, I cannot see the motivation, as neither is a 100% accurate
solution.
Original comment by sum...@gmail.com
on 27 Oct 2014 at 7:12
I have created a new fork, https://bitbucket.org/sumerc/yappi-perthread-stats,
to start playing with. I am planning to implement the functionality on this
fork and push it back once it becomes stable enough.
Original comment by sum...@gmail.com
on 27 Oct 2014 at 8:43
Hi all again,
I'd just like to give an update on the progress/my thoughts:
It seems I am currently able to retrieve per-thread information from
get_func_stats(), and basic testing suggests it is stable. (We have some
performance degradation, as we constantly access the current thread's dict
internally, but I'm working on that...)
I have been thinking about the API for retrieving the information, with the
following goals in mind:
- backward compatibility
- support for saving/loading the stats
- use cases beyond filtering by thread id
So I have come up with the following syntax (constants can be changed).

Enumerating threads in the current func stats:

    yappi.get_func_stats().thread_ids()  # returns a list of thread ids per profile session

I don't like having an extra function just to traverse thread_ids() on
function stats, but I cannot come up with a better idea.

    fstats = yappi.get_func_stats()
    fstats.filter("tid", TID)  # syntax can change; it could be a lambda, filter(), or Django-query-like filtering... not sure.

The filter() function could be used on every Stat object with every key.
E.g.:

    fstats.filter("name", "foobar")
    yappi.get_thread_stats().filter("name", "_MainThread")
    ...etc.

I'd just like to share my thoughts with you guys before jumping into the
implementation. Unless an objection/idea comes up, though, I am planning to
continue writing tests following the above prototype.
That's all.
Original comment by sum...@gmail.com
on 6 Nov 2014 at 8:32
Hi, I like the idea...
Although my use case would be simpler (I just need to group by thread id),
I could do it with:

    tid_to_func_stats = {}
    func_stats = yappi.get_func_stats()
    for tid in func_stats.thread_ids():
        tid_to_func_stats[tid] = func_stats.filter("tid", tid)

So, the API seems flexible enough for me :)
I'm not sure about supporting lambdas there... if you do that, you lose the
possibility of making some things faster later on (say internally you
already have the data separated by thread id: a key lookup would do, but
with a lambda you'd need to traverse all the entries regardless).
Original comment by fabi...@gmail.com
on 10 Nov 2014 at 3:36
Hi again,
I feel it has become stable enough for at least beta testing. I have merged
the code into the main fork: https://bitbucket.org/sumerc/yappi.
I have made a few minor changes; the current usage is as follows (for your
use case):

    tid_to_func_stats = {}
    tstats = yappi.get_thread_stats()
    for tstat in tstats:
        tid_to_func_stats[tstat.id] = yappi.get_func_stats(filter={"ctx_id": tstat.id})

So you can simply grab the latest head if you want to play with the API. I
have not updated the docs, but I am sure you can figure everything out from
the yappi.py file.
Thanks,
One last closing note: I still feel this can be done better/faster, and I
will think about it, but in the meantime it seems to work fine, or at least
_fine_ enough.
Original comment by sum...@gmail.com
on 18 Nov 2014 at 11:26
Thanks a lot. I'll take a look at it and report back :)
Original comment by fabi...@gmail.com
on 18 Nov 2014 at 11:48
Just to note, I've tested the changes and so far they seem to work well for me
(i.e.: no problems found and the API is flexible enough to get stats separated
by thread).
Original comment by fabi...@gmail.com
on 25 Nov 2014 at 12:06
That is great to hear!
I am closing this along with issue 49.
Original comment by sum...@gmail.com
on 26 Nov 2014 at 1:28
Hi there, I'd like to say that I have been using this for some time already
and it looks good.
I was wondering if there are any plans for a new official release with this
feature in place (so that it can be downloaded through pip).
Thanks,
Fabio
Original comment by fabi...@gmail.com
on 11 Dec 2014 at 12:48
Yeah, sure, I can arrange that. Just one note, though: is there any urgency
on this? I would like to run manual_tests.txt in the tests/ folder, which
takes some time. I think next week I can find time to run the tests and
deploy a new version. How does that sound?
Original comment by sum...@gmail.com
on 11 Dec 2014 at 7:40
That sounds great...
There's no huge urgency (I'm able to compile my version from source without
any issues); it's just that I'm building a profiler UI:
https://sw-brainwy.rhcloud.com/support/pydev-2014/ and yappi is one of the
backends I'm supporting. It's still not public, so no rush, but when it
becomes public I'd like to point people to install it through pip.
If you're interested in checking it out, I'd be happy to provide you a
license and access to pre-releases (in the end the crowdfunding didn't reach
the levels needed to make development financially viable, so it will be
closed source, although freely available for open-source software)... One
nice thing is that through the UI you can connect to a running (CPython)
process and start/stop profiling with yappi without any instrumentation (if
you're interested, please send me an e-mail to fabiofz at gmail dot com).
Also, have you considered distributing yappi through PyPI with binary
wheels? (i.e.:
https://hynek.me/articles/sharing-your-labor-of-love-pypi-quick-and-dirty/)
It would be nice to have it precompiled, especially on Windows (I can
provide binaries for Python 2.7/3.4 on Windows if you want).
Original comment by fabi...@gmail.com
on 11 Dec 2014 at 10:39
I am definitely interested in this project!
I have been thinking about an IDE with a stable Python profiler built in
ever since I started this project. I will send you an email and we can take
it from there; I also have some ideas about using yappi as a "sampling
profiler" with some tricks. Anyway, I would like to help and will contact
you.
On the other topic, binary wheels: I am not very familiar with the concept,
so I am opening another issue for it. It would be nice to have this kind of
deployment, though, as I know some users have difficulty setting up a
compile environment on Windows. See Issue 52.
Original comment by sum...@gmail.com
on 11 Dec 2014 at 10:56
FYI: I have released a new version, 0.94, which includes the per-thread
stats fixes.
Original comment by sum...@gmail.com
on 22 Dec 2014 at 10:29
Original issue reported on code.google.com by
fabi...@gmail.com
on 22 Oct 2014 at 10:58