Daetalus / unladen-swallow

Automatically exported from code.google.com/p/unladen-swallow
Other
0 stars 0 forks source link

Improve oprofile support #63

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
OProfile has support for attributing samples to particular functions and 
lines of code, even when that code is generated by a JIT compiler. This is 
described at  http://oprofile.sourceforge.net/doc/devel/jit-interface.html. 
(The "agent library" described there is designed to work with 
http://java.sun.com/j2se/1.5.0/docs/guide/jvmti/jvmti.html.  An implementation 
is at 
http://oprofile.cvs.sourceforge.net/viewvc/oprofile/oprofile/agents/jvmti/li
bjvmti_oprofile.c?view=markup.

I've got a very preliminary patch to do this at 
http://codereview.appspot.com/79049, but I need to fix that up so I can 
submit it to mainline LLVM.

Original issue reported on code.google.com by jyass...@gmail.com on 22 Jun 2009 at 6:21

GoogleCodeExporter commented 9 years ago
I'm using the attached file to test oprofile with the JIT.

Original comment by jyass...@gmail.com on 22 Jun 2009 at 6:22

Attachments:

GoogleCodeExporter commented 9 years ago

Original comment by collinw on 13 Jul 2009 at 5:50

GoogleCodeExporter commented 9 years ago
Jeffrey, can you update the status of this? This is in oprofile, right? Which 
version?

Original comment by collinw on 28 Aug 2009 at 10:31

GoogleCodeExporter commented 9 years ago
OProfile 0.9.5 should work when you pass --enable-oprofile=<oprofile-prefix> to
configure. We've merged the relevant patches from LLVM. I haven't tested or
documented this in Python yet though.

Original comment by jyass...@gmail.com on 26 Sep 2009 at 12:06

GoogleCodeExporter commented 9 years ago
Issue 99 has been merged into this issue.

Original comment by collinw on 17 Dec 2009 at 12:28

GoogleCodeExporter commented 9 years ago
We still need a slight patch to Unladen to make the OProfile support work well. 
We also 
need to patch up the build system to pass LLVM's full flag set through to 
embedding 
apps.

Original comment by collinw on 17 Dec 2009 at 12:29

GoogleCodeExporter commented 9 years ago

Original comment by collinw on 6 Jan 2010 at 11:42

GoogleCodeExporter commented 9 years ago
This was previously blocked on using llvm-config to obtain the necessary flags; 
that 
support was added in r981.

Original comment by collinw on 11 Jan 2010 at 10:29

GoogleCodeExporter commented 9 years ago

Original comment by collinw on 11 Jan 2010 at 10:29

GoogleCodeExporter commented 9 years ago
Basic oprofile support was added in r986:

$ sudo opcontrol --reset
Signalling daemon... done
$ sudo opcontrol --start-daemon; sudo opcontrol --start;
PYTHONPATH=../tests/lib/django/ ./python ../tests/performance/bm_django.py -n
100; sudo opcontrol --stop

$ opreport -l ./python | less
CPU: Core 2, speed 1600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask
of 0x00 (Unhalted core cycles) count 
100000
samples  %        image name               symbol name
79589     4.2329  python                   PyString_FromFormatV
62971     3.3491  python                   PyEval_EvalCodeEx
62713     3.3354  python                   tupledealloc
57071     3.0353  python                   _PyEval_CallFunction
50009     2.6597  24532.jo                 #u#force_unicode
47468     2.5246  python                   PyUnicodeUCS2_Decode
45829     2.4374  python                   PyFrame_New
45173     2.4025  python                   lookdict_string
43082     2.2913  python                   PyType_IsSubtype
39763     2.1148  24532.jo                 #u#render5
38145     2.0287  python                   _PyType_Lookup
37643     2.0020  python                   PyObject_GC_UnTrack
37105     1.9734  python                   frame_dealloc
36849     1.9598  python                   PyEval_EvalFrame
35630     1.8950  24532.jo                 #u#resolve
33313     1.7717  python                   PyObject_IsInstance
33208     1.7662  python                   PyDict_GetItem
33168     1.7640  python                   PyTuple_New
30458     1.6199  python                   PyCFunction_NewEx
29769     1.5833  python                   PyObject_Malloc
29710     1.5801  python                   PyString_FromString
27589     1.4673  python                   PyObject_Call
27266     1.4501  python                   vgetargskeywords
26578     1.4135  python                   PyObject_GenericGetAttr
24585     1.3075  python                   meth_dealloc
22249     1.1833  python                   _PyString_Eq
22123     1.1766  python                   PyObject_CallFunctionObjArgs
22034     1.1719  python                   PyCFunction_Call
20037     1.0657  python                   convertsimple
19574     1.0410  24532.jo                 #u#render2
17984     0.9565  python                   type_getattro
17503     0.9309  python                   PyObject_Free
17117     0.9104  python                   PyObject_GetAttr
17031     0.9058  python                   recursive_isinstance
16243     0.8639  python                   PyUnicodeUCS2_DecodeASCII
15946     0.8481  24532.jo                 #u#escape3
15675     0.8337  python                   string_hash
15428     0.8205  24532.jo                 #u#mark_safe
14068     0.7482  python                   _PyEval_CallFunctionVarKw
13938     0.7413  24532.jo                 #u#_resolve_lookup
13541     0.7202  24532.jo                 #u#_dec
13477     0.7168  python                   listiter_next
12844     0.6831  24532.jo                 #u#mark_for_escaping
12046     0.6407  24532.jo                 #u#resolve1
10972     0.5835  python                   getset_get
9731      0.5175  24532.jo                 #u#render4
9385      0.4991  24532.jo                 #u#wrapper
9285      0.4938  python                  
mark_called_and_maybe_compile(PyCodeObject*, _frame*)
8919      0.4744  python                   PyImport_ImportModuleLevel
8896      0.4731  python                   load_next
8587      0.4567  python                   builtin_isinstance
8515      0.4529  python                   PyMethod_New
8318      0.4424  python                   _PyUnicode_New
8228      0.4376  python                   function_call
8124      0.4321  python                   instancemethod_dealloc
8085      0.4300  24532.jo                 #u#escape
7913      0.4209  python                   PyBool_FromLong

Original comment by collinw on 12 Jan 2010 at 7:48

GoogleCodeExporter commented 9 years ago
Now that we have the bare-bones oprofile support, we should make it better:

- We need more meaningful function names: #u#render4 does not cut it. Module 
name and class name should probably figure in here.
- We should support opreport's -g option. It currently prints

$ opreport -g -l ./python | head -n 15
Overflow stats not available
CPU: Core 2, speed 1600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask 
of 0x00 (Unhalted core cycles) count 100000
samples  %        linenr info                 image name               symbol 
name
77638     4.1247  stringobject.c:164          python                   
PyString_FromFormatV
66563     3.5363  eval.cc:3078                python                   
PyEval_EvalCodeEx
63566     3.3771  tupleobject.c:162           python                   
tupledealloc
57153     3.0364  eval.cc:4237                python                   
_PyEval_CallFunction
49921     2.6522  (no location information)   1320.jo                  
#u#force_unicode
47317     2.5138  unicodeobject.c:1127        python                   
PyUnicodeUCS2_Decode
42718     2.2695  dictobject.c:392            python                   
lookdict_string
42197     2.2418  frameobject.c:581           python                   
PyFrame_New
41919     2.2271  typeobject.c:1163           python                   
PyType_IsSubtype
40159     2.1336  frameobject.c:408           python                   
frame_dealloc
39234     2.0844  (no location information)   1320.jo                  
#u#render5
38034     2.0207  gcmodule.c:1390             python                   
PyObject_GC_UnTrack

We should turn those "(no location information)" strings into module names and 
line numbers.

- Find a way to support cumulative timings, ideally via oprofile or, if 
necessary, some other profiling tool.

And of course, all this needs to be documented in the wiki.

Original comment by collinw on 12 Jan 2010 at 8:04

GoogleCodeExporter commented 9 years ago
To recap what I said over email in the bug tracker:

There is the __module__ attribute hanging off of the function
object, but that isn't the same as the code object.  There could
conceivably be two function objects with different values for
func_module using the same code object, and we don't even have access
to the function object from the code object.

There are also the co_filename and co_firstlineno attributes of code
objects, which would at least be useful for tracking down the original
definition.  I think it might be best to take the basename of the
filename, the line number, and the function name, and join them with
colons and make that the new function name.

That basename(co_filename) : co_firstlineno info sounds like it would be 
exactly what
we want to use for the line number information.

Original comment by reid.kle...@gmail.com on 12 Jan 2010 at 10:43

GoogleCodeExporter commented 9 years ago
Possibly useful tool to mention in profiling guide: 
http://code.google.com/p/jrfonseca/wiki/Gprof2Dot

Original comment by collinw on 12 Jan 2010 at 11:42

GoogleCodeExporter commented 9 years ago
yep, Gprof2Dot is great :)

Original comment by khame...@gmail.com on 13 Jan 2010 at 11:12

GoogleCodeExporter commented 9 years ago
Turns out that I put debug information behind `python -g` and then forgot about 
it. 
When you pass that while profiling, you get filenames and line numbers. Oops.

I've added a wiki page at 
http://code.google.com/p/unladen-swallow/wiki/UsingOProfile 
describing how to use OProfile with Unladen.

Original comment by jyass...@gmail.com on 1 Feb 2010 at 11:51

GoogleCodeExporter commented 9 years ago
The -g option was removed in r1073; it is no longer required for oProfile 
integration.

Remaining issues:
- Improve the names shown in the oProfile output. Currently these look like 
"#u#render5", which sucks. We should drop the "#u#" prefix down to just "#" (to 
make them not collide with C functions). We should also include the class name 
for 
methods; that "render5" name is actually the fifth Python function named 
"render". 
These should be "#MyClass.render" instead.

This should be low-hanging fruit for a Linux user with oProfile installed. 
Remember 
to configure LLVM with --with-oprofile=$PREFIX.

Original comment by collinw on 21 Feb 2010 at 6:24

GoogleCodeExporter commented 9 years ago

Original comment by collinw on 21 Feb 2010 at 6:25

GoogleCodeExporter commented 9 years ago
An initial attempt at improving the names is here: 
http://codereview.appspot.com/217105/show

Original comment by elliotmu...@gmail.com on 23 Feb 2010 at 10:38

GoogleCodeExporter commented 9 years ago
r1157 simplified JIT function names from #u#foo to #foo. Thanks for the patch, 
Elliot!

Original comment by collinw on 21 May 2010 at 11:35