Open GoogleCodeExporter opened 9 years ago
I'm using the attached file to test oprofile with the JIT.
Original comment by jyass...@gmail.com
on 22 Jun 2009 at 6:22
Attachments:
Original comment by collinw
on 13 Jul 2009 at 5:50
Jeffrey, can you update the status of this? This is in oprofile, right? Which
version?
Original comment by collinw
on 28 Aug 2009 at 10:31
OProfile 0.9.5 should work when you pass --enable-oprofile=<oprofile-prefix> to
configure. We've merged the relevant patches from LLVM. I haven't tested or
documented this in Python yet though.
Original comment by jyass...@gmail.com
on 26 Sep 2009 at 12:06
Issue 99 has been merged into this issue.
Original comment by collinw
on 17 Dec 2009 at 12:28
We still need a slight patch to Unladen to make the OProfile support work well.
We also
need to patch up the build system to pass LLVM's full flag set through to
embedding
apps.
Original comment by collinw
on 17 Dec 2009 at 12:29
Original comment by collinw
on 6 Jan 2010 at 11:42
This was previously blocked on using llvm-config to obtain the necessary flags;
that
support was added in r981.
Original comment by collinw
on 11 Jan 2010 at 10:29
Original comment by collinw
on 11 Jan 2010 at 10:29
Basic oprofile support was added in r986:
$ sudo opcontrol --reset
Signalling daemon... done
$ sudo opcontrol --start-daemon; sudo opcontrol --start;
PYTHONPATH=../tests/lib/django/ ./python ../tests/performance/bm_django.py -n
100; sudo opcontrol --stop
$ opreport -l ./python | less
CPU: Core 2, speed 1600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask
of 0x00 (Unhalted core cycles) count
100000
samples % image name symbol name
79589 4.2329 python PyString_FromFormatV
62971 3.3491 python PyEval_EvalCodeEx
62713 3.3354 python tupledealloc
57071 3.0353 python _PyEval_CallFunction
50009 2.6597 24532.jo #u#force_unicode
47468 2.5246 python PyUnicodeUCS2_Decode
45829 2.4374 python PyFrame_New
45173 2.4025 python lookdict_string
43082 2.2913 python PyType_IsSubtype
39763 2.1148 24532.jo #u#render5
38145 2.0287 python _PyType_Lookup
37643 2.0020 python PyObject_GC_UnTrack
37105 1.9734 python frame_dealloc
36849 1.9598 python PyEval_EvalFrame
35630 1.8950 24532.jo #u#resolve
33313 1.7717 python PyObject_IsInstance
33208 1.7662 python PyDict_GetItem
33168 1.7640 python PyTuple_New
30458 1.6199 python PyCFunction_NewEx
29769 1.5833 python PyObject_Malloc
29710 1.5801 python PyString_FromString
27589 1.4673 python PyObject_Call
27266 1.4501 python vgetargskeywords
26578 1.4135 python PyObject_GenericGetAttr
24585 1.3075 python meth_dealloc
22249 1.1833 python _PyString_Eq
22123 1.1766 python PyObject_CallFunctionObjArgs
22034 1.1719 python PyCFunction_Call
20037 1.0657 python convertsimple
19574 1.0410 24532.jo #u#render2
17984 0.9565 python type_getattro
17503 0.9309 python PyObject_Free
17117 0.9104 python PyObject_GetAttr
17031 0.9058 python recursive_isinstance
16243 0.8639 python PyUnicodeUCS2_DecodeASCII
15946 0.8481 24532.jo #u#escape3
15675 0.8337 python string_hash
15428 0.8205 24532.jo #u#mark_safe
14068 0.7482 python _PyEval_CallFunctionVarKw
13938 0.7413 24532.jo #u#_resolve_lookup
13541 0.7202 24532.jo #u#_dec
13477 0.7168 python listiter_next
12844 0.6831 24532.jo #u#mark_for_escaping
12046 0.6407 24532.jo #u#resolve1
10972 0.5835 python getset_get
9731 0.5175 24532.jo #u#render4
9385 0.4991 24532.jo #u#wrapper
9285 0.4938 python
mark_called_and_maybe_compile(PyCodeObject*, _frame*)
8919 0.4744 python PyImport_ImportModuleLevel
8896 0.4731 python load_next
8587 0.4567 python builtin_isinstance
8515 0.4529 python PyMethod_New
8318 0.4424 python _PyUnicode_New
8228 0.4376 python function_call
8124 0.4321 python instancemethod_dealloc
8085 0.4300 24532.jo #u#escape
7913 0.4209 python PyBool_FromLong
Original comment by collinw
on 12 Jan 2010 at 7:48
Now that we have the bare-bones oprofile support, we should make it better:
- We need more meaningful function names: #u#render4 does not cut it. Module
name and class name should probably figure in here.
- We should support opreport's -g option. It currently prints
$ opreport -g -l ./python | head -n 15
Overflow stats not available
CPU: Core 2, speed 1600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask
of 0x00 (Unhalted core cycles) count 100000
samples % linenr info image name symbol
name
77638 4.1247 stringobject.c:164 python
PyString_FromFormatV
66563 3.5363 eval.cc:3078 python
PyEval_EvalCodeEx
63566 3.3771 tupleobject.c:162 python
tupledealloc
57153 3.0364 eval.cc:4237 python
_PyEval_CallFunction
49921 2.6522 (no location information) 1320.jo
#u#force_unicode
47317 2.5138 unicodeobject.c:1127 python
PyUnicodeUCS2_Decode
42718 2.2695 dictobject.c:392 python
lookdict_string
42197 2.2418 frameobject.c:581 python
PyFrame_New
41919 2.2271 typeobject.c:1163 python
PyType_IsSubtype
40159 2.1336 frameobject.c:408 python
frame_dealloc
39234 2.0844 (no location information) 1320.jo
#u#render5
38034 2.0207 gcmodule.c:1390 python
PyObject_GC_UnTrack
We should turn those "(no location information)" strings into module names and
line numbers.
- Find a way to support cumulative timings, ideally via oprofile or, if
necessary, some other profiling tool.
And of course, all this needs to be documented in the wiki.
Original comment by collinw
on 12 Jan 2010 at 8:04
To recap what I said over email in the bug tracker:
There is the __module__ attribute hanging off of the function
object, but that isn't the same as the code object. There could
conceivably be two function objects with different values for
func_module using the same code object, and we don't even have access
to the function object from the code object.
There are also the co_filename and co_firstlineno attributes of code
objects, which would at least be useful for tracking down the original
definition. I think it might be best to take the basename of the
filename, the line number, and the function name, and join them with
colons and make that the new function name.
That basename(co_filename) : co_firstlineno info sounds like it would be
exactly what
we want to use for the line number information.
Original comment by reid.kle...@gmail.com
on 12 Jan 2010 at 10:43
Possibly useful tool to mention in profiling guide:
http://code.google.com/p/jrfonseca/wiki/Gprof2Dot
Original comment by collinw
on 12 Jan 2010 at 11:42
yep, Gprof2Dot is great :)
Original comment by khame...@gmail.com
on 13 Jan 2010 at 11:12
Turns out that I put debug information behind `python -g` and then forgot about
it.
When you pass that while profiling, you get filenames and line numbers. Oops.
I've added a wiki page at
http://code.google.com/p/unladen-swallow/wiki/UsingOProfile
describing how to use OProfile with Unladen.
Original comment by jyass...@gmail.com
on 1 Feb 2010 at 11:51
The -g option was removed in r1073; it is no longer required for oProfile
integration.
Remaining issues:
- Improve the names shown in the oProfile output. Currently these look like
"#u#render5", which sucks. We should drop the "#u#" prefix down to just "#" (to
make them not collide with C functions). We should also include the class name
for
methods; that "render5" name is actually the fifth Python function named
"render".
These should be "#MyClass.render" instead.
This should be low-hanging fruit for a Linux user with oProfile installed.
Remember
to configure LLVM with --with-oprofile=$PREFIX.
Original comment by collinw
on 21 Feb 2010 at 6:24
Original comment by collinw
on 21 Feb 2010 at 6:25
An initial attempt at improving the names is here:
http://codereview.appspot.com/217105/show
Original comment by elliotmu...@gmail.com
on 23 Feb 2010 at 10:38
r1157 simplified JIT function names from #u#foo to #foo. Thanks for the patch,
Elliot!
Original comment by collinw
on 21 May 2010 at 11:35
Original issue reported on code.google.com by
jyass...@gmail.com
on 22 Jun 2009 at 6:21