cuitao2046 / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

in pprof: Would like to distinguish C++ overloaded functions, instead of merging them #271

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Make a C++ program where func(int) calls func(float)
2. run cpuprofile, and pprof --callgrind
3. in kcachegrind it shows a cycle where func calls itself

What is the expected output? What do you see instead?
I'd like to see func(int) calls func(float), instead of seeing a call cycle.

What version of the product are you using? On what operating system?
Using version 1.5 and 1.6

Please provide any additional information below.
Fixing the callgrind result requires changing shortFunctionName (to not remove 
arguments and templates), and printCallgrind (to make pattern matches know that 
there can be spaces in function names.)  I haven't tried the other flows to 
know if allowing spaces in function names breaks them.  Perhaps this could be 
an option, although I don't know why you wouldn't want the extra information.

Original issue reported on code.google.com by CHKings...@gmail.com on 5 Oct 2010 at 5:29

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks!  I'll have the person who wrote the callgrind stuff originally, review 
this change.

In the meantime, do you mind signing the CLA for your patch?
   http://code.google.com/legal/individual-cla-v1.0.html

Original comment by csilv...@gmail.com on 8 Oct 2010 at 1:17

GoogleCodeExporter commented 9 years ago

Original comment by csilv...@gmail.com on 8 Oct 2010 at 1:20

GoogleCodeExporter commented 9 years ago
Initial feedback on the patch: the changes inside the callgrind-specific code 
are probably ok, but the change to ShortFunctionName will change the pprof 
output pretty drastically.  It will change some sort orders (since after your 
change we'll sort on return type rather than function name), cause some tabular 
output to no longer line up, etc.

I think we have two choices.  One is that you introduce a new function that 
does what you want, but is only used for callgrind output.  The other is that 
you add some logic in ShortFunctionName so it leaves in the template args *only 
when* there are two symbols in the binary that differ only in the template 
args.  Likewise, you leave in the function args only when there are two symbols 
in the binary that differ only in the function args.  I think you can always 
take out the return type: C++ programs are not allowed to overload only on 
return type.

I don't know how easy it will be to do the second option.  My guess is that the 
parts of the program that call ShortFunctionName do not have access to the full 
list of symbols.  You may need to reorganize the code to create a 
symbolname-to-short-symbolname map at program-start time, and pass it around 
everywhere it's needed.

Original comment by csilv...@gmail.com on 8 Oct 2010 at 4:10

GoogleCodeExporter commented 9 years ago
You're right about it making a large change to the output.  The hardest part is 
that many routines expect that function names do not have spaces in them, and 
keeping argument types will necessarily add spaces.  (Unless we kept mangled 
names internally and demangled for output?)

The first choice, only changing the name for callgrind output can't work.  The 
callgrind output routine gets the ShortFunctionNames as input.  If they're 
already too short, and things have been merged together, you can't unmerge them.

Removing the return types wouldn't be a bad idea.  Right now there's an 
assumption that return types have no spaces in them.  That's no longer true if 
you leave template args.  You'd have to remove non spaces, then if the first 
character was a space, you're done.  If it's a '<', you'd have to remove 
matching '<>' pairs, but you can't use a global pattern, because that would 
wreck the function name and arguments.  ShortFunctionName is sometimes used in 
sort routines (sort { sfn($B) cmp sfn($B) }), so you might want to compute the 
shortening once and cache the result.

Or even provide an option to merge or separate template / overloaded functions.

I'll keep my experimental code for a bit.  I'm not convinced that I have 
something suitable for a "product" yet.

Original comment by CHKings...@gmail.com on 8 Oct 2010 at 5:15

GoogleCodeExporter commented 9 years ago

Original comment by csilv...@gmail.com on 9 Oct 2010 at 1:55

GoogleCodeExporter commented 9 years ago
Taking a look at this again, after a long while.

Looking at all the issues, I think modifying ShortFunctionName is going to be 
too difficult to be worthwhile.

I think we'll have to make sure that every data structure that stores a 
ShortFunctionName stores the full function name as well.  That may mean a bit 
of work, but in the end callgrind input will have both a data structure with 
short-names, and a data structure with full names.  You can use whichever you 
want (or both), in the callgrind output.

html/etc output could use this as well.  They could print the short names, but 
then allow you to see the full names by hovering over them.

btw, are these 'fake' cycles that you notice (which are due to overloads) only 
in callgrind output?  Or do we also see them in, say, --gv output?

Original comment by csilv...@gmail.com on 1 Sep 2011 at 1:36

GoogleCodeExporter commented 9 years ago
I've decided on a different approach, which is just to annotate overlodaed 
functions with their addresses.  At least this way we can get separate nodes 
instead of combining them.  It is also possible to figure out which overloaded 
function is which by using nm.  I'll get this into the next release.

Original comment by csilv...@gmail.com on 18 Oct 2011 at 3:39

GoogleCodeExporter commented 9 years ago
This should be fixed in perftools 1.9, just released.

Original comment by csilv...@gmail.com on 23 Dec 2011 at 12:47

GoogleCodeExporter commented 9 years ago
Hi. I am using gperftools-1.9.1 and this issue is still not fixed. The problem 
with template code cause the callgrind graphs to be severely skewed to the 
point that they are no longer useful.
I manually merged the above patch into 1.9.1 and it produced the correct 
output. Can you only enable this for --callgrind output of pprof as suggested 
above. Thanks. 

Original comment by stoyan.p...@gmail.com on 3 Feb 2012 at 10:50

GoogleCodeExporter commented 9 years ago
We won't be putting in the patch in this CL, for the reasons outlined above, 
though of course you're welcome to use them yourself if they're useful to you!

I'd be interested in why the fix we did end up using doesn't work for you.  
Feel free to poke around if you have time and interest.

Original comment by csilv...@gmail.com on 3 Feb 2012 at 11:38