DavidMorre / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

Radically different CPU profiling results for identical code after upgrade from 1.4 to 1.5 #212

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hello hello! 

Today I upgraded to perftools 1.5 and found that the CPU profiling results 
for a routine of mine looked radically different than they had with perftools 
1.4. I'm using an x86_64 machine running Ubuntu 9.10.

The distribution of line-by-line samples generated with --list by the two 
profiler versions was very different, and the distribution of time in the 
basic --text output also changed dramatically. Also, it became much 
harder to interpret the --text output; human-readable names like 
ATL_saxpy_xp1yp1aXbX (a BLAS call) that were provided by 1.4 become 
addresses like 0x00007f8bf2768f94 with version 1.5. 

I noticed that profiler_unittest.sh fails during make check for version 1.5, 
even when I run make check multiple times on an unloaded machine as 
instructed. All tests passed for version 1.4.

Which set of profiling results should I believe? And is there an easy way to 
get human-readable names back?

Thanks,
Zak

Original issue reported on code.google.com by zst...@gmail.com on 28 Jan 2010 at 5:28

GoogleCodeExporter commented 9 years ago
Is it possible that you compiled them differently -- perhaps one with
--enable-frame-pointers and one using libunwind?  I think that would account 
for the
difference.

Also, I forget when we added inline support to pprof, but it may have been 
between
perftools 1.4 and perftools 1.5.  That may be affecting the results somehow.

One thing to try, if you can, is to generate the CPU profile using perftools 
1.4, and
then run the pprof in perftools 1.5.  (The other way around may work as well, 
but I'm
not sure older pprof can support the newer cpu profile.  Feel free to give it a 
try
though.)  That may give us an idea if the difference is with the profiling code 
or
(more likely) with pprof.

Original comment by csilv...@gmail.com on 28 Jan 2010 at 5:39

GoogleCodeExporter commented 9 years ago
Great call -- it looks like this is an issue with pprof. Version 1.4 of pprof 
displays the 
results I expect even for a CPU profile generated with version 1.5 of 
perftools, whereas 
version 1.5 of pprof displays similar strange results for CPU profiles 
generated with 
either version 1.4 or 1.5 of perftools.

Original comment by zst...@gmail.com on 28 Jan 2010 at 6:18

GoogleCodeExporter commented 9 years ago
Cool -- I'll try to find time to look into this next week.  In the meantime, try
using the older pprof for your work. :-)

Original comment by csilv...@gmail.com on 28 Jan 2010 at 5:04

GoogleCodeExporter commented 9 years ago
Another experiment you can do in the meantime: run 'pprof --raw' with pprof 
1.4, and
then read back the results with pprof 1.5.  (And maybe try other way around as 
well?)
 You should still see the same weird behavior.  If so, can you attach the 'pprof
--raw' output to this bug report?  Then I can see if I can reproduce it.

Original comment by csilv...@gmail.com on 28 Jan 2010 at 5:14

GoogleCodeExporter commented 9 years ago
Hmm . . . it looks like both pprof 1.4 and 1.5 render the pprof 1.4 raw output
correctly, and both render the pprof 1.5 raw output incorrectly! Curiouser and
curiouser. I have attached both raw outputs; please let me know if any further
experiments would help.

I'm more than happy to stick with perftools 1.4 for the moment. Thank you for 
your help!

Original comment by zst...@gmail.com on 29 Jan 2010 at 10:23

Attachments:

GoogleCodeExporter commented 9 years ago
OK, the data you report points to incorrect symbolizing (the ---symbol output 
at the
top of the raw files) in perftools 1.5.  The symbolize data is created by pprof 
(or
pprof --raw).  So what you're seeing is whenever pprof 1.5 (either --raw or not
--raw) is responsible for creating the symbol maps, you're getting the wrong 
data. 
Whenever pprof 1.4 is responsible, everything is fine.

Now we just need to figure out why.  Can you try running modifying pprof1.5 by
setting $debug to 1 in sub MapToSymbols, and then running pprof1.5--raw --debug 
[...]
and including that output in the bug report?

If you want to poke around yourself, feel free.  If you look at version_1.5.raw,
you'll see that we are not successful at mapping addresses to symbols: we see 
lots of
lines like
   0x000000000041d6e6 0x000000000041d6e6
rather than
   0x00007f8bf3109c69 c_eps_and_grad

I think that's the first thing to figure out.  You'll want to instrument sub
MapToSymbols, to try to figure out the exactly addr2line (or nm) call it's 
making,
and what results it's getting.

My guess is that you're using addr2line -i, and it's giving weird results for 
you,
but that's just because this is the partr of pprof that's changed the most 
since 1.4.
 Let me know what you find!

Original comment by csilv...@gmail.com on 3 Feb 2010 at 11:42

GoogleCodeExporter commented 9 years ago
Hi Craig,

I learned a little Perl to play with pprof, and you're right that something 
strange
is happening in MapToSymbols. The following two code changes appear to solve the
problem for me:

1. Redirect MapToSymbols to MapSymbolsWithNM immediately after the "Ignore empty
binaries" conditional. 

2. Replace the regular expression in ParseLibraries on line 3372 . . .

/^($h)-($h)\s+..x.\s+($h)\s+\S+:\S+\s+\d+\s+(\S+\.(so|dll|dylib|bundle)((\.\d+)+
\w*)?)$/i

. . . with the regular expression from version 1.4 of pprof:

/^($h)-($h)\s+..x.\s+($h)\s+\S+:\S+\s+\d+\s+(\S+\.(so|dll|dylib)(\.\d+)*\w*)/i

It looks like the new regexp in version 1.5 overlooks /usr/lib/libcblas.so.3gf 
and
/usr/lib/libatlas.so.3gf, which is why the function names from those libraries 
go
missing in my profiling results.

Thanks for pointing me in the right direction!

Zak

Original comment by zst...@gmail.com on 11 Feb 2010 at 3:15

GoogleCodeExporter commented 9 years ago
I don't understand point 2 -- it looks liek the 1.5 regexp should match 
libatlas.so.3gf just fine.  I did this:
   perl -le 'print "yes" if "/usr/lib/libatlas.so.3gf" =~ 
/(\S+\.(so|dll|dylib|bundle)((\.\d+)+\w*)?)$/i'

and it printed "yes" like expected.  What happens if you don't do point 2, does 
it 
fail to work?  I wonder why that would be.

As for point 1, it seems like this is a problem with addr2line on your system.

Can you type 'addr2line --help' and attach the output here?  You may also want 
to go 
a bit more down the function and put in printfs in various places to see what 
the 
function is doing, and where it's going wrong for you.

But maybe best to do one thing at a time -- let's figure out point 2 first, and 
once 
that's working, we can attack point 1.

Original comment by csilv...@gmail.com on 18 Feb 2010 at 11:20

GoogleCodeExporter commented 9 years ago
Good questions; I did try each modification independently on a clean copy and 
found
that both were necessary, but I'll double-check that. I found the library 
inclusion
issue by printing the contents of $libs at various points in the 1.4 and 1.5 
scripts
and discovering differences, which led me to the regular expression, but 
perhaps I am
mistaken.

For the moment, here are the help contents and version of addr2line:

addr2line --help:

Usage: addr2line [option(s)] [addr(s)]
 Convert addresses into line number/file name pairs.
 If no addresses are specified on the command line, they will be read from stdin
 The options are:
  @<file>                Read options from <file>
  -b --target=<bfdname>  Set the binary file format
  -e --exe=<executable>  Set the input file name (default is a.out)
  -i --inlines           Unwind inlined functions
  -j --section=<name>    Read section-relative offsets instead of addresses
  -s --basenames         Strip directory names
  -f --functions         Show function names
  -C --demangle[=style]  Demangle function names
  -h --help              Display this information
  -v --version           Display the program's version

addr2line: supported targets: elf64-x86-64 elf32-i386 a.out-i386-linux pei-i386
pei-x86-64 elf64-l1om elf64-little elf64-big elf32-little elf32-big plugin srec
symbolsrec verilog tekhex binary ihex
Report bugs to <http://www.sourceware.org/bugzilla/>

addr2line --v:

GNU addr2line (GNU Binutils for Ubuntu) 2.20
Copyright 2009 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.

Original comment by zst...@gmail.com on 19 Feb 2010 at 3:58

GoogleCodeExporter commented 9 years ago
Any more word on this?  I'd like to resolve the library-regexp issue, at least, 
since 
it seems pretty straightforward, and then work on the other issue.

Original comment by csilv...@gmail.com on 10 Mar 2010 at 6:57

GoogleCodeExporter commented 9 years ago
Sorry for the delay; I have been up against a couple of publication deadlines. 
I just 
confirmed that the following libraries are left out by pprof 1.5 and are 
included when 
the regexp is reverted to the one from version 1.4:

/usr/lib/libatlas.so.3gf
/usr/lib/libcblas.so.3gf
/usr/lib/atlas/liblapack.so.3gf
/usr/lib/atlas/libblas.so.3gf

Here's why: the regexp doesn't receive input of the form 
"/usr/lib/libatlas.so.3gf"; it 
receives an entire line like the following:

7f8bf2353000-7f8bf2acf000 r-xp 00000000 00:00 7389375     
/usr/lib/libatlas.so.3gf.0

The attached script demonstrates that this line matches the version 1.4 regexp 
but 
not the version 1.5 regexp. Removing the '$' from the end of the 1.5 regexp 
allows it 
to match this particular line.

I have also confirmed that this change isn't sufficient to produce sensible 
profiling 
output, and I haven't had time to look into the addr2line issue yet. Hopefully 
this 
resolves one of at least two mysteries, though!

Zak

Original comment by zst...@gmail.com on 10 Mar 2010 at 9:28

Attachments:

GoogleCodeExporter commented 9 years ago
OK, try replacing the regexp in pprof with this one:
/^($h)-
($h)\s+..x.\s+($h)\s+\S+:\S+\s+\d+\s+(\S+\.(so|dll|dylib|bundle)((\.\d+)+\w*(\.\
d+){0,
3})?)$/i

Getting rid of the $ is too permissive, unfortunately -- it lets non-libraries 
in.  
This should work though.

Original comment by csilv...@gmail.com on 11 Mar 2010 at 9:31

GoogleCodeExporter commented 9 years ago
The new regexp seems to work! When I combine it with the MapSymbolsWithNM 
workaround,
the final profiling output has the correct labels, although it differs from the 
pprof
1.4 output somewhat.

I have attached a comparison of the pprof 1.4 output with the output from the
modified version of pprof 1.5 along with the patch to pprof that I used.

Original comment by zst...@gmail.com on 11 Mar 2010 at 10:05

Attachments:

GoogleCodeExporter commented 9 years ago
Right, those differences are expected due to your MapSymbolsWithNM workaround.  
Once 
that part is resolved, the two profiles should be more similar.

Original comment by csilv...@gmail.com on 11 Mar 2010 at 11:24

GoogleCodeExporter commented 9 years ago
This should be fixed in perftools 1.6, just released.  I still don't know if 
you need the MapSymbolsWithNM workaround.  I'm closing the bug, but please 
reopen if need be.

Original comment by csilv...@gmail.com on 5 Aug 2010 at 8:47

GoogleCodeExporter commented 9 years ago
Thanks very much!

Original comment by zst...@gmail.com on 5 Aug 2010 at 9:14