j40951 / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License

[enhancement request] pprof: correctly combine profiles with different SO load addresses #251

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
pprof seems to assume that the shared libraries are always loaded at the same 
address when combining profiles. This is not necessarily true when running the 
same program on two distinct but identical machines. I'm not sure if it's even 
always true when running the same program twice on the same machine.

In the listings below you can see that the individual profiles look OK, but the 
combined profile has unresolved symbols.

opt70:uru$ pprof covrpt covrpt.4 --text --cum | head -10
Removing _Unwind_GetCFA from all stack traces.
Total: 115 samples
       0   0.0%   0.0%       60  52.2% zzz::Executable::operator
       0   0.0%   0.0%       60  52.2% zzz::covrpt::CovRptExecutable::run
       0   0.0%   0.0%       60  52.2% gnu_get_libc_version
       0   0.0%   0.0%       60  52.2% main
       0   0.0%   0.0%       29  25.2% zzz::covrpt::CovRptExecutable::processChunk
       0   0.0%   0.0%       29  25.2% zzz::covrpt::CovRptExecutable::processRange
       0   0.0%   0.0%       18  15.7% zzz::covrpt::CalledCoverageCounter::countCoverageImpl
       0   0.0%   0.0%       18  15.7% MappingCovModeler
       0   0.0%   0.0%       13  11.3% Collection
opt70:uru$ pprof covrpt covrpt.2 --text --cum | head -10
Removing _Unwind_GetCFA from all stack traces.
Total: 116 samples
       0   0.0%   0.0%       64  55.2% zzz::Executable::operator
       0   0.0%   0.0%       64  55.2% gnu_get_libc_version
       0   0.0%   0.0%       64  55.2% main
       0   0.0%   0.0%       63  54.3% zzz::covrpt::CovRptExecutable::run
       0   0.0%   0.0%       38  32.8% zzz::covrpt::CovRptExecutable::processRange
       0   0.0%   0.0%       37  31.9% zzz::covrpt::CovRptExecutable::processChunk
       0   0.0%   0.0%       22  19.0% zzz::covrpt::CalledCoverageCounter::countCoverageImpl
       0   0.0%   0.0%       19  16.4% zzz::covrpt::CalledCoverageCounter::countCoverage
       0   0.0%   0.0%       15  12.9% MappingCovModeler
opt70:uru$ pprof covrpt covrpt.2 covrpt.4  --text --cum | head -10
Fetching 2 profiles, Be patient...
Total: 231 samples
       0   0.0%   0.0%      124  53.7% gnu_get_libc_version
       0   0.0%   0.0%      124  53.7% main
       0   0.0%   0.0%      123  53.2% zzz::covrpt::CovRptExecutable::run
       0   0.0%   0.0%      116  50.2% _Unwind_GetCFA
       0   0.0%   0.0%      115  49.8% 0x00002ac87220e4bf
       0   0.0%   0.0%       67  29.0% zzz::covrpt::CovRptExecutable::processRange
       0   0.0%   0.0%       66  28.6% zzz::covrpt::CovRptExecutable::processChunk
       0   0.0%   0.0%       64  27.7% zzz::Executable::operator
       0   0.0%   0.0%       60  26.0% 0x00002ac871424260

Original issue reported on code.google.com by igor.n.n...@gmail.com on 11 Jun 2010 at 10:11

GoogleCodeExporter commented 9 years ago
Hmm, what are 0x00002ac87220e4bf and 0x00002ac871424260, do you know?  It's 
possible that they're legitimately unresolved symbols, that show up in the 
profile diff but not in either individual profile.  It shouldn't matter where 
the so's are loaded, from the point of view of the heap checker.

And indeed, I took a single profile, and ran pprof <binary> <profile> on two 
different machines.  I verified with ldd that the two machines loaded the 
binary's so's at different locations, but the pprof -text output was the same.  
So I suspect something else is going on here.

Original comment by csilv...@gmail.com on 11 Jun 2010 at 10:23

GoogleCodeExporter commented 9 years ago
According to the map stored in covrpt.4, these two addresses are from:

2ac872200000-2ac872216000 r-xp 00000000 00:00 425022      /lib64/libpthread-2.5.so
2ac8713fa000-2ac871454000 r-xp 00000000 00:00 5816869536  /<deleted>/libexecutable.so

The second library is the one zzz::Executable::operator() is coming from.
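
For anyone following along, the lookup described above can be sketched in a few lines of Python. This is only an illustration, not what pprof itself does: the names `Mapping`, `parse_map_line`, and `find_mapping` are made up for this example, and the map lines are the two quoted above, each joined onto a single line.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    start: int        # first address of the mapped region
    end: int          # one past the last address of the region
    file_offset: int  # offset into the file where the region begins
    path: str         # object file backing the region


def parse_map_line(line: str) -> Mapping:
    """Parse one 'start-end perms offset dev inode path' line (hypothetical helper)."""
    fields = line.split(None, 5)
    start_s, end_s = fields[0].split("-")
    return Mapping(int(start_s, 16), int(end_s, 16),
                   int(fields[2], 16), fields[5].strip())


def find_mapping(mappings: List[Mapping], addr: int) -> Optional[Mapping]:
    """Return the mapping whose [start, end) range contains addr, if any."""
    for m in mappings:
        if m.start <= addr < m.end:
            return m
    return None


# The two map entries quoted in this comment.
MAP_LINES = [
    "2ac872200000-2ac872216000 r-xp 00000000 00:00 425022 /lib64/libpthread-2.5.so",
    "2ac8713fa000-2ac871454000 r-xp 00000000 00:00 5816869536 /<deleted>/libexecutable.so",
]
MAPPINGS = [parse_map_line(line) for line in MAP_LINES]

# The two unresolved addresses from the combined profile.
for addr in (0x00002AC87220E4BF, 0x00002AC871424260):
    m = find_mapping(MAPPINGS, addr)
    print(hex(addr), "->", m.path, "+", hex(addr - m.start + m.file_offset))
```

Both addresses fall inside one of the two ranges, which is consistent with them being real code addresses that were looked up against the wrong profile's map.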

Looking at pprof (I'm using version 1.5 BTW), I'm not sure how it can handle 
different SO loading addresses. It aggregates stack traces as raw offsets into 
address space, but the object map always comes from the first profile on the 
command line.

Original comment by igor.n.n...@gmail.com on 11 Jun 2010 at 11:50

GoogleCodeExporter commented 9 years ago
Ah, I see.  That does indeed look like a bug.  If multiple profiles are 
specified, they should each use their own object map.
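
A rough sketch of what "each use their own object map" could mean in practice, written in Python rather than pprof's Perl. Everything here is an assumption made for illustration: the function names, the hypothetical library `libdemo.so`, its load addresses, and the simplification that a profile is a dictionary from a single address to a sample count (real profiles hold whole stack traces). The idea is to rewrite each address as an (object path, offset) pair using the map of the profile it came from, and only then add counts together.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (start, end, file_offset, path) for one mapped region
Region = Tuple[int, int, int, str]


def normalize(addr: int, regions: List[Region]) -> Tuple[str, int]:
    """Rewrite a raw address as (object path, offset within that object)."""
    for start, end, file_offset, path in regions:
        if start <= addr < end:
            return (path, addr - start + file_offset)
    # Addresses outside every region are kept as-is under a placeholder name.
    return ("<unmapped>", addr)


def combine(profiles: List[Tuple[List[Region], Dict[int, int]]]) -> Counter:
    """Sum sample counts across profiles, keyed by load-address-independent locations."""
    total: Counter = Counter()
    for regions, samples in profiles:
        for addr, count in samples.items():
            total[normalize(addr, regions)] += count
    return total


# Two hypothetical runs with the same library loaded at different base addresses.
run_a = ([(0x2AC8713FA000, 0x2AC871454000, 0, "libdemo.so")],
         {0x2AC871424260: 60})
run_b = ([(0x2B0000000000, 0x2B000005A000, 0, "libdemo.so")],
         {0x2B000002A260: 64})

combined = combine([run_a, run_b])
for (path, offset), count in combined.items():
    print(path, hex(offset), count)
```

With this keying, the two runs collapse to a single entry for the same code location, whereas summing by raw address would leave two unrelated entries, one of which cannot be resolved against the other run's map.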

Do you feel up for providing a patch to fix it?  If not, I'll take a look when 
I can, but it may be a little while.

Original comment by csilv...@gmail.com on 14 Jun 2010 at 7:31

GoogleCodeExporter commented 9 years ago
Unfortunately, I can sort of read Perl, but not write it; also, I think Perl 
might be a bit slow for my use case anyway: I need to add up ~16,000 separate 
profiles, each ~10MB in size. I wrote a quick and dirty aggregator in C++ for 
now, so there's no urgent need for a fix. Just thought it might save some 
confusion for others if I submitted the bug.

Original comment by igor.n.n...@gmail.com on 14 Jun 2010 at 7:56

GoogleCodeExporter commented 9 years ago
OK, thanks for the info.  I'll look into fixing this up when I get a chance 
then.

Original comment by csilv...@gmail.com on 14 Jun 2010 at 8:59

GoogleCodeExporter commented 9 years ago
I'm afraid I'm not going to get a chance to fix this before I pass off 
ownership of perftools, so I retract comment #5. :-)  I'm leaving the bug open 
if someone else wants to take a crack at it, though.

Original comment by csilv...@gmail.com on 25 Jan 2012 at 7:21

GoogleCodeExporter commented 9 years ago
Curious what this patch would look like, as I am not fully grasping the 
conversation above. Can you summarize and perhaps detail what the resulting 
patch would look like? How would it be tested?

Original comment by chapp...@gmail.com on 4 May 2012 at 5:26

GoogleCodeExporter commented 9 years ago
Ping?

Original comment by chapp...@gmail.com on 11 Mar 2013 at 2:29

GoogleCodeExporter commented 9 years ago
I think this may be related to address space layout randomization (ASLR): 
http://en.wikipedia.org/wiki/Address_space_layout_randomization
So on recent Linux kernels the symbol addresses can change even when running 
the same program on the same machine.

Original comment by vitor.s....@gmail.com on 23 Dec 2014 at 5:24