mzhaom / gperftools

Fast, multi-threaded malloc() and nifty performance analysis tools
https://code.google.com/p/gperftools/
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

Patch Offering: Allow 'pprof' to save and restore symbol mappings #146

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,

We're using pprof in "remote-fetch" mode to fetch CPU, heap, and contention
profiles from remote servers.  Often, we want to allow developers to
further analyze the profile after the initial fetch (and the binary is
often unavailable on the developer workstation).  

I've attached a patch from one of our developers here at Facebook
(weichen@facebook.com).  We are currently using this in production
(although I had to clean up the patch a little, and I may have introduced a
bug -- please let me know if you find anything wrong and I will fix).

The patch does the following:

- adds a '--raw' output option to pprof to fetch from remote server and
dump the raw profile data to a local file for further analysis

- adds a '--save_sym' option to pprof to fetch the /pprof/symbols from
remote server and then save that mapping to a local file for further analysis

- adds a '--use_sym_file' option to pprof; instead of expecting a binary on
the cmdline, it instead expects a symbol mapping file (and still expects a
profile file as well).

Our workflow is like this:

# fetch once
$ pprof <host>:<port>/whatever --raw --save_sym=out.sym > out.prof
# analyze many times
$ pprof --use_sym_file ./out.sym out.prof [--text | --gif | ... ]

Original issue reported on code.google.com by mrab...@gmail.com on 15 Jun 2009 at 12:12

Attachments:

GoogleCodeExporter commented 9 years ago
Hmm, an interesting idea.  I'd like to understand the specifics of the patch a 
little
more.

It looks like --raw basically just copies $collected_profile.  But why do you 
need to
do that?  Doesn't pprof already save this raw data for you?  I see this code:
    $main::collected_profile = $real_profile;
where $real_profile is /home/pprof/something.  Why can't you just use that 
directly?
Alternately, you could set PPROF_TMPDIR before running the script, and put the
profile into another directory besides /home/pprof.

As for --save_sym, I guess you could just as easily run the 'curl' (or 'wget')
command yourself to http://yourhost/pprof/symbols (or whatever the url is), and 
store
those symbols somewhere.  pprof doesn't do any special processing for this, 
does it?

I do like the idea of allowing a symbol file in lieu of an entire executable, 
as an
arg to pprof.  But instead of requiring a --use_sym_file flag, why not just 
detect it
automatically?  You should be able to look at the file contents and see if it's 
a sym
file or not, and if so to parse it appropriately.  (One idea would be to just 
parse
the file as if it were a sym file, and if you get anything out of it -- any of 
the
regexps you ahve match -- it's probably a sym file.)

As you can tell, I'm trying to reduce the need for commandline flags, which 
pprof has
too much of already.  The more we can do automatically, or outside pprof when 
pprof
doesn't add any value, the happier it makes me. :-)

Let me know what you think.

Original comment by csilv...@gmail.com on 16 Jun 2009 at 5:02

GoogleCodeExporter commented 9 years ago
On Tue, Jun 16, 2009 at 10:03 AM, <codesite-noreply@google.com> wrote:
> It looks like --raw basically just copies $collected_profile.  But why do
> you need to
> do that?  Doesn't pprof already save this raw data for you?  I see this
> code:
>    $main::collected_profile = $real_profile;
> where $real_profile is /home/pprof/something.  Why can't you just use that
> directly?
> Alternately, you could set PPROF_TMPDIR before running the script, and put
> the
> profile into another directory besides /home/pprof.

I could use that tmp file directly, but I think it's somewhat black
magic as to what it's named.  Pprof doesn't expose (right now) what
the name will be -- it's some mangling (I imagine of host, port, time,
whatever) so after several runs of pprof it's hard to say which is the
newest file, and icky.  If you don't want a commandline arg, I can do
a PPROF_FORCE_TMPFILE_NAME env-var (with a better name :P) or
something so I can control the mangling and use that in combination
with PPROF_TMPDIR, but I think --raw is pretty clean and simple and
won't confuse people.

> As for --save_sym, I guess you could just as easily run the 'curl' (or
> 'wget')
> command yourself to http://yourhost/pprof/symbols (or whatever the url is),
> and store
> those symbols somewhere.  pprof doesn't do any special processing for this,
> does it?

pprof does do some processing: it uses HTTP POST to upload the list of
hex addresses that it needs symbols for, and /pprof/symbols on my host
(or equivalent) only returns the symbols requested.  Some binaries
might have a huge number of symbols (some from libs), and stringifying
symbols is somewhat expensive; it's a lot cleaner to enumerate the
ones you need. I am not sure it's reasonable to have a binary return
_all_ its symbols from all of its libs (including shlibs it uses).

> I do like the idea of allowing a symbol file in lieu of an entire
> executable, as an
> arg to pprof.  But instead of requiring a --use_sym_file flag, why not just
> detect it
> automatically?  You should be able to look at the file contents and see if
> it's a sym
> file or not, and if so to parse it appropriately.  (One idea would be to
> just parse
> the file as if it were a sym file, and if you get anything out of it -- any
> of the
> regexps you ahve match -- it's probably a sym file.)

Absolutely, this part I agree with -- we can just provide a "---
symbols" header similarly to how we detect contention profiles and
automatically use it if provided.

> As you can tell, I'm trying to reduce the need for commandline flags, which
> pprof has
> too much of already.  The more we can do automatically, or outside pprof
> when pprof
> doesn't add any value, the happier it makes me. :-)

Sure, I completely understand your motivations.

>
> Let me know what you think.
>

My answers inline!

Original comment by mrab...@gmail.com on 17 Jun 2009 at 6:22

GoogleCodeExporter commented 9 years ago
Hmm, when you put it that way, --raw as another output mode does make sense.

So with detecting the sym file automatically, we're just left with the 
--save_sym
flag.  I admit I still don't love it, just because pprof is kinda designed to 
have
only one type of output, and I feel we should probably keep it that way.  What 
if we
had another output mode that just dumped symbols?  --raw-symbols or some such?  
Then
you'd have to call pprof twice to do what you want to do, but maybe that's ok?

Anyway, I think a good next step is to send a revision of the patch, that does 
the
auto-detection of the raw symbol file, and does whatever you think is best for 
saving
symbols.  I'll be glad to take another look.

Original comment by csilv...@gmail.com on 18 Jun 2009 at 3:15

GoogleCodeExporter commented 9 years ago
} A final idea that I just had would be to instead combine the two files
} into one and INSERT the symbols into the file we dump with "--raw".
} In fact, since two different profiles would require two different
} symbols files (as the symbol set might differ), it makes perfect sense
} to dump a "combination file" that looks something like:
}
} --- symbols
} 0x3423403203    abc:def:foo() foo.cc:35
} 0x3423403203    abc:def:foo() foo.cc:35
} --- profile
} 2 0x3423403203 0x3423403203 0x3423403203 0x3423403203
} 19 0x3423403203 0x3423403203 0x3423403203 0x3423403203
} 35 0x3423403203 0x3423403203 0x3423403203 0x3423403203

I like this idea!  Let's see what a patch to do that looks like.

} Perhaps "--raw" should then be "--symbolized-profile" or similar.

Makes sense.

Original comment by csilv...@gmail.com on 20 Jun 2009 at 4:33

GoogleCodeExporter commented 9 years ago
I have a new patch attached, as we discussed.  The new features as follows:

* New option has been added, '--symsprof' which can only be used with remote 
fetch
and dumps symbols AND the collected profile to stdout in a new format.

The new format is formed by prefixing a special symbols section to a regular 
profile
as follows:

    --- symbols
    0x999999 ns::ns::func() file.ext:99
    0x999999 ns::ns::func() file.ext:99
    ...
    ---
    <regular format CPU, heap, contention, etc. profile follows>

Pprof, in addition, will now automatically detect when the profiles given on the
commandline contain symbols and will not require a binary file in this case.  
This
works for all types of profiles (I tested on CPU and contention).

I will continue testing this more thoroughly to insure it's bulletproof -- 
please
excuse me if any bugs are currently present, but I wanted to get early feedback 
on
whether the design is acceptable.  

Original comment by mrab...@gmail.com on 23 Jun 2009 at 3:18

Attachments:

GoogleCodeExporter commented 9 years ago
I just looked over the patch briefly, but the design seems good to me.

A few comments:
1) Why does --symprof mode only work for remote fetches?  If you give a binary 
and
profile file, can't it emit a symprof file just fine?  That might be useful in
certain situations.

2) Maybe include the binary name in the --symprof output.  Right now you say
"(unknown)" for the binary name, it looks like.

3) I'm not a huge fan of the name --symprof.  It just seems a bit obscure.  I 
think
--raw would still be fine: we're emitting the raw data we got from this profile 
(not
just the profile, but the symbol-table).  Or --raw_profile.  Or maybe
--symbolized_raw_profile?  Now the names are getting a bit long, but at least 
they're
descriptive...

4) I noticed a few style nits (need a space between 'if' and open parens, etc). 
 You
may want to give the patch a once-over for that.

} but I wanted to get early feedback on whether the design is acceptable.  

Sounds good.  Let me know when you're ready for a more careful look.

Original comment by csilv...@gmail.com on 23 Jun 2009 at 8:07

GoogleCodeExporter commented 9 years ago
I have yet another iteration ready.  I think all the design and tests are in 
place,
and it works in my testing, though it obviously may still have some subtle bugs.

The profile is dumped always in a CPU-profile format.  The reason for that was 
that
you can provide multiple profiles on the command-line (additive), or provide a
--base=profile (subtractive).  I wanted to output the profile which is the 
result of
all these additions/subtractions, just like the other output modes (--text, 
--gif,
whatever) would.  However, I was too lazy to write the 3 different output modes
(memory, contention, CPU).  In addition, it seems like some of the information 
is
lost during profile read/subtraction -- for instance, for contention the input
contains both time waited and number of times waited but this seems to be lost 
by the
time we start really working with the profile.

However, this still works fine with heap and contention profiles with the 
caveat that
the output is now reported in "# of samples" rather than megabytes or seconds 
waited,
so the units may be off by some number of zeroes.  We can expand this in a later
patch to be more complete.

A unittest would be nice, but I'm very unsure on how to do those in the 
gperftools
setup and was hoping you might be able to help with that.

Original comment by mrab...@gmail.com on 30 Jun 2009 at 1:49

Attachments:

GoogleCodeExporter commented 9 years ago
whoops, ignore this patch please, bad diff. new one coming shortly.

Original comment by mrab...@gmail.com on 30 Jun 2009 at 1:52

GoogleCodeExporter commented 9 years ago
ick, I borked something in my git branch :) Will fix and get you the clean patch
tomorrow, sorry for the thrash!

Original comment by mrab...@gmail.com on 30 Jun 2009 at 2:05

GoogleCodeExporter commented 9 years ago
fixed patch attached.  I've changed my mind and am now dumping an exact copy of 
the
remote-fetch profile when used with remote fetch, but outputting a CPU-format 
profile
of my own construction if a local profile file is used.  I think a better logic 
would be:

  if (remote_fetch) {
    output remote collected profile w/ symbols
  } else if (only 1 profile file provided on cmdline) {
    output a copy of the profile file provide (w/ symbols)
  } else {
    do additions/diffing as needed
    output a cpu-style profile of the sum (or difference)
  }

I'll update that in a patch soon, but please take a look at the current work 
and let
me know if you disagree with any of the design.

Original comment by mrab...@gmail.com on 30 Jun 2009 at 2:27

Attachments:

GoogleCodeExporter commented 9 years ago
I looked over the patch briefly.  It definitely looks reasonable to me.  I wish 
we
didn't have to have the code to create a libprofiler-compatible output file -- 
I'm
worried if we change the format in the C++ we'll forget to change it in the 
perl. 
But I don't see a great way around it.

I've passed the patch along to the original authors of pprof inside google, to 
get
their thoughts.  Unfortunately, they're on vacation, so I don't know when 
they'll
have a chance to look at it -- I'm sure they'll be swamped when they get back!  
That
said, I'm not planning a new release of perftools for another month or two, so 
we're
still on good track to get this in before the next release.

Original comment by csilv...@gmail.com on 1 Jul 2009 at 5:00

GoogleCodeExporter commented 9 years ago
I've attached below a code review from one of the people who has worked on 
pprof. 
Take a look at his comments and see what you think.

---
========================================================================
http://mondrian.corp.google.com/file/11714259///depot/google3/perftools/pprof?a=
1
File //depot/google3/perftools/pprof (snapshot 1)
------------------------------------
Line 317: "raw!"      => \$main::opt_raw,
put more spaces before => to be consistent with other options.
------------------------------------
Line 492: my $symbol_map = { };
We use {} in other places in the file.
------------------------------------
Line 2034: sub AddSymbols {
Maybe MergeSymbols() as the comment suggests? :)
------------------------------------
Line 2131: sub CheckCommandResult {
RunAndCheckCommandResult ?

Would be good to add a comment.  Does this return 1 on success?
------------------------------------
Line 2150:
two white spaces on the line?
------------------------------------
Line 2232: sub ReadSymbols {
Comment?  Would be good to mention that this updates $main::prog
------------------------------------
Line 2263: # Fetch symbols from $SYMBOL_PAGE for all PC values found in profile
The comment needs to be updated, as we now have an optional parameter
$symbol_map?
------------------------------------
Line 2273: my $post_data = join("+", sort((map {"0x" . "$_"} @pcs)));
Please add one-level indentation to the "if" block.
------------------------------------
Line 2493:
two spaces on the empty line?
------------------------------------
Line 2512: # the binary cpu profile data starts immediately after this line
Shoulnd't we set

$main::profile_type = 'cpu';

here?
------------------------------------
Line 2540: my $nbytes = read(PROFILE, $str, (stat PROFILE)[7]);   # read entire 
file
Just FYI.  Perl's way to read everything left seems to be:

my $str = do { local($/) ; <PROFILE> } ;

Per http://sysarch.com/Perl/slurp_article.html.

Original comment by csilv...@gmail.com on 2 Jul 2009 at 2:27

GoogleCodeExporter commented 9 years ago
Any chance to look at these code review comments?  I'd like to get this change 
into
the next release, but we'll need to deal with these comments first.

Original comment by csilv...@gmail.com on 31 Jul 2009 at 7:02

GoogleCodeExporter commented 9 years ago
All comments fixed!  Final testing in progress.  I tested all the basic 
operations,
and they seem to work producing identical output after doing something like:

   pprof <binary> <profile> --text > t1
   pprof <binary> <profile> --raw > t.raw
   pprof t.raw --text > t2
   diff t1 t2

We should add a unittest like that :)

Original comment by mrab...@gmail.com on 1 Aug 2009 at 5:00

Attachments:

GoogleCodeExporter commented 9 years ago
} All comments fixed!

Thanks!  I'll have folks here take another look.

One comment I have is that you run 'grep -q -- ...' at some point in
this script.  This isn't super-portable -- only gnu grep supports "--"
I believe, and solaris x86 grep doesn't even support "-q" -- and perl
has a built-in grep.  Maybe use that?  I don't know the the syntax for
sure, but something like
   if (grep { /<whatever>/ } <file>) ...

There's another form
   if (grep(/whatever/, <file>)) ...
which may or may not be better perl style; I don't know perl well
enough to say.

} We should add a unittest like that :)

Hmm, would you like to take a shot at it?  It would be nice to have a
pprof_unittest.sh, actually.  One easy way to do this is to use
src/tests/heap_profiler_unittest.sh as a starting point.  But you'd
change the Verify function to do something like you mention:
    pprof <binary> <profile> --text > t1
    pprof <binary> <profile> --raw > t.raw
    pprof t.raw --text > t2
    diff t1 t2

Original comment by csilv...@gmail.com on 3 Aug 2009 at 5:20

GoogleCodeExporter commented 9 years ago
OK, some other eyes have looked over this here, and have no comments beyond the 
grep
and unittest comments I made.  So once that's done, I think this patch will be 
good
to go in!

Original comment by csilv...@gmail.com on 4 Aug 2009 at 8:27

GoogleCodeExporter commented 9 years ago
Ping -- any more news on this?  I think we're very close to being ready to get 
this
patch in!

Original comment by csilv...@gmail.com on 17 Aug 2009 at 1:53

GoogleCodeExporter commented 9 years ago
Yeah, sorry for the delay.  I will make the unittests and you should
have this by Wed the 19th.

Original comment by mrab...@gmail.com on 17 Aug 2009 at 6:55

GoogleCodeExporter commented 9 years ago
Here is the unittest patch for profiler_unittest.sh!

Original comment by mrab...@gmail.com on 20 Aug 2009 at 6:04

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks for the unittest!  There was only one more thing, which was changing the 
grep
call to use the internal grep function.  I'd try it myself, but I'm not a perl 
whiz,
so if you wanted to give it a go, feel free!

Original comment by csilv...@gmail.com on 20 Aug 2009 at 6:25

GoogleCodeExporter commented 9 years ago
Modified pprof to use internal grep function.  FYI, everything I know (and hate)
about perl I learned during this change :)

This, combined with the separate unittest patch, should be good for release.

Original comment by mrab...@gmail.com on 22 Aug 2009 at 8:32

Attachments:

GoogleCodeExporter commented 9 years ago
Great!  We'll look to get this into the next release.

Original comment by csilv...@gmail.com on 24 Aug 2009 at 7:21

GoogleCodeExporter commented 9 years ago
I had to fix up a few remaining problems ($a >> 32 doesn't work right on 32-bit
systems), but the patch works as far as I can tell, and is now part of 
perftools 1.4
(just released).

Original comment by csilv...@gmail.com on 11 Sep 2009 at 6:54