Closed GoogleCodeExporter closed 9 years ago
Hmm, an interesting idea. I'd like to understand the specifics of the patch a
little
more.
It looks like --raw basically just copies $collected_profile. But why do you
need to
do that? Doesn't pprof already save this raw data for you? I see this code:
$main::collected_profile = $real_profile;
where $real_profile is /home/pprof/something. Why can't you just use that
directly?
Alternately, you could set PPROF_TMPDIR before running the script, and put the
profile into another directory besides /home/pprof.
As for --save_sym, I guess you could just as easily run the 'curl' (or 'wget')
command yourself to http://yourhost/pprof/symbols (or whatever the url is), and
store
those symbols somewhere. pprof doesn't do any special processing for this,
does it?
I do like the idea of allowing a symbol file in lieu of an entire executable,
as an
arg to pprof. But instead of requiring a --use_sym_file flag, why not just
detect it
automatically? You should be able to look at the file contents and see if it's
a sym
file or not, and if so to parse it appropriately. (One idea would be to just
parse
the file as if it were a sym file, and if you get anything out of it -- any of
the
regexps you ahve match -- it's probably a sym file.)
As you can tell, I'm trying to reduce the need for commandline flags, which
pprof has
too much of already. The more we can do automatically, or outside pprof when
pprof
doesn't add any value, the happier it makes me. :-)
Let me know what you think.
Original comment by csilv...@gmail.com
on 16 Jun 2009 at 5:02
On Tue, Jun 16, 2009 at 10:03 AM, <codesite-noreply@google.com> wrote:
> It looks like --raw basically just copies $collected_profile. But why do
> you need to
> do that? Doesn't pprof already save this raw data for you? I see this
> code:
> $main::collected_profile = $real_profile;
> where $real_profile is /home/pprof/something. Why can't you just use that
> directly?
> Alternately, you could set PPROF_TMPDIR before running the script, and put
> the
> profile into another directory besides /home/pprof.
I could use that tmp file directly, but I think it's somewhat black
magic as to what it's named. Pprof doesn't expose (right now) what
the name will be -- it's some mangling (I imagine of host, port, time,
whatever) so after several runs of pprof it's hard to say which is the
newest file, and icky. If you don't want a commandline arg, I can do
a PPROF_FORCE_TMPFILE_NAME env-var (with a better name :P) or
something so I can control the mangling and use that in combination
with PPROF_TMPDIR, but I think --raw is pretty clean and simple and
won't confuse people.
> As for --save_sym, I guess you could just as easily run the 'curl' (or
> 'wget')
> command yourself to http://yourhost/pprof/symbols (or whatever the url is),
> and store
> those symbols somewhere. pprof doesn't do any special processing for this,
> does it?
pprof does do some processing: it uses HTTP POST to upload the list of
hex addresses that it needs symbols for, and /pprof/symbols on my host
(or equivalent) only returns the symbols requested. Some binaries
might have a huge number of symbols (some from libs), and stringifying
symbols is somewhat expensive; it's a lot cleaner to enumerate the
ones you need. I am not sure it's reasonable to have a binary return
_all_ its symbols from all of its libs (including shlibs it uses).
> I do like the idea of allowing a symbol file in lieu of an entire
> executable, as an
> arg to pprof. But instead of requiring a --use_sym_file flag, why not just
> detect it
> automatically? You should be able to look at the file contents and see if
> it's a sym
> file or not, and if so to parse it appropriately. (One idea would be to
> just parse
> the file as if it were a sym file, and if you get anything out of it -- any
> of the
> regexps you ahve match -- it's probably a sym file.)
Absolutely, this part I agree with -- we can just provide a "---
symbols" header similarly to how we detect contention profiles and
automatically use it if provided.
> As you can tell, I'm trying to reduce the need for commandline flags, which
> pprof has
> too much of already. The more we can do automatically, or outside pprof
> when pprof
> doesn't add any value, the happier it makes me. :-)
Sure, I completely understand your motivations.
>
> Let me know what you think.
>
My answers inline!
Original comment by mrab...@gmail.com
on 17 Jun 2009 at 6:22
Hmm, when you put it that way, --raw as another output mode does make sense.
So with detecting the sym file automatically, we're just left with the
--save_sym
flag. I admit I still don't love it, just because pprof is kinda designed to
have
only one type of output, and I feel we should probably keep it that way. What
if we
had another output mode that just dumped symbols? --raw-symbols or some such?
Then
you'd have to call pprof twice to do what you want to do, but maybe that's ok?
Anyway, I think a good next step is to send a revision of the patch, that does
the
auto-detection of the raw symbol file, and does whatever you think is best for
saving
symbols. I'll be glad to take another look.
Original comment by csilv...@gmail.com
on 18 Jun 2009 at 3:15
} A final idea that I just had would be to instead combine the two files
} into one and INSERT the symbols into the file we dump with "--raw".
} In fact, since two different profiles would require two different
} symbols files (as the symbol set might differ), it makes perfect sense
} to dump a "combination file" that looks something like:
}
} --- symbols
} 0x3423403203 abc:def:foo() foo.cc:35
} 0x3423403203 abc:def:foo() foo.cc:35
} --- profile
} 2 0x3423403203 0x3423403203 0x3423403203 0x3423403203
} 19 0x3423403203 0x3423403203 0x3423403203 0x3423403203
} 35 0x3423403203 0x3423403203 0x3423403203 0x3423403203
I like this idea! Let's see what a patch to do that looks like.
} Perhaps "--raw" should then be "--symbolized-profile" or similar.
Makes sense.
Original comment by csilv...@gmail.com
on 20 Jun 2009 at 4:33
I have a new patch attached, as we discussed. The new features as follows:
* New option has been added, '--symsprof' which can only be used with remote
fetch
and dumps symbols AND the collected profile to stdout in a new format.
The new format is formed by prefixing a special symbols section to a regular
profile
as follows:
--- symbols
0x999999 ns::ns::func() file.ext:99
0x999999 ns::ns::func() file.ext:99
...
---
<regular format CPU, heap, contention, etc. profile follows>
Pprof, in addition, will now automatically detect when the profiles given on the
commandline contain symbols and will not require a binary file in this case.
This
works for all types of profiles (I tested on CPU and contention).
I will continue testing this more thoroughly to insure it's bulletproof --
please
excuse me if any bugs are currently present, but I wanted to get early feedback
on
whether the design is acceptable.
Original comment by mrab...@gmail.com
on 23 Jun 2009 at 3:18
Attachments:
I just looked over the patch briefly, but the design seems good to me.
A few comments:
1) Why does --symprof mode only work for remote fetches? If you give a binary
and
profile file, can't it emit a symprof file just fine? That might be useful in
certain situations.
2) Maybe include the binary name in the --symprof output. Right now you say
"(unknown)" for the binary name, it looks like.
3) I'm not a huge fan of the name --symprof. It just seems a bit obscure. I
think
--raw would still be fine: we're emitting the raw data we got from this profile
(not
just the profile, but the symbol-table). Or --raw_profile. Or maybe
--symbolized_raw_profile? Now the names are getting a bit long, but at least
they're
descriptive...
4) I noticed a few style nits (need a space between 'if' and open parens, etc).
You
may want to give the patch a once-over for that.
} but I wanted to get early feedback on whether the design is acceptable.
Sounds good. Let me know when you're ready for a more careful look.
Original comment by csilv...@gmail.com
on 23 Jun 2009 at 8:07
I have yet another iteration ready. I think all the design and tests are in
place,
and it works in my testing, though it obviously may still have some subtle bugs.
The profile is dumped always in a CPU-profile format. The reason for that was
that
you can provide multiple profiles on the command-line (additive), or provide a
--base=profile (subtractive). I wanted to output the profile which is the
result of
all these additions/subtractions, just like the other output modes (--text,
--gif,
whatever) would. However, I was too lazy to write the 3 different output modes
(memory, contention, CPU). In addition, it seems like some of the information
is
lost during profile read/subtraction -- for instance, for contention the input
contains both time waited and number of times waited but this seems to be lost
by the
time we start really working with the profile.
However, this still works fine with heap and contention profiles with the
caveat that
the output is now reported in "# of samples" rather than megabytes or seconds
waited,
so the units may be off by some number of zeroes. We can expand this in a later
patch to be more complete.
A unittest would be nice, but I'm very unsure on how to do those in the
gperftools
setup and was hoping you might be able to help with that.
Original comment by mrab...@gmail.com
on 30 Jun 2009 at 1:49
Attachments:
whoops, ignore this patch please, bad diff. new one coming shortly.
Original comment by mrab...@gmail.com
on 30 Jun 2009 at 1:52
ick, I borked something in my git branch :) Will fix and get you the clean patch
tomorrow, sorry for the thrash!
Original comment by mrab...@gmail.com
on 30 Jun 2009 at 2:05
fixed patch attached. I've changed my mind and am now dumping an exact copy of
the
remote-fetch profile when used with remote fetch, but outputting a CPU-format
profile
of my own construction if a local profile file is used. I think a better logic
would be:
if (remote_fetch) {
output remote collected profile w/ symbols
} else if (only 1 profile file provided on cmdline) {
output a copy of the profile file provide (w/ symbols)
} else {
do additions/diffing as needed
output a cpu-style profile of the sum (or difference)
}
I'll update that in a patch soon, but please take a look at the current work
and let
me know if you disagree with any of the design.
Original comment by mrab...@gmail.com
on 30 Jun 2009 at 2:27
Attachments:
I looked over the patch briefly. It definitely looks reasonable to me. I wish
we
didn't have to have the code to create a libprofiler-compatible output file --
I'm
worried if we change the format in the C++ we'll forget to change it in the
perl.
But I don't see a great way around it.
I've passed the patch along to the original authors of pprof inside google, to
get
their thoughts. Unfortunately, they're on vacation, so I don't know when
they'll
have a chance to look at it -- I'm sure they'll be swamped when they get back!
That
said, I'm not planning a new release of perftools for another month or two, so
we're
still on good track to get this in before the next release.
Original comment by csilv...@gmail.com
on 1 Jul 2009 at 5:00
I've attached below a code review from one of the people who has worked on
pprof.
Take a look at his comments and see what you think.
---
========================================================================
http://mondrian.corp.google.com/file/11714259///depot/google3/perftools/pprof?a=
1
File //depot/google3/perftools/pprof (snapshot 1)
------------------------------------
Line 317: "raw!" => \$main::opt_raw,
put more spaces before => to be consistent with other options.
------------------------------------
Line 492: my $symbol_map = { };
We use {} in other places in the file.
------------------------------------
Line 2034: sub AddSymbols {
Maybe MergeSymbols() as the comment suggests? :)
------------------------------------
Line 2131: sub CheckCommandResult {
RunAndCheckCommandResult ?
Would be good to add a comment. Does this return 1 on success?
------------------------------------
Line 2150:
two white spaces on the line?
------------------------------------
Line 2232: sub ReadSymbols {
Comment? Would be good to mention that this updates $main::prog
------------------------------------
Line 2263: # Fetch symbols from $SYMBOL_PAGE for all PC values found in profile
The comment needs to be updated, as we now have an optional parameter
$symbol_map?
------------------------------------
Line 2273: my $post_data = join("+", sort((map {"0x" . "$_"} @pcs)));
Please add one-level indentation to the "if" block.
------------------------------------
Line 2493:
two spaces on the empty line?
------------------------------------
Line 2512: # the binary cpu profile data starts immediately after this line
Shoulnd't we set
$main::profile_type = 'cpu';
here?
------------------------------------
Line 2540: my $nbytes = read(PROFILE, $str, (stat PROFILE)[7]); # read entire
file
Just FYI. Perl's way to read everything left seems to be:
my $str = do { local($/) ; <PROFILE> } ;
Per http://sysarch.com/Perl/slurp_article.html.
Original comment by csilv...@gmail.com
on 2 Jul 2009 at 2:27
Any chance to look at these code review comments? I'd like to get this change
into
the next release, but we'll need to deal with these comments first.
Original comment by csilv...@gmail.com
on 31 Jul 2009 at 7:02
All comments fixed! Final testing in progress. I tested all the basic
operations,
and they seem to work producing identical output after doing something like:
pprof <binary> <profile> --text > t1
pprof <binary> <profile> --raw > t.raw
pprof t.raw --text > t2
diff t1 t2
We should add a unittest like that :)
Original comment by mrab...@gmail.com
on 1 Aug 2009 at 5:00
Attachments:
} All comments fixed!
Thanks! I'll have folks here take another look.
One comment I have is that you run 'grep -q -- ...' at some point in
this script. This isn't super-portable -- only gnu grep supports "--"
I believe, and solaris x86 grep doesn't even support "-q" -- and perl
has a built-in grep. Maybe use that? I don't know the the syntax for
sure, but something like
if (grep { /<whatever>/ } <file>) ...
There's another form
if (grep(/whatever/, <file>)) ...
which may or may not be better perl style; I don't know perl well
enough to say.
} We should add a unittest like that :)
Hmm, would you like to take a shot at it? It would be nice to have a
pprof_unittest.sh, actually. One easy way to do this is to use
src/tests/heap_profiler_unittest.sh as a starting point. But you'd
change the Verify function to do something like you mention:
pprof <binary> <profile> --text > t1
pprof <binary> <profile> --raw > t.raw
pprof t.raw --text > t2
diff t1 t2
Original comment by csilv...@gmail.com
on 3 Aug 2009 at 5:20
OK, some other eyes have looked over this here, and have no comments beyond the
grep
and unittest comments I made. So once that's done, I think this patch will be
good
to go in!
Original comment by csilv...@gmail.com
on 4 Aug 2009 at 8:27
Ping -- any more news on this? I think we're very close to being ready to get
this
patch in!
Original comment by csilv...@gmail.com
on 17 Aug 2009 at 1:53
Yeah, sorry for the delay. I will make the unittests and you should
have this by Wed the 19th.
Original comment by mrab...@gmail.com
on 17 Aug 2009 at 6:55
Here is the unittest patch for profiler_unittest.sh!
Original comment by mrab...@gmail.com
on 20 Aug 2009 at 6:04
Attachments:
Thanks for the unittest! There was only one more thing, which was changing the
grep
call to use the internal grep function. I'd try it myself, but I'm not a perl
whiz,
so if you wanted to give it a go, feel free!
Original comment by csilv...@gmail.com
on 20 Aug 2009 at 6:25
Modified pprof to use internal grep function. FYI, everything I know (and hate)
about perl I learned during this change :)
This, combined with the separate unittest patch, should be good for release.
Original comment by mrab...@gmail.com
on 22 Aug 2009 at 8:32
Attachments:
Great! We'll look to get this into the next release.
Original comment by csilv...@gmail.com
on 24 Aug 2009 at 7:21
I had to fix up a few remaining problems ($a >> 32 doesn't work right on 32-bit
systems), but the patch works as far as I can tell, and is now part of
perftools 1.4
(just released).
Original comment by csilv...@gmail.com
on 11 Sep 2009 at 6:54
Original issue reported on code.google.com by
mrab...@gmail.com
on 15 Jun 2009 at 12:12Attachments: