qiuzijian / google-breakpad

Automatically exported from code.google.com/p/google-breakpad

Linux DWARF dumper outputs FUNC line with blank names for constructors (and destructors?) #364

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Probably assign this to jimb. My DWARF-fu is weak, but I'll try to take a look
too.

What steps will reproduce the problem?
1. Using google-breakpad r497 on Linux and g++ 4.2.4 from Ubuntu Hardy.
2. Build src/tools/linux/dump_syms/dump_syms
3. Compile the attached program with:

g++ -m32 -g test.cc -o test_dwarf
g++ -m32 -gstabs test.cc -o test_stabs

and run dump_syms on both binaries.

What is the expected output? What do you see instead?

For Foo::Foo(), test_dwarf outputs:

"FUNC 510 10 0 " <- minus the quotes
510 3 5 0
513 b 6 0
51e 2 7 0

while test_stabs outputs:

FUNC 510 10000000 0 Foo::Foo(int)
510 3 5 0
513 b 6 0
51e ffffff2 7 0

Other comments:

The stabs parameter size also looks funny, but that would be a separate problem.

Original issue reported on code.google.com by thestig@chromium.org on 26 Jan 2010 at 1:11
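
For readers comparing the two outputs above: a Breakpad FUNC record has the form
"FUNC address size parameter_size name", with the three numbers in hexadecimal.
The following is a minimal sketch of a parser for such a line, which makes a
blank name easy to detect. FuncRecord and ParseFuncRecord are hypothetical names
made up for this illustration; they are not part of Breakpad.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical holder for the fields of one FUNC record.
struct FuncRecord {
  uint64_t address = 0;
  uint64_t size = 0;
  uint64_t parameter_size = 0;
  std::string name;  // empty when the dumper emitted a blank name
};

// Parses "FUNC <address> <size> <parameter_size> <name>"; the numeric fields
// are hexadecimal. Returns false if the line is not a well-formed FUNC record.
bool ParseFuncRecord(const std::string& line, FuncRecord* out) {
  assert(out != nullptr);
  std::istringstream in(line);
  std::string tag;
  if (!(in >> tag) || tag != "FUNC") return false;
  if (!(in >> std::hex >> out->address >> out->size >> out->parameter_size))
    return false;
  // The name is the rest of the line and may itself contain spaces.
  out->name.clear();
  std::getline(in, out->name);
  const std::size_t first = out->name.find_first_not_of(' ');
  out->name = (first == std::string::npos) ? "" : out->name.substr(first);
  return true;
}
```

With this sketch, the test_dwarf line "FUNC 510 10 0 " parses with an empty
name, while the test_stabs line parses with the name Foo::Foo(int) and a size of
0x10000000.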

GoogleCodeExporter commented 9 years ago
I can reproduce this.

Original comment by jimbla...@gmail.com on 26 Jan 2010 at 2:19

GoogleCodeExporter commented 9 years ago
The problem is that the constructor is declared inline, so the definition DIE has
no low/high_pc attributes. We do have an entry for a concrete instance, which is
what produces the FUNC record, but that refers to the definition DIE via a
DW_AT_abstract_origin attribute. The name is only on the definition DIE, and the
dumper doesn't follow that reference, so the FUNC record gets a blank name.

I would expect the Mac dumper to have this bug as well --- is that the case?

Fixing this bug is probably a step towards inline function support, since we'll
follow DW_AT_abstract_origin references.

Original comment by jimbla...@gmail.com on 26 Jan 2010 at 2:24
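
To illustrate the lookup described in the comment above, here is a minimal
sketch of resolving a name through DW_AT_abstract_origin. It uses a toy DIE
table keyed by offset; ToyDie, DieTable, AddDie and ResolveName are hypothetical
names for this example only, and this is not the code in dwarf_cu_to_module.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in for a DWARF DIE; not Breakpad's or any library's real type.
struct ToyDie {
  std::string name;              // DW_AT_name, empty if absent
  uint64_t abstract_origin = 0;  // offset of the referenced DIE, 0 if absent
};

using DieTable = std::map<uint64_t, ToyDie>;  // keyed by DIE offset

// Registers a DIE. Offset 0 is reserved as the "no reference" sentinel.
void AddDie(DieTable* dies, uint64_t offset, const std::string& name,
            uint64_t abstract_origin) {
  assert(dies != nullptr && offset != 0);
  ToyDie die;
  die.name = name;
  die.abstract_origin = abstract_origin;
  (*dies)[offset] = die;
}

// Returns the DIE's own name if it has one; otherwise follows the chain of
// DW_AT_abstract_origin references until a named DIE is found.
std::string ResolveName(const DieTable& dies, uint64_t offset) {
  // Bound the walk so malformed, cyclic references cannot loop forever.
  for (std::size_t hops = 0; hops <= dies.size(); ++hops) {
    DieTable::const_iterator it = dies.find(offset);
    if (it == dies.end()) return "";
    if (!it->second.name.empty()) return it->second.name;
    if (it->second.abstract_origin == 0) return "";
    offset = it->second.abstract_origin;
  }
  return "";
}
```

In this toy model, the concrete instance of the constructor would be a DIE with
no name and an abstract_origin pointing at the definition DIE named Foo::Foo, so
ResolveName on the concrete instance returns Foo::Foo instead of a blank.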

GoogleCodeExporter commented 9 years ago
This is what I get on an OSX 10.5 machine:

MODULE mac x86 6C42A26FA6125E81E8193464264927F90 test_dwarf
PUBLIC e8c 0 start
PUBLIC ecc 0 dyld_stub_binding_helper
PUBLIC ee0 0 _dyld_func_lookup
PUBLIC eee 0 Foo::print()
PUBLIC f18 0 main
PUBLIC f56 0 Foo::Foo(int)
PUBLIC 1000 0 __progname
PUBLIC 1004 0 environ
PUBLIC 1008 0 NXArgv
PUBLIC 100c 0 NXArgc
PUBLIC 1010 0 dyld__mach_header

Original comment by thestig@chromium.org on 26 Jan 2010 at 9:48

GoogleCodeExporter commented 9 years ago
The Mac dumper probably uses DW_AT_MIPS_linkage_name. Does the constructor's DIE
have such an attribute on the Mac? The new parser tries to avoid using that,
since GCC has been providing that information in the proper way for years now.
(Long ago, GCC produced lousy DWARF for C++, and the debugger needed the linkage
name.)

Regardless --- it's a regression compared to STABS, so this needs to be fixed
quickly. I'm finishing up the CFI changes; I'll get to this next.

Original comment by jimbla...@gmail.com on 26 Jan 2010 at 11:15

GoogleCodeExporter commented 9 years ago
> The stabs parameter size also looks funny, but that would be a separate
> problem

Are you referring to this?

>FUNC 510 10000000 0 Foo::Foo(int)

The stabs reader has always produced '0' for the parameter size. On Linux, the
only three debugging regimes we've used are "traditional" %ebp-linked frames,
DWARF CFI, and having the debugger disassemble the machine code starting at the
function's entry point to figure out the frame format. We've never used
parameter sizes, the way Windows does.

Original comment by jimbla...@gmail.com on 27 Jan 2010 at 1:50

GoogleCodeExporter commented 9 years ago
No, I was referring to the function size: size=10000000 (hex) -> 256 MB?

Original comment by thestig@chromium.org on 27 Jan 2010 at 6:40

GoogleCodeExporter commented 9 years ago
Constructors are big. Americans especially like big constructors, the same way
they like big construction equipment, even though the Japanese (for example) do
just fine with much smaller excavators, backhoes, and so on.

STABS doesn't indicate the ending addresses of functions; the reader needs to
infer it from the address of the next function, or from the ending address of
the compilation unit, which STABS also sometimes doesn't provide. When the STABS
reader can't find any upper bound for the function (it would have to be the last
function in the file), 256MiB is its standard guess.

(This behavior predates my rewrite; I just preserved it.)

Original comment by jimbla...@gmail.com on 27 Jan 2010 at 7:36
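
The inference described above can be sketched roughly as follows. This is an
illustration under assumptions, not the actual STABS reader: ToyFunction,
kFallbackSize and InferSizes are hypothetical names, and a cu_end of 0 is assumed
to mean "the compilation unit's end address is unknown".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy function entry; STABS gives us the start address but no size.
struct ToyFunction {
  uint64_t address = 0;
  uint64_t size = 0;  // filled in by InferSizes
};

// The fallback guess mentioned in the thread: 0x10000000 bytes (256 MiB).
const uint64_t kFallbackSize = 0x10000000;

// Sorts the functions by address and sets each size to the distance to the
// next function's start. The last function is bounded by cu_end when that is
// known and lies above the function; otherwise it gets the fallback size.
void InferSizes(std::vector<ToyFunction>* functions, uint64_t cu_end) {
  assert(functions != nullptr);
  std::sort(functions->begin(), functions->end(),
            [](const ToyFunction& a, const ToyFunction& b) {
              return a.address < b.address;
            });
  for (std::size_t i = 0; i < functions->size(); ++i) {
    ToyFunction& f = (*functions)[i];
    if (i + 1 < functions->size()) {
      f.size = (*functions)[i + 1].address - f.address;
    } else if (cu_end > f.address) {
      f.size = cu_end - f.address;
    } else {
      f.size = kFallbackSize;
    }
  }
}
```

This matches the numbers in the original report: the function at 0x510 has no
known upper bound, so it gets size 0x10000000, and the last line record's size
of ffffff2 equals 0x510 + 0x10000000 - 0x51e, i.e. it runs to the guessed end of
the function.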

GoogleCodeExporter commented 9 years ago
One other thing I've noticed is that the DWARF names don't seem to have the type
info -- Foo::Foo vs. Foo::Foo(int), though Lei's OSX dump above does seem to have
it. Any pointers/thoughts on how to restore the type information?

I've been able to make a small change to dwarf_cu_to_module.cc that gets the
name attached in Lei's example. Is the name the only piece of information that
needs to be looked up via the DW_AT_abstract_origin attribute?

From this:

> STABS doesn't indicate the ending addresses of functions; the reader needs to
> infer it from the address of the next function, or from the ending address of
> the compilation unit, which STABS also sometimes doesn't provide. When the
> STABS reader can't find any upper bound for the function (it would have to be
> the last function in the file), 256MiB is its standard guess.

I take it that we don't need to worry about the STABS/DWARF difference in this
case?

Original comment by dm...@google.com on 28 Jan 2010 at 9:19

GoogleCodeExporter commented 9 years ago
The argument types are coming from the mangled name: you pass the mangled name
to the demangler and it gives you the fully-qualified name with argument types.
As I said, the mangled name isn't always available, so I'd like to avoid
depending on it. But adding types to the reader, even if only to print them, is
not going to be a small change. Maybe we should start consuming the mangled
names... :(

In principle, any attribute that doesn't vary from one inlined instance to
another should be retrieved via the DW_AT_abstract_origin attribute. If you
could put your patch for recovering the name via the DW_AT_abstract_origin link
up on rietveld (http://breakpad.appspot.com/), and list me as the reviewer, I'll
take a look at it.

> I take it that we don't need to worry about the STABS/DWARF difference
> in this case?

Yeah; that 0x10000000 is not important.

Original comment by jimbla...@gmail.com on 28 Jan 2010 at 9:56
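
As an illustration of how a mangled name yields the argument types, here is a
minimal sketch using abi::__cxa_demangle from <cxxabi.h>, which is available
with GCC and Clang. The Demangle wrapper is a hypothetical helper for this
example, and the mangled names used with it below are assumed to be what an
Itanium-ABI compiler would emit for the test program's Foo class.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangles an Itanium C++ ABI symbol name. Returns an empty string if the
// demangler reports an error for the input.
std::string Demangle(const char* mangled) {
  assert(mangled != nullptr);
  int status = 0;
  char* result = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || result == nullptr) return "";
  std::string demangled(result);
  std::free(result);  // the buffer is allocated with malloc by the demangler
  return demangled;
}
```

For example, Demangle("_ZN3FooC1Ei") gives Foo::Foo(int), which is the form seen
in the OSX dump, whereas DW_AT_name alone only provides Foo.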

GoogleCodeExporter commented 9 years ago
Using dwarfdump on the test program, I noticed that the print method has a
DW_AT_MIPS_linkage_name attribute containing the mangled name, but the
constructor doesn't get a DW_AT_MIPS_linkage_name attribute. I guess this is a
case where the mangled name isn't always available.

I can see the outlines of what needs to be done to add types to the reader, and
it doesn't look too bad, except that it appears even in this simple case that
some types are used before they are defined. If mangled names not being
available is fairly common (and since I'm seeing it in such a simple example,
I'm assuming it is) then it's probably worth it to go ahead & do it the right
way. Thoughts?

Original comment by dm...@google.com on 28 Jan 2010 at 10:17

GoogleCodeExporter commented 9 years ago
And in case it wasn't clear from my last comment, yes, I'm volunteering to add
types to the reader, if that's the best way to go.

Original comment by dm...@google.com on 28 Jan 2010 at 10:22

GoogleCodeExporter commented 9 years ago
It's not worth adding a dependency on mangled names if it won't get us all the
way.

That's great that you're willing to work on types! Could you open a separate bug
for that work, though?

Original comment by jimbla...@gmail.com on 28 Jan 2010 at 10:52

GoogleCodeExporter commented 9 years ago
I'll open a separate bug. Are types needed/useful for anything beyond producing
the type information in function/member names?

Original comment by dm...@google.com on 28 Jan 2010 at 11:19

GoogleCodeExporter commented 9 years ago
At the moment, that's all they're used for. The Breakpad symbol file format is
very simple; it just deals with the machine code->source code mapping, and never
mentions variables or types.

http://code.google.com/p/google-breakpad/wiki/SymbolFiles

Original comment by jimbla...@gmail.com on 29 Jan 2010 at 5:41

GoogleCodeExporter commented 9 years ago
Doug has posted a patch for this and I've commented, here:
http://breakpad.appspot.com/55006/show

Original comment by jimbla...@gmail.com on 30 Jan 2010 at 1:15

GoogleCodeExporter commented 9 years ago
I've landed Doug's patch for this as r520.

Original comment by jimbla...@gmail.com on 10 Feb 2010 at 5:57