[DWARF] Add count of template methods to the class definition

Quuxplusone commented 6 years ago


Bugzilla Link	PR37816
Status	NEW
Importance	P enhancement
Reported by	Paul Robinson (paul_robinson@playstation.sony.com)
Reported on	2018-06-15 11:49:07 -0700
Last modified on	2018-10-25 20:12:02 -0700
Version	unspecified
Hardware	All All
CC	aprantl@apple.com, clayborg@gmail.com, dblaikie@gmail.com, echristo@gmail.com, labath@google.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

Add a DWARF attribute to the class definition, reporting a count of
template methods (not instantiations) defined within the class.

In a discussion between myself and Greg Clayton
http://lists.llvm.org/pipermail/llvm-dev/2018-June/124085.html
this seemed to be a fairly cheap way to provide LLDB with useful
information to help it translate DWARF into Clang ASTs without
excessive performance cost.

Of course this would have to be plumbed through from Clang CodeGen
through DIBuilder and down into DwarfDebug, with requisite stops
along the way in the bitcode reader/writer.  It can't be derived
within LLVM because this is a count of template *definitions* not
*instantiations*.

Probably worth proposing for DWARF v6; in the meantime it would
have to be an LLVM extension attribute.

Quuxplusone commented 6 years ago

Wouldn't it be better to list the names of the methods, instead of just their count?

If you have just a count you still have trawl through every possible definition of the given class. That's still a lot of work and I don't think template methods are than uncommon (e.g., pretty much any STL class has at least one template method. So if we need std::vector, we would have to open any CU which uses std::vector, which is probably all of them.).

If we have a name, then we could use the accelerator table to point us directly to all uses of e.g. std::vector::emplace_back. (This assumes the accelerator table will have an entry for "emplace_back", which right now isn't true (it will contain "emplace_back"), but that is something we've also been talking about changing.)

Quuxplusone commented 6 years ago

TL;DR: Regardless of the optimal solution, a count attribute is not
particularly costly or harmful, and in most cases you want it anyway.

The question is, how much extra stuff are you willing to impose on the
compiler, in order for LLDB to get the information it wants.  LLDB is
unique (AFAIK) in wanting to build a Clang AST, with the method templates
included.  DWARF was never intended to build an AST.

Currently you would have to parse all DWARF everywhere to be sure you had
all the method templates.  Very expensive for the debugger.

A count attribute is very low overhead for the debug info and pretty low
overhead for the compiler.  It also tells LLDB whether the compiler is
aware that LLDB wants this information, which is something Greg was very
concerned about.  With the index you can find all other definitions (i.e.,
other than the one you picked to be the canonical definition) of that
class and scan them for method template instantiations.  More work than
you would *prefer* but obviously much faster than parsing all DWARF.
You would be looking only at the subprogram children of the class, to
see whether they have template-parameter children; I'd think you could
derive this mostly by looking at abbreviations, with minimal decoding
of the class itself.

If we added method-instantiation entries to the index, we'd still need to
address the concern of whether the compiler knew you wanted them.  The
index does have an augmentation string so you'd be able to tell whether
it contained extra template-method entries.  However you would have to do
a linear search of the entire index to find them.  (The name table is not
sorted.)  The names are all references into .debug_str so that scan is not
particularly fast.  Because you don't know the template-parameter parts of
the names of the instances, you can't use the hash table.

If we added method templates to DWARF, you get a "list" by defining a new
tag type and add a child to the class for each template.  Again you would
need some indication that the producer knew you wanted these, so that the
absence of method template children definitively meant there aren't any.
This is higher overhead for the compiler and debug info, but obviously
cheapest for the debugger because you don't need to go looking at any
other class definitions.

So, pretty much any "complete" solution requires some indication that the
compiler knows you want the information, and the most straightforward way
to do that is to define an LLDB-specific attribute on the class, which is
to say the count of template methods.  The actual value of the count is
not relevant if we go with adding method template tags to the class, and
in that case could degenerate to a flag, but the number is otherwise an
important datum because it lets you know when you are done looking for
things elsewhere.

Quuxplusone commented 6 years ago

I agree that we will also need to provide the ability to say "you don't need to
look elsewhere, this class definition has complete information". My argument
wasn't so much *against* that extra attribute, as it was *for* a method list.

I also agree that we need to balance multiple aspects here. The size aspect is
important although it is not among my topmost concerns.. What bothers me more
is that dwarf so far has only described fully-instantiated entities, and
putting information about uninstantiated templates is a departure from that.
This is particularly important as this requirement of lldb is somewhat
"artificial", as one can imagine a world where lldb does what it is doing now,
even without incurring the overhead of parsing all class definitions.

Quuxplusone commented 6 years ago

Right, we are in "violent agreement" :-) for sure I don't see a new tag
to describe method templates as becoming part of the DWARF standard.

My starting point for that long post was that there are two ways to emit
a "list" in DWARF.  One is to have a new tag and do children, which is
what I was describing.  The second way, which I didn't get to, is to have
yet another special section that has the list you want, which can clearly
have info in whatever format the compiler and debugger can agree on as
meeting an appropriate balance of work for each of them.

It's possible that a count of method templates could become standard
DWARF.  Adding entries to the index with an appropriate augmentation
string is entirely within the standard now.  Anything more than that is
unlikely to be adopted, and would be a special understanding between
LLDB and any compilers it could persuade to emit the same extra info.

Quuxplusone / LLVMBugzillaTest

[DWARF] Add count of template methods to the class definition #36789