Use -debug-info-kind=constructor for limited debug info in clang

Quuxplusone commented 4 years ago


Bugzilla Link	PR46537
Status	NEW
Importance	P enhancement
Reported by	Amy Huang (akhuang@google.com)
Reported on	2020-07-01 18:00:36 -0700
Last modified on	2020-08-18 11:59:30 -0700
Version	unspecified
Hardware	All All
CC	dblaikie@gmail.com, echristo@gmail.com, i@maskray.me, llvm-bugs@lists.llvm.org, maskray@google.com, neeilans@live.com, richard-llvm@metafoo.co.uk
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

This is the tracking bug for enabling constructor type homing (-debug-info-kind=constructor) in clang's limited debug info mode.

Constructor homing reduces the number of times class type info is emitted by only emitting it when constructors are emitted, so it reduces object file size.

Quuxplusone commented 4 years ago

I think this happened here: 227db86a1b7dd6f96f7df14890fcd071bc4fe1f5 ?

Quuxplusone commented 4 years ago

reverted in 394db2259575ef3cac8d3d37836b11eb2373c435 because it was causing warnings/errors in google builds, not sure about the details.

There's also a bug mentioned on the phabricator review (https://reviews.llvm.org/D79147), where ctor homing doesn't emit the definition for struct t1 in the below example

struct t1 {
  t1() = default;
  t1(int);
  int i;
};
t1 v1;

This was causing a test to fail in debuginfo-tests, which I apparently hadn't tried running.

I guess this is just a mistake in the ctor homing logic; it shouldn't try to home debug info for classes that have trivial constructors.

Quuxplusone commented 4 years ago

Might not be actionable/important, but this got me thinking about other cases
where ctor homing might fail. Specifically around optimizations that might
remove any mention of a type from a translation unit that does construct the
type (& passes it off to another translation unit that might then use that
type).

eg 1)
struct base { };
struct t1 : base {
  t1() { }
};
base* f1() {
  return new t1();
}

Since the ctor doesn't have any work to do, no instructions are attributed to
the ctor and so no trace of "t1" is kept alive in the debug info graph.

Despite not having a vtable, this could still be used in valid/plausible code -
a generic factory, then a corresponding generic destruction routine ("void
destroy(base *b) { delete static_cast<t1*>(b); }") and maybe the client code
knows the specific type in certain circumstances, casts back and uses it, etc.

eg 2)
struct base { };
void ctor_impl();
struct t1: base {
  int i;
  t1() { ctor_impl(); }
};
void f2();
base* f1(bool b) {
  if (b) {
    f2();
    return new t1();
  }
  return new t1();
}

Despite the ctor having non-trivial work (calling ctor_impl), because the two
ctors got coalesced together, the inline info had to be dropped (since neither
location could be correctly attributed to a specific ctor call).

This one might be the sort of thing we should fix in another way - if it
weren't for the inlining, we'd have two call sites to t1() that we'd say
happened at line 0. So perhaps when we inline them we should create a new
inline call site at location 0, improving the stack trace and keeping the type
debug info here.

I think in general though, ctor homing should probably do the same thing as
explicit template specialization homing, and add the type to the retained types
list. So there's no chance the metadata holding the type live could be
optimized away, and the type lost.

See CGDebugInfo::completeTemplateDefinition for an example of using
CGDebugInfo::completeUnusedClass that ensures the type is added to the retained
types list.

Quuxplusone commented 4 years ago

In google3 we were getting the error "fragment covers entire variable". It's a result of -debug-info-kind=constructor not emitting a class definition in a particular IR file. I'd guess it's also possible to run into this with -debug-info-kind=limited.

Repro is something like this (I haven't yet figured out how to run the verifier pass that creates the error; this repro just creates the IR file) $ cat repro.cc class thing { public: thing(const thing&) = default; class Impl; private: const Impl* impl_; };

class Thing { thing x_; };

Thing CreateThing(); Thing t() { static Thing x = []() -> Thing { Thing t = CreateThing(); return t; }(); return x; } $ clang -Xclang -debug-info-kind=constructor -O3 repro.cc -S -emit-llvm -o repro.ll

Basically transferSRADebugInfo creates a fragment for the DIGlobalVariable x because it doesn't have a variable size (as a result of Thing not being defined).

I don't know what fragments are / what they're used for; perhaps we should just not emit a fragment in this scenario?

Quuxplusone commented 4 years ago

I guess the issue is that transferSRADebugInfo uses the DIGlobalVariableExpression to get the variable size, and the type might not be defined in the debug info. But transferSRADebugInfo should be able to get the variable size from the GlobalVariable?

Quuxplusone commented 4 years ago

(In reply to Amy Huang from comment #5)

I guess the issue is that transferSRADebugInfo uses the DIGlobalVariableExpression to get the variable size, and the type might not be defined in the debug info. But transferSRADebugInfo should be able to get the variable size from the GlobalVariable?

Sounds plausible - are there any small/good test cases for transferSRADebugInfo that could be possibly readily hand-modified (removing a definition/reducing it to a declaration) to demonstrate the issue in a narrow way?

Quuxplusone commented 4 years ago

(In reply to David Blaikie from comment #6)

(In reply to Amy Huang from comment #5)

I guess the issue is that transferSRADebugInfo uses the DIGlobalVariableExpression to get the variable size, and the type might not be defined in the debug info. But transferSRADebugInfo should be able to get the variable size from the GlobalVariable?

Sounds plausible - are there any small/good test cases for transferSRADebugInfo that could be possibly readily hand-modified (removing a definition/reducing it to a declaration) to demonstrate the issue in a narrow way?

whoops, I put up a patch for it but somehow didn't cc you on it https://reviews.llvm.org/D85572. That's what I ended up doing for the test case.

Quuxplusone commented 4 years ago

(In reply to Amy Huang from comment #7)

(In reply to David Blaikie from comment #6)

(In reply to Amy Huang from comment #5)

I guess the issue is that transferSRADebugInfo uses the DIGlobalVariableExpression to get the variable size, and the type might not be defined in the debug info. But transferSRADebugInfo should be able to get the variable size from the GlobalVariable?

Sounds plausible - are there any small/good test cases for transferSRADebugInfo that could be possibly readily hand-modified (removing a definition/reducing it to a declaration) to demonstrate the issue in a narrow way?

whoops, I put up a patch for it but somehow didn't cc you on it https://reviews.llvm.org/D85572. That's what I ended up doing for the test case.

ah, looks good!

Quuxplusone commented 4 years ago

There is now a cc1 flag, -fuse-ctor-homing that will enable constructor homing only if limited debug info is already enabled (added in ae6523cd62a44642390f4f9d9ba9ce17d5bbbc58).

Quuxplusone / LLVMBugzillaTest

Use -debug-info-kind=constructor for limited debug info in clang #45507