Closed Michael137 closed 1 year ago
@llvm/issue-subscribers-lldb
Oh looks like for unions we're mistakenly adding an unnamed
padding FieldDecl
in DWARFASTParserClang::ParseSingleMember
. This confuses the RecordLayoutBuilder
because the Attrs
type now claims to have a FieldDecl
when it actually doesn't; regular
should be a VarDecl
(at least that's what it is when dumping the actual clang AST of this file).
// Handle static members, which is any member that doesn't have a bit or a
// byte member offset.
if (attrs.member_byte_offset == UINT32_MAX &&
attrs.data_bit_offset == UINT64_MAX) {
Type *var_type = die.ResolveTypeUID(die: attrs.encoding_form.Reference());
This is what we do for structures that have static members. But this doesn't happen for unions.
Because of this:
MemberAttributes::MemberAttributes(const DWARFDIE &die,
const DWARFDIE &parent_die,
ModuleSP module_sp) {
member_byte_offset = (parent_die.Tag() == DW_TAG_union_type) ? 0 : UINT32_MAX;
That logic was added in: https://github.com/llvm/llvm-project/commit/aaae5f87af8945cdeac81e86429a303954473a88
[DWARFASTParserClang] Start with member offset of 0 for members of union types.
Summary:
GCC does not emit DW_AT_data_member_location for members of a union.
Starting with a 0 value for member locations helps is reading union types
in such cases.
So it looks like the change in https://github.com/llvm/llvm-project/commit/aaae5f87af8945cdeac81e86429a303954473a88 directly conflicts with the change in https://github.com/llvm/llvm-project/issues/55040.
The latter makes it so we always check the presence of member_byte_offset
to determine whether we have a static data member. Whereas the former unconditionally adds a non-sentinel default member_byte_offset
to all union data members (regardless of whether they're static or not).
The effect of this is that LLDB never correctly detects that a union has static data members.
From both commit descriptions it sounds like both changes were done to support GCC (it neither emits a DW_AT_data_member_location
for union members, nor does it emit DW_AT_external
in some cases for static data members, e.g., in anonymous namespaces).
AFAICT, the change in https://github.com/llvm/llvm-project/commit/aaae5f87af8945cdeac81e86429a303954473a88 is only necessary so that we stop claiming non-static data members are static for GCC-built binaries.
I see that DW_AT_declaration
gets attached to non-static data members in both gcc and clang (and regardless of anonymous namespace). Could we use that as a heuristic? Then the solution could be as simple as:
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index aa45076bf2f7..8aaae7772df2 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -6,6 +6,7 @@
//
//===----------------------------------------------------------------------===//
+#include <climits>
#include <cstdlib>
#include "DWARFASTParser.h"
@@ -2494,8 +2495,9 @@ struct MemberAttributes {
DWARFFormValue encoding_form;
/// Indicates the byte offset of the word from the base address of the
/// structure.
- uint32_t member_byte_offset;
+ uint32_t member_byte_offset = UINT32_MAX;
bool is_artificial = false;
+ bool is_declaration = false;
};
/// Parsed form of all attributes that are relevant for parsing Objective-C
@@ -2668,8 +2670,6 @@ DiscriminantValue &VariantPart::discriminant() { return this->_discriminant; }
MemberAttributes::MemberAttributes(const DWARFDIE &die,
const DWARFDIE &parent_die,
ModuleSP module_sp) {
- member_byte_offset = (parent_die.Tag() == DW_TAG_union_type) ? 0 : UINT32_MAX;
-
DWARFAttributes attributes = die.GetAttributes();
for (size_t i = 0; i < attributes.Size(); ++i) {
const dw_attr_t attr = attributes.AttributeAtIndex(i);
@@ -2697,6 +2697,9 @@ MemberAttributes::MemberAttributes(const DWARFDIE &die,
case DW_AT_data_bit_offset:
data_bit_offset = form_value.Unsigned();
break;
+ case DW_AT_declaration:
+ is_declaration = form_value.Boolean();
+ break;
case DW_AT_data_member_location:
if (form_value.BlockData()) {
Value initialValue(0);
@@ -2935,10 +2938,11 @@ void DWARFASTParserClang::ParseSingleMember(
if (class_is_objc_object_or_interface)
attrs.accessibility = eAccessNone;
- // Handle static members, which is any member that doesn't have a bit or a
- // byte member offset.
- if (attrs.member_byte_offset == UINT32_MAX &&
- attrs.data_bit_offset == UINT64_MAX) {
+ // Handle static members, which is any member that's a non-defining declaration.
+ // GCC never emits a DW_AT_data_member_location for members of a union, so
+ // we can't use it as a heuristic to determine whether a data member is static
+ // or not.
+ if (attrs.is_declaration) {
Type *var_type = die.ResolveTypeUID(attrs.encoding_form.Reference());
@clayborg @jingham Any preference on what to do here? Looks like GCC still isn't attaching DW_AT_data_member_location
to non-static union members (whereas clang does).
CC @dwblaikie @adrian-prantl @labath because DWARF
As an aside, do we need to emit the DW_AT_data_member_location
for union members in clang at all? If they are all going to be 0 anyway?
Hmmm looks like DWARFv5 says we should emit static data members as DW_TAG_variable
instead of DW_TAG_member
?
5.7.7 Class Variable Entries A class variable (“static data member” in C++) is a variable shared by all instances of a class. It is represented by a debugging information entry with the tag DW_TAG_variable.
That would make it much easier to differentiate static vs non-static members. Don't think clang does this yet (gcc does)
So to summarise:
DW_AT_declaration
instead of DW_AT_external
/DW_AT_member_location
as a heuristic for whether a member is a static? This would avoid the anonymous namespace problem with GCC. If so, then we can also stop doing the default-initialisation of the member_byte_offsetDW_TAG_variable
for static data members in DWARFv5 and make LLDB understand this?DW_AT_data_member_location
for union members at all if they are all going to be 0 anyway? I guess the space savings of this wouldn't warrant the complexity for consumers, although that would align with what gcc is doingShould clang emit DW_TAG_variable for static data members in DWARFv5 and make LLDB understand this?
If the spec says so, I think that's an easy yes. You might need to make the behavior adjustable for some debugger tunings (though I'd argue the default should emit standards-conformant DWARF).
CC @dwblaikie
Summary is, if I'm reading this right:
DW_AT_external
, but that's not correct/portable, you could have non-external static members if they were members of a non-linkage class, such as a class in an anonymous namespaceDW_AT_data_member_offset
to differentiate static and non-static member variablesDW_AT_data_member_offset
of zero, because GCC emits union non-static members without a DW_AT_data_member_offset
(since they'd all be zero anyway) - unclear if that's to-spec or not? shrugDW_AT_data_member_offset
So we're discussing changing teh way clang emits static members to use DW_TAG_variable
instead of DW_TAG_member
to make it less ambiguous/solve the above issues "once and for all" fingers crossed
And you've tagged me as a heads up that we might be/are going to go in this direction/check I'm cool with it - yeah, sounds good to me. Doing things more clearly, and if that happens to be "more like GCC" I'm generally for it - they're still the bigger fish in the DWARF sea, and it's usually a safe direction to head unless there's extenuating circumstances.
At least our internal use ships pretty close to LLDB, so we don't have a particular need for some clang mode that's deeply backwards compatible (the inverse is somewhat true - be good if LLDB didn't just totally give up on doing some of the old heuristics - still able to debug old binaries as best it can, which seems like it wouldn't be too expensive/problematic to maintain for a while at least) & this shouldn't present a GDB compatibility issue, since it'll be closer to what GCC produces anyway.
Summary is, if I'm reading this right:
- LLDB used to detect static members by the presence of
DW_AT_external
, but that's not correct/portable, you could have non-external static members if they were members of a non-linkage class, such as a class in an anonymous namespace- LLDB then started using the presence of
DW_AT_data_member_offset
to differentiate static and non-static member variables- LLDB also started assuming members of a union have a
DW_AT_data_member_offset
of zero, because GCC emits union non-static members without aDW_AT_data_member_offset
(since they'd all be zero anyway) - unclear if that's to-spec or not? shrug- That broke static-member detection inside unions because all the members got an implicit
DW_AT_data_member_offset
So we're discussing changing teh way clang emits static members to use
DW_TAG_variable
instead ofDW_TAG_member
to make it less ambiguous/solve the above issues "once and for all" fingers crossedAnd you've tagged me as a heads up that we might be/are going to go in this direction/check I'm cool with it - yeah, sounds good to me. Doing things more clearly, and if that happens to be "more like GCC" I'm generally for it - they're still the bigger fish in the DWARF sea, and it's usually a safe direction to head unless there's extenuating circumstances.
At least our internal use ships pretty close to LLDB, so we don't have a particular need for some clang mode that's deeply backwards compatible (the inverse is somewhat true - be good if LLDB didn't just totally give up on doing some of the old heuristics - still able to debug old binaries as best it can, which seems like it wouldn't be too expensive/problematic to maintain for a while at least) & this shouldn't present a GDB compatibility issue, since it'll be closer to what GCC produces anyway.
Yup! those are exactly the issues
Alright sounds like we'll resolve this by simply implementing the DWARFv5 "DW_TAG_variable
for static data members" feature. LLDB should mostly work out of the box with that.
To avoid crashing on older DWARF i think we could add a best effort attempt to detect the cases where we might have a static data member in a union and do the right thing. E.g., by adding back the check for DW_AT_external
. These two solutions won't conflict with each other.
Proposed solution to handle pre-DWARFv5 here: https://github.com/llvm/llvm-project/pull/68300
(on llvm.org lldb build at commit
0bfaed8c612705cfa8c5382d26d8089a0a26386b
)This will infinitely recurse in
ASTContext::getASTRecordLayout
. Possibly the same crash as: