Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

lldb crashes with relatively simple gcc debug info #50785

Open Quuxplusone opened 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR51818
Status CONFIRMED
Importance P normal
Reported by jgorbe@google.com
Reported on 2021-09-10 13:17:30 -0700
Last modified on 2021-11-04 02:24:53 -0700
Version unspecified
Hardware All All
CC dblaikie@gmail.com, jan@jankratochvil.net, jdevlieghere@apple.com, llvm-bugs@lists.llvm.org, manas18244@iiitd.ac.in, teemperor@gmail.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

Building the following program with gcc 10.2.1:

#include <optional>
#include <string>

int main()
{
  std::optional<std::string> os{ "stringy" };
  return 0;
}

and trying to print os from lldb, results in infinite recursion inside clang::ASTContext::getASTRecordLayout leading to a crash.

I have investigated this for a while and I got to the following reduced test case:

template <bool> struct S {};
template <> struct S<false> : S<true> {};
S<false> s;

What happens is that gcc omits DW_TAG_template_parameter in some cases (for unnamed template arguments, it seems). lldb passes clang an AST that says that S is a struct S that inherits from S<>, and clang recurses infinitely trying to compute the class layout. This is the AST as dumped by lldb-test symbols -dump-clang-ast:

Module: cxx17.o S ClassTemplateSpecializationDecl 0xf2571d0 <> struct S definition -DefinitionData pass_in_registers empty aggregate standard_layout trivially_copyable pod trivial literal has_constexpr_non_copy_move_ctor can_const_default_init |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param |-MoveConstructor exists simple trivial needs_implicit |-CopyAssignment simple trivial has_const_param needs_implicit implicit_has_const_param |-MoveAssignment exists simple trivial needs_implicit -Destructor simple irrelevant trivial needs_implicit S ClassTemplateSpecializationDecl 0xf257340 <> struct S definition |-DefinitionData pass_in_registers empty standard_layout trivially_copyable trivial literal has_constexpr_non_copy_move_ctor can_const_default_init | |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param | |-MoveConstructor exists simple trivial needs_implicit | |-CopyAssignment simple trivial has_const_param needs_implicit implicit_has_const_param | |-MoveAssignment exists simple trivial needs_implicit | -Destructor simple irrelevant trivial needs_implicit -public 'S<>'

getASTRecordLayout tries to create an EmptySubobjectMap, and its constructor calls ComputeEmptySubobjectSizes. This function enumerates the base classes of the current class, and calls getASTRecordLayout again. Here's a relevant fragment of a stack trace:

[... 6993 more frames of these 3 functions recursing ...]

6994 0x00007ffff5d7a05a in (anonymous namespace)::EmptySubobjectMap::EmptySubobjectMap (this=0x7fffffff3df8, Context=..., Class=0x6f2500)

at /home/jgorbe/code/llvm/clang/lib/AST/RecordLayoutBuilder.cpp:172                                                       

6995 0x00007ffff5d79687 in clang::ASTContext::getASTRecordLayout (this=0x6a5e80, D=0x6f2500)

at /home/jgorbe/code/llvm/clang/lib/AST/RecordLayoutBuilder.cpp:3290

6996 0x00007ffff5d7f5e1 in (anonymous namespace)::EmptySubobjectMap::ComputeEmptySubobjectSizes (this=0x7fffffff4bf8)

at /home/jgorbe/code/llvm/clang/lib/AST/RecordLayoutBuilder.cpp:193

6997 0x00007ffff5d7a05a in (anonymous namespace)::EmptySubobjectMap::EmptySubobjectMap (this=0x7fffffff4bf8, Context=..., Class=0x6f2500)

at /home/jgorbe/code/llvm/clang/lib/AST/RecordLayoutBuilder.cpp:172

6998 0x00007ffff5d79687 in clang::ASTContext::getASTRecordLayout (this=0x6a5e80, D=0x6f2500)

at /home/jgorbe/code/llvm/clang/lib/AST/RecordLayoutBuilder.cpp:3290

6999 0x00007ffff5d7f5e1 in (anonymous namespace)::EmptySubobjectMap::ComputeEmptySubobjectSizes (this=0x7fffffff59f8)

at /home/jgorbe/code/llvm/clang/lib/AST/RecordLayoutBuilder.cpp:193

7000 0x00007ffff5d7a05a in (anonymous namespace)::EmptySubobjectMap::EmptySubobjectMap (this=0x7fffffff59f8, Context=..., Class=0x6f2500)

at /home/jgorbe/code/llvm/clang/lib/AST/RecordLayoutBuilder.cpp:172

7001 0x00007ffff5d79687 in clang::ASTContext::getASTRecordLayout (this=0x6a5e80, D=0x6f2500)

at /home/jgorbe/code/llvm/clang/lib/AST/RecordLayoutBuilder.cpp:3290

7002 0x00007ffff2bfc53f in (anonymous namespace)::CGRecordLowering::CGRecordLowering (this=0x7fffffff60a0, Types=..., D=0x6f2500, Packed=false)

at /home/jgorbe/code/llvm/clang/lib/CodeGen/CGRecordLayoutBuilder.cpp:234

7003 0x00007ffff2bfb8d9 in clang::CodeGen::CodeGenTypes::ComputeRecordLayout (this=0x6c09f8, D=0x6f2500, Ty=0x6c3170)

at /home/jgorbe/code/llvm/clang/lib/CodeGen/CGRecordLayoutBuilder.cpp:879

7004 0x00007ffff267324f in clang::CodeGen::CodeGenTypes::ConvertRecordDeclType (this=0x6c09f8, RD=0x6f2500)

at /home/jgorbe/code/llvm/clang/lib/CodeGen/CodeGenTypes.cpp:851

[...]

Quuxplusone commented 3 years ago

Thanks for the detailed bug report!

So the recursion happens because we model both S<false> and S<true> as S<>. We don't have the template arg DIEs for those templates so we don't actually parse any template args or put any into our Clang AST. This is alright in the AST we build directly in LLDB, but when the ASTImporter it tasked with importing both S<> into the expression AST (or any other AST) it tries to merge identical records. In this case it seems it tries to import the false version, but when trying to import the base class (-> the true version), it realizes that it already has an S<> and just reuses that. So then we have the false version with itself as the base class.

The whole EmptySubobjectMap is just trying to calculate the size of S<false> which will infinite recurse as the size depends on the base class (which is itself).

This seems like a pain to fix. I just threw this snippet into GCC:

template<int>
struct F {};

template<class, int>
struct X {};

int main() {
  X<F<1>, 2> x;
  F<2> f;
}

and that's what it gave me (I stripped the parts that aren't relevant):

0x0000002e:   DW_TAG_structure_type
                DW_AT_name  ("X<F<1>, 2>")
                DW_AT_decl_column   (8)

0x00000034:   DW_TAG_structure_type
                DW_AT_name  ("F<2>")
                DW_AT_decl_column   (8)

0x0000005c:     DW_TAG_variable
                  DW_AT_name    ("x")
                  DW_AT_type    (0x0000002e "X<F<1>, 2>")

0x00000068:     DW_TAG_variable
                  DW_AT_name    ("f")
                  DW_AT_type    (0x00000034 "F<2>")

I don't think there is a good way how we can turn X<F<1>, 2> into any reasonable Clang AST. First of all, we would need to roll out our own parser for parsing the template argument string. That parser would have to deal with a lot of compiler-specific complexity and corner-cases (e.g., "How do different compilers name decls in unnamed namespaces?") and probably would really slow down DWARF parsing as we probably need to lookup names while parsing that string. Also there just isn't enough information to actually make an accurate AST. The size of F<1> is not described for example and the same goes for the type of 2 (is it an unsigned, int, char, etc.). We could start making stuff up but that's usually not something that ends well.

The other solution would be that we just make it clear that we can't do anything here. We could synthesise some random __lldb_unknown_template_arg__HASH class and just use that. It won't get merged by accident so it won't cause situations like this. Also it's relatively obvious to the user that we're making stuff up here. This is also kinda complicated as it also requires us to actually have a basic parser for template names to detect when we are in this case, but at least we only need to detect inconsistencies and not actually resolve the template name into actual template argument list.

Quuxplusone commented 3 years ago

Just to avoid confusion: The 'simple' solution where we just make up a fake dummy argument needs the parser as we need to do it in every case where the template parameters don't match the name. GCC will avoid emitting template parameter DIEs for unnamed parameters, but if a template parameter in the same template is named, it would emit the DIE for that parameter. It doesn't give us any information though that the DIEs are not describing all parameters from what I can see.

For example:

0x0000002e:   DW_TAG_structure_type
                DW_AT_name  ("X<F<1>, 2>")
                DW_AT_byte_size (0x01)
                DW_AT_decl_file ("/home/teemperor/test/temp.cpp")
                DW_AT_decl_line (5)
                DW_AT_decl_column   (0x08)
                DW_AT_sibling   (0x00000044)

0x0000003b:     DW_TAG_template_value_parameter
                  DW_AT_name    ("i")
                  DW_AT_type    (0x00000074 "int")
                  DW_AT_const_value (0x02)

Second parameter is named, first one isn't. So we would need to parse the name, calculate the number of parameters and then match it with the template argument number inferred from the DIEs (e.g., template parameter DIEs + parameter pack values).

Quuxplusone commented 3 years ago

Presumably LLDB does /some/ of this work to match template declarations (where clang omits all the DW_TAG_template_parameters) with definitions (where it includes them all), though? Or does it match only by exact DW_AT_name in that case? Exact DW_AT_name matching is pretty narrow & won't be compatible with GCC even in cases where all parameters are described.

As for guessing: At least for the template type parameters (the common case) - I'd expect the solution there isn't to guess, but to do name lookup on the string and find the type definition in the DWARF that way.

For non-type template parameters, some cases are unambiguous (true/false -> bool seems reasonable), some are definitely ambiguous (the worst case is "auto" C++17 parameters where you get different template instantiations for x<1> and x<1u> (without auto, at least x<1> and x<1u> are the same type - so guessing which integer type is probably actually OK-ish at least for GCC<>GCC - until you get cases where it's named sometimes and not named other times, as you say)

Given some folks still use GCC a lot, there's probably some folks that care about LLDB support for GCC's debug info - so some solution would be good to find. Probably wouldn't hurt to talk to the GCC folks about what, if anything, they're willing to add here. At least adding type suffixes to literal template parameters /might/ be enough. (2u, 2ll, etc - that's what Clang does)

But maybe we come around to "it's worth it to put DW_TAG_template_parameters on all template parameters, even unnamed ones, and even on declarations" (to widen the scope a bit beyond this bug) to make it easier to structurally match template names across CUs/compilers.

Quuxplusone commented 3 years ago

Oh, if we were able to treat the original thing as being called "S<>" - what if we just treated it as being called "S" without any template parameters? Would that be at least a workaround/somewhat implement GCC<>GCC compatibility? I guess it breaks down when /some/ of the parameters are named and some aren't, though...

Quuxplusone commented 3 years ago

_Bug 52393 has been marked as a duplicate of this bug._