
'<Unable to determine byte size.>' in LLDB for `_BitInt`s larger than 128. #110273


samw-improbable commented 1 month ago

I compile the code below with `clang main.cc -o main --std=c++23 -stdlib=libc++ -lc++ -g -O0`:

#include <iostream>

int main(int argc, char **argv) {
  _BitInt(64) i64 = 42;   // Works
  _BitInt(128) i128 = 42; // Works
  _BitInt(129) i129 = 42; // LLDB says '<Unable to determine byte size.>'
  _BitInt(256) i256 = 42; // LLDB says '<Unable to determine byte size.>'

  std::cout << "Hello, world!" << std::endl;
}

When I inspect these variables with LLDB, I get `<Unable to determine byte size.>` for the `_BitInt`s larger than 128 bits.

(lldb) target create "main"
Current executable set to '/home/sam/dev/lldb-bug/main' (x86_64).
(lldb) b main
Breakpoint 1: where = main`main + 15 at main.cc:4:15, address = 0x000000000000123f
(lldb) r
Process 1784543 launched: '/home/sam/dev/lldb-bug/main' (x86_64)
Process 1784543 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000055555555523f main`main(argc=1, argv=0x00007fffffffddd8) at main.cc:4:15
   1    #include <iostream>
   2   
   3    int main(int argc, char **argv) {
-> 4      _BitInt(64) i64 = 42;   // Works
   5      _BitInt(128) i128 = 42; // Works
   6      _BitInt(129) i129 = 42; // LLDB says '<Unable to determine byte size.>'
   7      _BitInt(256) i256 = 42; // LLDB says '<Unable to determine byte size.>'
(lldb) frame variable
(int) argc = 1
(char **) argv = 0x00007fffffffddd8
(long) i64 = 140737353812080
(__int128) i128 = 115647419195831250335205442705798252522
(void) i129 = <Unable to determine byte size.>
(void) i256 = <Unable to determine byte size.>

I'm working in a codebase that makes wide use of `_BitInt(256)`, so fixing this would be super helpful! Even a plugin or some other temporary workaround for LLDB would be much appreciated.

I've reproduced this bug in versions 18.1.8 and 19.1.0 of both clang and lldb.

Michael137 commented 1 month ago

Currently `_BitInt(N)` is represented in DWARF as a `DW_TAG_base_type` entry with a `DW_AT_byte_size` that is large enough to hold `N` bits, but need not be exactly `N/8` bytes (e.g., for 129 it's rounded up to 0x18, i.e., 24 bytes). E.g., for `_BitInt(129)` and `_BitInt(256)` we generate:

 DW_TAG_base_type                              
   DW_AT_name      ("_BitInt")                 
   DW_AT_encoding  (DW_ATE_signed)             
   DW_AT_byte_size (0x18)                      

 DW_TAG_base_type                              
   DW_AT_name      ("_BitInt")                 
   DW_AT_encoding  (DW_ATE_signed)             
   DW_AT_byte_size (0x20)                      

When we reconstruct types from DWARF in LLDB, we try to figure out which builtin type the entry corresponds to based on its size: https://github.com/llvm/llvm-project/blob/023f6910cfcbe7b3a9c405f59b3e62728617eeaf/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp#L966-L1011

E.g., for `_BitInt(128)` we manage to match it against Clang's `Int128Ty`. For anything larger we don't find a corresponding builtin type and pretend the variable is of type `void`, which is why we fail to get its byte size later on.
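
For illustration, the matching is roughly of this shape (a simplified sketch with assumed helper names, not the actual TypeSystemClang source; see the link above for the real logic):

```c++
// Simplified sketch of the size-based matching for DW_ATE_signed in
// TypeSystemClang::GetBuiltinTypeForDWARFEncodingAndBitSize.
// Helper names here are assumptions for illustration only.
CompilerType MatchSignedBuiltin(clang::ASTContext &ast, size_t bit_size) {
  if (bit_size == ast.getTypeSize(ast.IntTy))
    return GetType(ast.IntTy);
  if (bit_size == ast.getTypeSize(ast.LongLongTy))
    return GetType(ast.LongLongTy);
  if (bit_size == ast.getTypeSize(ast.Int128Ty))
    return GetType(ast.Int128Ty); // _BitInt(128) matches here
  // _BitInt(129) and wider (192, 256, ... bits in DWARF) match nothing,
  // so the variable ends up typed as void.
  return CompilerType();
}
```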

We could add another case for `_BitInt` and create a `CompilerType` whose underlying type is `clang::BitIntType`. We would still be lying, because the sizes we get from DWARF aren't necessarily the ones spelled out in source. But that's the best we could do with the information available.

There is a DWARFv6 proposal for a new encoding for bit-precise integer types (https://dwarfstd.org/issues/230529.1.html), which would make it more convenient for LLDB to determine that we're dealing with such types. But this still requires Clang to generate a `DW_TAG_base_type` with a `DW_AT_bit_size`, which we currently don't do, but probably should.
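
For example, for `_BitInt(129)` Clang could emit something like the following (hypothetical output; whether `DW_AT_byte_size` and `DW_AT_bit_size` should both appear is part of that discussion):

```
DW_TAG_base_type
  DW_AT_name      ("_BitInt")
  DW_AT_encoding  (DW_ATE_signed)
  DW_AT_byte_size (0x18)
  DW_AT_bit_size  (0x81)
```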

TL;DR, we could work around this in LLDB to some extent, but Clang should really be generating these types with `DW_AT_bit_size`. The rounding comes from the call to `ASTContext::getTypeSize` here: https://github.com/llvm/llvm-project/blob/5d08f3256b134e9c5414b4e50563e5de0f1735c6/clang/lib/CodeGen/CGDebugInfo.cpp#L1002
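
Roughly (an illustration of the assumed shape of that code path, not a verbatim quote of CGDebugInfo):

```c++
// ASTContext::getTypeSize returns the ABI storage size in bits, not the
// declared width, so the emitted DW_AT_byte_size ends up padded:
uint64_t Size = CGM.getContext().getTypeSize(BT); // 192 for _BitInt(129)
// 192 bits / 8 = 24 bytes, i.e., the DW_AT_byte_size (0x18) shown above.
```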

Michael137 commented 1 month ago

Related Clang discussion: https://github.com/llvm/llvm-project/issues/61952

samw-improbable commented 1 month ago

@Michael137 Thanks for the prompt response!

I think I understand what you're talking about, so it sounds like the main fix is to:

1. Have clang report the bit size with DW_AT_bit_size
2. Have LLDB handle this new info

But in the meanwhile, LLDB could just use DW_AT_byte_size; this is potentially inaccurate for _BitInt(N) where N isn't a multiple of 8, but that's probably still better than the current case of it just not working.

I'm not sure I understand why the DWARFv6 encoding is needed? Sounds like it's just a more concise way of encoding the same information?

Also, just another thought: one of the annoying things about the `<Unable to determine byte size.>` failure is that, as far as I can tell, I can't inspect the value in any reasonable way; any attempt to print the address or the raw bytes underneath results in a failure to materialize the type. I think it would be worth having a more graceful failure mode for when LLDB can't figure out the type, so basic operations like getting the memory address still work.

Michael137 commented 1 month ago

> I think I understand what you're talking about, so it sounds like the main fix is to:
>
> 1. Have clang report the bit size with DW_AT_bit_size
> 2. Have LLDB handle this new info
>
> But in the meanwhile, LLDB could just use DW_AT_byte_size; this is potentially inaccurate for _BitInt(N) where N isn't a multiple of 8, but that's probably still better than the current case of it just not working.

Yup that's exactly it. Though emitting DW_AT_bit_size here might require more discussion. E.g., see the thread in https://github.com/llvm/llvm-project/pull/69741.

So I think the best immediate next step would be to teach LLDB to create `BitIntType`s by checking whether the typename is `_BitInt` in `TypeSystemClang::GetBuiltinTypeForDWARFEncodingAndBitSize`, and to use the bit size we got from DWARF as the `BitInt` size (despite it being incorrect when `N` isn't a multiple of 8).
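
A minimal sketch of that idea (hypothetical code; the variable names and exact placement are assumptions, not an actual patch):

```c++
// Hypothetical extra case in
// TypeSystemClang::GetBuiltinTypeForDWARFEncodingAndBitSize:
if (type_name == "_BitInt" || type_name == "unsigned _BitInt") {
  bool is_unsigned = (dw_ate == llvm::dwarf::DW_ATE_unsigned);
  // bit_size comes from DW_AT_byte_size * 8, so it over-reports the
  // width when N isn't a multiple of 8 (e.g., 192 instead of 129).
  return GetType(ast.getBitIntType(is_unsigned, bit_size));
}
```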

> I'm not sure I understand why the DWARFv6 encoding is needed? Sounds like it's just a more concise way of encoding the same information?

It would just be a more stable/convenient way of determining that we're dealing with a `BitInt`; without it, we have to parse the `DW_AT_name`, etc.

But yea, it wouldn't help with your issue necessarily.

> Also, just another thought: one of the annoying things about the `<Unable to determine byte size.>` failure is that, as far as I can tell, I can't inspect the value in any reasonable way; any attempt to print the address or the raw bytes underneath results in a failure to materialize the type. I think it would be worth having a more graceful failure mode for when LLDB can't figure out the type, so basic operations like getting the memory address still work.

Hmm, would `mem read &i256` work here as a workaround? But maybe we could do better?
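
If the expression parser chokes on `&i256`, reading relative to the frame base might still work. An untested sketch, assuming you can find the variable's frame offset in the DWARF (the `-0x60` offset below is made up for illustration):

```
# Find the variable's location, e.g. DW_OP_fbreg -0x60:
(lldb) image lookup -va $pc
# Then read the raw bytes directly off the frame base:
(lldb) memory read --size 8 --format x --count 4 $rbp-0x60
```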

samw-improbable commented 1 month ago

Okay cool that all makes sense. If I ever get some time maybe I'll take a look at opening a PR myself!

Also, regarding `mem read &i256`, I just get this:

(lldb) mem read &i256
error: invalid start address expression.
error: address expression "&i256" evaluation failed

Which is a different error from the one I get when simply printing it:

(lldb) p &i256
error: Couldn't materialize: couldn't get the value of variable i256: Unable to determine byte size.
error: errored out in DoExecute, couldn't PrepareToExecuteJITExpression

But maybe it's the same error under the hood? Not sure.