samw-improbable opened this issue 1 month ago
Author: Sam W (samw-improbable)
Currently _BitInt(N) is represented in DWARF as a DW_TAG_base_type entry with a DW_AT_byte_size that is large enough to hold N bits, but which need not correspond exactly to N; e.g., for _BitInt(129) it is rounded up to 0x18 (24) bytes. E.g., for _BitInt(129) and _BitInt(256) we generated:
```
DW_TAG_base_type
  DW_AT_name       ("_BitInt")
  DW_AT_encoding   (DW_ATE_signed)
  DW_AT_byte_size  (0x18)

DW_TAG_base_type
  DW_AT_name       ("_BitInt")
  DW_AT_encoding   (DW_ATE_signed)
  DW_AT_byte_size  (0x20)
```
When we reconstruct types from DWARF in LLDB, we try to figure out which builtin-type this corresponds to based on the size: https://github.com/llvm/llvm-project/blob/023f6910cfcbe7b3a9c405f59b3e62728617eeaf/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp#L966-L1011
E.g., for _BitInt(128) we manage to match it against Clang's Int128Ty. For anything larger we don't find a corresponding builtin type and pretend the variable is of type void, which is why we fail to get its byte size later on.
We could add another case for _BitInt and create a CompilerType whose underlying type is a clang::BitIntType. We would still be lying, because the sizes we get from DWARF aren't necessarily the ones spelled out in the source, but that's the best we could do with the information available.
There is a DWARFv6 proposal for a new encoding for bit-precise integer types (https://dwarfstd.org/issues/230529.1.html), which would make it more convenient for LLDB to determine that we're dealing with such types. But this still requires Clang to generate a DW_TAG_base_type with a DW_AT_bit_size, which we currently don't do, but probably should?
TL;DR, we could work around this in LLDB to some extent, but Clang should really be generating these types with DW_AT_bit_size. The rounding comes from the call to ASTContext::getTypeSize here: https://github.com/llvm/llvm-project/blob/5d08f3256b134e9c5414b4e50563e5de0f1735c6/clang/lib/CodeGen/CGDebugInfo.cpp#L1002
Related Clang discussion: https://github.com/llvm/llvm-project/issues/61952
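For illustration (not from the original thread), the rounding is also visible from the language side; the numbers below assume an x86-64 target and match the 0x18/0x20 byte sizes in the DWARF dump above:

```cpp
// Minimal sketch: sizeof reports the padded storage size Clang uses for
// _BitInt, and that storage size is what currently ends up in DW_AT_byte_size.
#include <cstdio>

int main() {
  std::printf("sizeof(_BitInt(129)) = %zu\n", sizeof(_BitInt(129)));  // 24 bytes (0x18), not 17
  std::printf("sizeof(_BitInt(256)) = %zu\n", sizeof(_BitInt(256)));  // 32 bytes (0x20)
  return 0;
}
```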
@Michael137 Thanks for the prompt response!
I think I understand what you're talking about, so it sounds like the main fix is to:

1. Have clang report the bit size with DW_AT_bit_size
2. Have LLDB handle this new info

But in the meanwhile, LLDB could just use DW_AT_byte_size, although this is potentially inaccurate for _BitInt(N) where N isn't a multiple of 8 (but that's probably still better than the current case of it just not working).
I'm not sure I understand why the DWARFv6 encoding is needed? Sounds like it's just a more concise way of encoding the same information?
Also, just another thought: one of the annoying things about the <Unable to determine byte size.> case is that, as far as I can tell, I can't actually inspect the value in any reasonable way; any attempt to print the address or the raw bytes underneath results in a failure to materialize the type. I think it would be worth generally having a more graceful failure mode for when LLDB can't figure out the type, so you can still allow basic operations like getting the memory address?
> I think I understand what you're talking about, so it sounds like the main fix is to:
>
> 1. Have clang report the bit size with DW_AT_bit_size
> 2. Have LLDB handle this new info
>
> But in the meanwhile, LLDB could just use DW_AT_byte_size, although this is potentially inaccurate for _BitInt(N) where N isn't a multiple of 8 (but that's probably still better than the current case of it just not working).
Yup, that's exactly it. Though emitting DW_AT_bit_size here might require more discussion; e.g., see the thread in https://github.com/llvm/llvm-project/pull/69741.

So I think the best immediate next step would be to teach LLDB to create BitIntTypes by checking whether the type name is _BitInt in TypeSystemClang::GetBuiltinTypeForDWARFEncodingAndBitSize, and to use the bit_size we got from DWARF as the BitInt size (despite it being incorrect for non-multiples of 8).
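A rough sketch of what that extra case could look like (not a reviewed patch; the parameter names, the "unsigned _BitInt" spelling, and the exact surrounding signature are assumptions):

```cpp
// Hypothetical extra case inside
// TypeSystemClang::GetBuiltinTypeForDWARFEncodingAndBitSize(type_name, dw_ate, bit_size),
// assuming Clang spells the DW_AT_name as "_BitInt" / "unsigned _BitInt".
if (type_name == "_BitInt" || type_name == "unsigned _BitInt") {
  const bool is_unsigned = (dw_ate == llvm::dwarf::DW_ATE_unsigned);
  // bit_size is derived from the (rounded) DW_AT_byte_size, so for
  // _BitInt(129) this claims 192 bits -- wrong, but better than void.
  return GetType(getASTContext().getBitIntType(is_unsigned, bit_size));
}
```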
> I'm not sure I understand why the DWARFv6 encoding is needed? Sounds like it's just a more concise way of encoding the same information?

It would just be a more stable/convenient way of determining that we're dealing with a BitInt; without it we have to match on the DW_AT_name, etc. But yea, it wouldn't necessarily help with your issue.
> Also, just another thought: one of the annoying things about the <Unable to determine byte size.> case is that, as far as I can tell, I can't actually inspect the value in any reasonable way; any attempt to print the address or the raw bytes underneath results in a failure to materialize the type. I think it would be worth generally having a more graceful failure mode for when LLDB can't figure out the type, so you can still allow basic operations like getting the memory address?
Hmm would mem read &i256
work here as a workaround? But maybe we could do better?
Okay cool that all makes sense. If I ever get some time maybe I'll take a look at opening a PR myself!
Also, regarding mem read &i256, I just get this:

```
(lldb) mem read &i256
error: invalid start address expression.
error: address expression "&i256" evaluation failed
```

Which is a different error than simply printing it:

```
(lldb) p &i256
error: Couldn't materialize: couldn't get the value of variable i256: Unable to determine byte size.
error: errored out in DoExecute, couldn't PrepareToExecuteJITExpression
```
But maybe it's the same error under the hood? Not sure.
When I compile the code below with clang main.cc -o main --std=c++23 -stdlib=libc++ -lc++ -g -O0 and then inspect these variables with LLDB, I get <Unable to determine byte size.> for the _BitInts with a size larger than 128. I'm working in a codebase that makes wide use of _BitInt(256), so fixing this would be super helpful! Even a plugin or something I could use with LLDB as a temporary fix would be much appreciated. I've reproduced this bug in versions 18.1.8 and 19.1.0 of both clang and lldb.
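The snippet referenced above isn't included in this thread; a minimal reproducer along these lines (hypothetical, with made-up variable names apart from i256) should exercise the same path:

```cpp
// Hypothetical reproducer: any local _BitInt wider than 128 bits shows up as
// "<Unable to determine byte size.>" when inspected in LLDB, while
// _BitInt(128) still works because it maps onto Clang's Int128Ty.
int main() {
  _BitInt(128) i128 = 1;  // displays fine
  _BitInt(129) i129 = 2;  // <Unable to determine byte size.>
  _BitInt(256) i256 = 3;  // <Unable to determine byte size.>
  return (int)i128 + (int)i129 + (int)i256;  // keep the variables live
}
```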