Closed al13n321 closed 4 months ago
Can lldb and/or gdb correctly display the parameter?
I think we need to understand exactly what is going on before doing a workaround like this. I'll spend some time on it later.
lldb: (DB::ActionsDAG::NodeRawConstPtrs) nodes = <extracting data from value failed>
gdb:
nodes = {
  __begin_ = 0x7ffe90cc2088,
  __end_ = 0x0,
  __end_cap_ = {
    <std::__1::__compressed_pair_elem<DB::ActionsDAG::Node const**, 0, false>> = {
      __value_ = 0x0
    },
    <std::__1::__compressed_pair_elem<std::__1::allocator<DB::ActionsDAG::Node const*>, 1, true>> = {
      <std::__1::allocator<DB::ActionsDAG::Node const*>> = {
        <std::__1::__non_trivial_if<true, std::__1::allocator<DB::ActionsDAG::Node const*> >> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}
}
I.e. they both got the 8-byte value, lldb refused to extend it to 24 bytes, and gdb zero-padded it to 24 bytes. With this workaround I'm getting the correct value (valid std::vector with expected contents).
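To make the two debugger behaviours concrete, here is a minimal Rust sketch. It assumes a libc++-style std::vector laid out as three little-endian 8-byte pointers (begin, end, end-of-capacity). The helper names `zero_pad_to`, `require_exact_size` and `decode_vector_fields` are hypothetical and only model what gdb and lldb appear to do here; they are not taken from either debugger.

```rust
/// Zero-extend `value` to `size` bytes, which is what gdb appears to do here.
/// Returns None if the value is already larger than `size`.
fn zero_pad_to(value: &[u8], size: usize) -> Option<Vec<u8>> {
    if value.len() > size {
        return None;
    }
    let mut out = value.to_vec();
    out.resize(size, 0);
    Some(out)
}

/// Strict alternative, which is what lldb appears to do: refuse any size mismatch.
fn require_exact_size(value: &[u8], size: usize) -> Result<&[u8], String> {
    if value.len() == size {
        Ok(value)
    } else {
        Err(format!("have {} bytes, need {}", value.len(), size))
    }
}

/// Interpret 24 bytes as (begin, end, end_cap) little-endian pointers.
/// Assumption: this mirrors the three pointer fields shown in the gdb output.
fn decode_vector_fields(bytes: &[u8; 24]) -> (u64, u64, u64) {
    let field = |i: usize| {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&bytes[i * 8..i * 8 + 8]);
        u64::from_le_bytes(buf)
    };
    (field(0), field(1), field(2))
}
```

Padding the 8-byte value 0x7ffe90cc2088 to 24 bytes and decoding it gives a begin pointer followed by two null pointers, which matches the gdb output above; the strict check rejects the same input.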
Thanks. I think that confirms it is an LLVM bug.
I found this in the LLVM tests:
# This becomes a problem when values move onto the stack and we emit
# DW_OP_deref: there is no information about how large a value the consumer
# should load from the stack. The convention today appears to be the size of
# the variable, ...
which is at odds with what the DWARF V5 spec says in 2.5.1.3:
The DW_OP_deref operation pops the top stack entry and treats it as an
address. The popped value must have an integral type. The value retrieved
from that address is pushed, and has the generic type. The size of the data
retrieved from the dereferenced address is the size of an address on the target
machine.
https://github.com/llvm/llvm-project/issues/64093 is a similar problem where the deref value is large, but for `DW_OP_deref_size`.
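The spec wording quoted above can be sketched as a toy evaluator step. This is a simplified model, not gimli's or LLVM's implementation: `Memory`, `op_deref` and `op_deref_size` are hypothetical names, the target is assumed to be little-endian with 8-byte addresses, and the DWARF stack is modelled as a plain `Vec<u64>`.

```rust
/// Hypothetical stand-in for target memory: `bytes` starts at address `base`.
struct Memory {
    base: u64,
    bytes: Vec<u8>,
}

/// Assumption: x86-64 target, so the generic type is 8 bytes wide.
const ADDRESS_SIZE: usize = 8;

impl Memory {
    /// Read `size` bytes (little-endian) at `addr`, zero-extended to u64.
    /// Returns None if `size` exceeds the generic type or the read is out of range.
    fn read(&self, addr: u64, size: usize) -> Option<u64> {
        if size > ADDRESS_SIZE {
            return None;
        }
        let start = addr.checked_sub(self.base)? as usize;
        let end = start.checked_add(size)?;
        let src = self.bytes.get(start..end)?;
        let mut buf = [0u8; 8];
        buf[..size].copy_from_slice(src);
        Some(u64::from_le_bytes(buf))
    }
}

/// DW_OP_deref: pop an address, push the address-sized value stored there.
fn op_deref(stack: &mut Vec<u64>, mem: &Memory) -> Option<()> {
    let addr = stack.pop()?;
    stack.push(mem.read(addr, ADDRESS_SIZE)?);
    Some(())
}

/// DW_OP_deref_size: same as above, but read only `size` bytes.
fn op_deref_size(stack: &mut Vec<u64>, mem: &Memory, size: u8) -> Option<()> {
    let addr = stack.pop()?;
    stack.push(mem.read(addr, size as usize)?);
    Some(())
}
```

Under this reading, a dereference can never produce more than 8 bytes, so a 24-byte struct cannot be the result of `DW_OP_deref` followed by `DW_OP_stack_value`.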
It seems like LLVM is using `DW_OP_stack_value` for variables that have simply been spilled to the stack (that's the program stack, not the DWARF stack), which makes no sense to me: if it is spilled to the stack then that's its new location, no need to treat it as an implicit location.
Good find. Sounds like this doesn't belong in gimli then. Or maybe it could be something on the side, e.g. `gimli::quirks::preprocess_expression(Expression) -> Expression`.
Moved the workaround into my code instead (which already accumulated ~7 similar workarounds for debug info quirks, not sure even why I tried adding this particular one to gimli instead, sorry for taking your time; but I appreciate the confirmation that the problem is in LLVM!).
I've seen this situation a few times (clang-18, x64, Linux): a variable's type is a 24-byte struct (std::vector), but its location expression ends with `DW_OP_deref, DW_OP_stack_value`. I.e. the expression tells us that the value of a 24-byte struct is the result of reading 8 bytes (where 8 is the address size) from memory; that's indeed what gimli does. I couldn't figure out what LLVM intended when emitting such an expression.
This PR adds a workaround for this: if the expression ends with `[DW_OP_deref, DW_OP_stack_value]`, pretend those two instructions are not there. I.e. assume the whole value is available at the address that would have been dereferenced.
(If this workaround doesn't belong in gimli, that's ok, I can just do the same in my code instead. Merge this only if it seems useful for other users.)
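As a rough illustration of the workaround described above, here is a minimal Rust sketch over a simplified operation list. The `Op` enum and `strip_trailing_deref_stack_value` are hypothetical stand-ins written for this example; they are not gimli's types or the code in this PR.

```rust
/// Simplified stand-in for decoded DWARF operations (not gimli's real types).
#[derive(Debug, Clone, PartialEq)]
enum Op {
    /// DW_OP_bregN: push the register value plus a signed offset.
    RegisterOffset { register: u16, offset: i64 },
    /// DW_OP_deref (size = None) or DW_OP_deref_size (size = Some(n)).
    Deref { size: Option<u8> },
    /// DW_OP_stack_value.
    StackValue,
}

/// If the expression ends with [DW_OP_deref, DW_OP_stack_value], drop both,
/// so the remaining expression yields the address of the whole value.
/// Any other expression is returned unchanged.
fn strip_trailing_deref_stack_value(ops: &[Op]) -> &[Op] {
    match ops {
        [rest @ .., Op::Deref { size: None }, Op::StackValue] => rest,
        _ => ops,
    }
}
```

A caller would run this on the decoded expression before evaluating it. A more conservative variant could apply the rewrite only when the variable's type is larger than the address size; that condition is an assumption of this note, not something the PR description states.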
Appendix: example of clang output with this problem.
Here's debug info about a variable (produced by clang-18, x64, Linux):
So the type is a 24-byte struct.
The first location is `DW_OP_breg4 RSI+0`, which makes sense as this variable is the first argument of the function. `DW_OP_breg4` pushes the rsi value onto the dwarf stack, then, by convention, the final value at the top of the stack is the address of the variable, i.e. `&nodes`.
The second location starts at pc 0x0000000011db5c1c. The instruction just before that is:
So, the address of the struct is written to the stack at `[rbp-0F8h]` (0F8h = 248), and then the location in dwarf changes. Makes sense.
But the new location `DW_OP_breg6 RBP-248, DW_OP_deref_size 0x8, DW_OP_deref, DW_OP_stack_value` seems to say:
`DW_OP_breg6 RBP-248` - push `RBP-248` (aka `rbp-0f8h`) to the dwarf stack. We know that `[rbp-0F8h]` is the address of the struct, so `rbp-0F8h` is the address of the address, `&&nodes`. Makes sense.
`DW_OP_deref_size 0x8` - dereference it, placing `[rbp-0f8h]` at the top of the dwarf stack. That's `&nodes`. Makes sense.