llvmbot commented 6 years ago


Bugzilla Link	39716
Version	trunk
OS	All
Blocks	llvm/llvm-project#38116
Reporter	LLVM Bugzilla Contributor
CC	@asl,@adrian-prantl,@dstenb,@dwblaikie,@djtodoro,@echristo,@fredriss,@JDevlieghere,@jmorse,@walkerkd,@pogo59,@rnk,@tbosch,@vedantk

Extended Description

thread #1, queue = 'com.apple.main-thread', stop reason = step over frame #0: 0x0000000100000f84 a.out`fn2(a=, b=, c=) at optout.c:12:10 [opt] 9 fn2 (long int a, long int b, long int c) 10 { 11 long int q = 2 * a; -> 12 return fn1 (5, 6, 7); 13 } 14
15 int main(void) { (lldb) frame var (long) a = <no location, value may have been optimized out>

(long) b = <no location, value may have been optimized out>

(long) c = <no location, value may have been optimized out>

(long) q = <no location, value may have been optimized out>

A gcc-built binary debugged with gdb yields:

Breakpoint 1, main () at opt.c:16 16 return fn2(14, 23, 34); (gdb) s fn2 (c=34, b=23, a=14) at opt.c:12 12 return fn1 (5, 6, 7); (gdb) info args c = 34 b = 23 a = 14

Testcase:

attribute((noinline)) int fn1 (long int x, long int y, long int z) { int l = x 2; int q = y z; return l + q; }

attribute((noinline)) long int fn2 (long int a, long int b, long int c) { long int q = 2 * a; return fn1 (5, 6, 7); }

int main(void) { return fn2(14, 23, 34); }

tbosch commented 3 years ago

mentioned in issue llvm/llvm-bugzilla-archive#45406

djtodoro commented 4 years ago

The call site info (and entry values as well, since it depends on it) should be avoided for inlined calls. We should support only normal, tail and indirect calls (according to DWARF standard).

dwblaikie commented 4 years ago

Ah, thanks.

With Djordje's patch the DIEs looks like this:

0x00000036: DW_TAG_subprogram DW_AT_low_pc (0x0000000000401110) DW_AT_high_pc (0x000000000040111d) DW_AT_abstract_origin (0x00000053 "_Z2f1i")

0x00000042: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value) DW_AT_abstract_origin (0x0000005d "i")

0x0000004c: DW_TAG_variable DW_AT_location (indexed (0x0) loclist = 0x00000014: [0x0000000000401116, 0x000000000040111c): DW_OP_reg0 RAX) DW_AT_abstract_origin (0x00000065 "y")

0x00000052: NULL

0x00000053: DW_TAG_subprogram DW_AT_linkage_name ("_Z2f1i") DW_AT_name ("f1") DW_AT_inline (DW_INL_inlined)

0x0000005d: DW_TAG_formal_parameter DW_AT_name ("i")

0x00000065: DW_TAG_variable DW_AT_name ("y")

0x0000006e: DW_TAG_subprogram DW_AT_low_pc (0x0000000000401120) DW_AT_high_pc (0x000000000040112d) DW_AT_name ("main")

0x0000007d: DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x00000053 "_Z2f1i") DW_AT_low_pc (0x0000000000401120) DW_AT_high_pc (0x0000000000401129)

0x0000008a: DW_TAG_formal_parameter DW_AT_abstract_origin (0x0000005d "i")

So the DW_TAG_formal_parameter seems fine even with inlining, but there is no DW_TAG_call_site_parameter.

Yeah, there shouldn't be a call_site in this case - because the call has been inlined. The DW_AT_location on the the 0x8a DW_TAG_formal_parameter should have a DW_AT_location with the literal 23.

If the Dead Argument Elimination happens before inlining - then the only place the 23 would remain alive would be in is whatever the IR Metadata is for the call site - and would have to be transformed into the inlined parameter's location (assuming the location isn't clobbered) - essentially collapsing the entry_value + call_site_parameter down into a location. (inlining the parameter value, essentially)

I.e. debugging looks like this when breaking on main:

(lldb) r Process 2211681 launched: '/usr/local/google/home/tbosch/llvm-project/build/a.out' (x86_64) Process 2211681 stopped

thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x0000000000401120 a.out`main [inlined] f1(i=) at test2.cc:4:11 1 int x = 12; 2 3 int f1(int i) { -> 4 int y = x; 5 y = y*y; 6 return y+1; 7 }

tbosch commented 4 years ago

Ah, thanks.

With Djordje's patch the DIEs looks like this:

0x00000036: DW_TAG_subprogram DW_AT_low_pc (0x0000000000401110) DW_AT_high_pc (0x000000000040111d) DW_AT_abstract_origin (0x00000053 "_Z2f1i")

0x00000042: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value) DW_AT_abstract_origin (0x0000005d "i")

0x0000004c: DW_TAG_variable DW_AT_location (indexed (0x0) loclist = 0x00000014: [0x0000000000401116, 0x000000000040111c): DW_OP_reg0 RAX) DW_AT_abstract_origin (0x00000065 "y")

0x00000052: NULL

0x00000053: DW_TAG_subprogram DW_AT_linkage_name ("_Z2f1i") DW_AT_name ("f1") DW_AT_inline (DW_INL_inlined)

0x0000005d: DW_TAG_formal_parameter DW_AT_name ("i")

0x00000065: DW_TAG_variable DW_AT_name ("y")

0x0000006e: DW_TAG_subprogram DW_AT_low_pc (0x0000000000401120) DW_AT_high_pc (0x000000000040112d) DW_AT_name ("main")

0x0000007d: DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x00000053 "_Z2f1i") DW_AT_low_pc (0x0000000000401120) DW_AT_high_pc (0x0000000000401129)

0x0000008a: DW_TAG_formal_parameter DW_AT_abstract_origin (0x0000005d "i")

So the DW_TAG_formal_parameter seems fine even with inlining, but there is no DW_TAG_call_site_parameter. I.e. debugging looks like this when breaking on main:

(lldb) r Process 2211681 launched: '/usr/local/google/home/tbosch/llvm-project/build/a.out' (x86_64) Process 2211681 stopped

thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x0000000000401120 a.out`main [inlined] f1(i=) at test2.cc:4:11 1 int x = 12; 2 3 int f1(int i) { -> 4 int y = x; 5 y = y*y; 6 return y+1; 7 }

dwblaikie commented 4 years ago

Djordje, is it possible that your patch that fills the DW_AT_location of unused DW_TAG_formal_parameter doesn't work for inlined functions?

Example (f1 contains multiple statements so we can step at least 2x inside of it):

4> int x = 12;

int f1(int i) { int y = x; y = y*y; return y+1; }

int main(void) { return f1(23)+1; }

DWARF: 0x00000053: DW_TAG_subprogram DW_AT_name ("f1") DW_AT_inline (DW_INL_inlined)

0x0000005d: DW_TAG_formal_parameter DW_AT_name ("i")

inlined variables are quite different to real parameters - in this case the inlined_subroutine's formal_parameter should have a location with the constant value 23 (. No need for call_site_value/entry_value stuff.

For the non-inlined copy of 'f1'. you might not be looking at the right DIEs

You've quoted the abstract subprogram (this contains the common attributes used across all inline and non-inline copies of the function (this distinction only exists for functions that have at least one inlining)) - it has no locations for the function (high/low/ranges) or its variables.

If you look for the subprogram that refers to the one you've quoted (0x53) as its abstract_origin, you'll find something like this:

0x00000046: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000000) DW_AT_high_pc (0x000000000000000d) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_GNU_all_call_sites (true) DW_AT_abstract_origin (0x00000068 "_Z2f1i")

0x00000059: DW_TAG_formal_parameter DW_AT_abstract_origin (0x00000078 "i")

0x0000005e: DW_TAG_variable DW_AT_location (0x00000000: [0x0000000000000006, 0x000000000000000c): DW_OP_reg0 RAX) DW_AT_abstract_origin (0x00000083 "y")

(this is without Djordje's patch applied - perhaps with that patch applied the formal_parameter DIE here does have a location using entry_value?)

tbosch commented 4 years ago

Djordje, is it possible that your patch that fills the DW_AT_location of unused DW_TAG_formal_parameter doesn't work for inlined functions?

Example (f1 contains multiple statements so we can step at least 2x inside of it):

int x = 12;

int f1(int i) { int y = x; y = y*y; return y+1; }

int main(void) { return f1(23)+1; }

DWARF: 0x00000053: DW_TAG_subprogram DW_AT_name ("f1") DW_AT_inline (DW_INL_inlined)

0x0000005d: DW_TAG_formal_parameter DW_AT_name ("i")

djtodoro commented 4 years ago

Creating the DW_TAG_call_site_parameter tags seems to have more than the bug above (that the DW_AT_call_value is incorrect):

If the last function of a statement has unused parameters, then the corresponding DW_TAG_call_site only contains DW_TAG_call_site_parameter for the non optimized out parameters. I.e. the number of DW_TAG_call_site_parameter does not match the number of DW_TAG_formal_parameter in this case.

I don't think its a bug. There are cases where we do not express all the call-site params values and it is OK. We ended up describing target dependent instructions that load param value, very late in the pipeline, in terms of DWARF and it ends up in call_val (or call_site_param_val) DWARF attribute. There is support for some instructions, but there is place for improvements. If we don't handle some instructions, we won't generate corresponding call_val. Here is an experimental project that helps in finding these opportunities for improvements: https://github.com/djolertrk/call-site-params-stats.

== About the case from this thread

The conclusion is that the call value is wrong in this case. The call value (within main()) depends on entry_val operation, but the register value (from the entry_val expr) is not alive at the entry point of the fn2(). Maybe we need something like this: https://reviews.llvm.org/D85111 (quick-made-patch; I need to think a bit about it...)

Anyhow, why didn't we generate call_values with corresponding constants? The main() function before Dead Arg Eliminitaion looks as:

define dso_local i32 @main() local_unnamed_addr #1 !dbg !36 { entry: %call = call i64 @fn2(i64 14, i64 23, i64 34), !dbg !39 %conv = trunc i64 %call to i32, !dbg !39 ret i32 %conv, !dbg !40 }

After the pass: define dso_local i32 @main() local_unnamed_addr #1 !dbg !36 { entry: %call = call i64 @fn2(i64 undef, i64 undef, i64 undef), !dbg !39 %conv = trunc i64 %call to i32, !dbg !39 ret i32 %conv, !dbg !40 }

so, that is the last spot the call_values are available. Furthermore, at the end, we ended up having machine function as:

Machine code for function main: NoPHIs, TracksLiveness, NoVRegs, TiedOpsRewritten

bb.0.entry: TAILJMPd64 @fn2, <regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...>, implicit $rsp, implicit $ssp, implicit $rsp, implicit $ssp, implicit undef $rdi, implicit undef $rsi, implicit undef $rdx, debug-location !38; test.c:16:10

End machine code for function main.

and there is no instruction that can help us with describing the call value in terms of DWARF, at the end of pipeline (so with the current approach, we can't describe it).

tbosch commented 4 years ago

Creating the DW_TAG_call_site_parameter tags seems to have more than the bug above (that the DW_AT_call_value is incorrect):

If the last function of a statement has unused parameters, then the corresponding DW_TAG_call_site only contains DW_TAG_call_site_parameter for the non optimized out parameters. I.e. the number of DW_TAG_call_site_parameter does not match the number of DW_TAG_formal_parameter in this case.

Example:

attribute((noinline)) int fn2(int a, int b, int c) { return a*a; }

int main(void) { int x = fn2(1,2,3) fn2(4,5,6); return xx; }

Debug info:

0x00000051: DW_TAG_subprogram DW_AT_name ("main")

0x00000069: DW_TAG_call_site DW_AT_call_origin (0x00000027) DW_AT_call_return_pc (0x000000000040112b)

0x0000006f: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_call_value (DW_OP_lit1)

0x00000074: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_call_value (DW_OP_entry_value(DW_OP_reg4 RSI))

0x0000007b: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg1 RDX) DW_AT_call_value (DW_OP_entry_value(DW_OP_reg1 RDX))

0x00000083: DW_TAG_call_site DW_AT_call_origin (0x00000027) DW_AT_call_return_pc (0x0000000000401137)

0x00000089: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_call_value (DW_OP_lit4)

tbosch commented 4 years ago

Sorry, this part of the previous comment should be ignored:

In the corresponding DW_TAG_call_site in main: 0x000000d3: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_call_value (DW_OP_lit23)

tbosch commented 4 years ago

I looked at this a bit more. Using Djordje's patch that gets the DW_TAG_formal_parameter right, and a local hack to clang that always uses DW_AT_call_value (DW_OP_lit23) for all DW_TAG_call_site_parameter I was able to verify that once we fix the debug info, lldb (-O3 -gdwarf-5) and gdb (-O3 -g) will do the right thing and show the parameters as no longer "optimized out" but with the correct value from debug info.

Debug info:

0x00000072: DW_TAG_subprogram DW_AT_name ("fn2")

0x00000082: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value) DW_AT_name ("a")

0x0000008f: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value) DW_AT_name ("b")

0x0000009c: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg1 RDX), DW_OP_stack_value) DW_AT_name ("c")

0x000000aa: DW_TAG_subprogram DW_AT_name ("main")

0x000000c2: DW_TAG_call_site DW_AT_call_origin (0x00000072) DW_AT_call_return_pc (0x0000000000401136)

0x000000c8: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_call_value (DW_OP_lit23)

0x000000cd: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_call_value (DW_OP_lit23)

0x000000d2: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg1 RDX) DW_AT_call_value (DW_OP_lit23)

0x00000095: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value) DW_AT_name ("b")

In the corresponding DW_TAG_call_site in main: 0x000000d3: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_call_value (DW_OP_lit23)

Debugger output when stepping into f1:

(lldb) s Process 571769 stopped

thread #1, name = 'a.out', stop reason = step in frame #0: 0x0000000000401120 a.out`fn2(a=23, b=23, c=23) at test.cc:11:10

(gdb) s fn2 (a=23, b=23, c=23) at test.cc:11

tbosch commented 4 years ago

Djordje, I tried your patch and it does produce the correct DW_AT_location in the DW_TAG_formal_parameter of "fn2".

Do you have an idea how to fix the invalid values in the DW_AT_call_value of the DW_TAG_call_site_parameter tags of "main"?

djtodoro commented 4 years ago

Hi,

gdb uses the caller info (DW_TAG_call_site/call_site_param) to print @entry (special value that represents the value of parameter at entry point in the function) info in backtraces. Furthermore, if we generate a location entry with DW_OP_entry_value within callee, it will allow us to use the @entry value as actual value in the callee (typically used for unmodified params, but it can be used for modified as well, if we can express the modification in terms of its entry value). So, the debug entry values feature has two purposes/uses indeed.

lldb does not print @entry in backtraces at all, but it uses it in the situation when DW_OP_entry_value is available (so we need both, caller and callee info at the time) to retrieve the actual value of a parameter when possible.

The [0] is the proposal for how to improve this situation within LLVM. Currently, we generate DW_OP_entry_values inputs within .debug_loc only within LiveDebugValues, but it seems we can use it in DeadArgElim pass as well for unused args.

In this particular test case, I see that we have incorrect info at caller side... (we've jumped into corner case now, the DW_OP_entry_value within call_value attribute of the corresponding dw-tag-param depends on register-value which is optimized out, and we need to fix it; (special case: DW_OP_entry_value can be used at caller side as well; within some cases when trying to express param value at the call site)).

[0] https://reviews.llvm.org/D85012

dwblaikie commented 4 years ago

This produces the following debug info:

0x0000006a: DW_TAG_subprogram DW_AT_name ("fn2")

0x0000007a: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value) DW_AT_name ("a")

0x00000087: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value) DW_AT_name ("b")

0x00000094: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg1 RDX), DW_OP_stack_value) DW_AT_name ("c")

0x000000c0: DW_TAG_subprogram DW_AT_name ("main")

0x000000cf: DW_TAG_call_site DW_AT_call_origin (0x0000006a) DW_AT_call_tail_call (true) DW_AT_call_pc (0x0000000000401140)

0x000000d5: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_call_value (DW_OP_constu 0x22)

0x000000db: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_call_value (DW_OP_lit23)

0x000000e0: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg1 RDX) DW_AT_call_value (DW_OP_lit14)

However, when debugging with lldb (from head), it still shows that these parameters are optimized out:

(lldb) s Process 58604 stopped

thread #1, name = 'a.out', stop reason = step in frame #0: 0x0000000000401120 a.out`fn2(a=, b=, c=) at call_site.c:11:10

if you happen to figure out why lldb thinks they're unavailable - I'm all ears.

I'll probably look into why gdb can handle other cases but decides this one isn't acceptable... but that might be a bit beyond my gdb understanding.

tbosch commented 4 years ago

This produces the following debug info:

0x0000006a: DW_TAG_subprogram DW_AT_name ("fn2")

0x0000007a: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value) DW_AT_name ("a")

0x00000087: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value) DW_AT_name ("b")

0x00000094: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg1 RDX), DW_OP_stack_value) DW_AT_name ("c")

0x000000c0: DW_TAG_subprogram DW_AT_name ("main")

0x000000cf: DW_TAG_call_site DW_AT_call_origin (0x0000006a) DW_AT_call_tail_call (true) DW_AT_call_pc (0x0000000000401140)

0x000000d5: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_call_value (DW_OP_constu 0x22)

0x000000db: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_call_value (DW_OP_lit23)

0x000000e0: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg1 RDX) DW_AT_call_value (DW_OP_lit14)

However, when debugging with lldb (from head), it still shows that these parameters are optimized out:

(lldb) s Process 58604 stopped

thread #1, name = 'a.out', stop reason = step in frame #0: 0x0000000000401120 a.out`fn2(a=, b=, c=) at call_site.c:11:10

dwblaikie commented 4 years ago

Hmm, yeah, gdb does seem to understand some uses of call_value/entry_value - so I probably messed something up in my example.

dwblaikie commented 4 years ago

hand-crafted DWARF assembly for proposed solution

David, so we should produce something like this?

I believe that would be correct... though it's possible I'm wrong in some way.

What would be the best place in llvm sources to create hack that emits this?

That I don't know - I haven't looked at much variable location code, or the call_site stuff. You could probably hand-write this DWARF by getting LLVM's assembly and tweaking the location descriptions in the call_values

0x000000b0: DW_TAG_subprogram DW_AT_name ("main")

0x000000bf: DW_TAG_call_site

0x000000c5: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg1 RDX) DW_AT_call_value (DW_OP_constu 0x22)

0x000000cb: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_call_value (DW_OP_lit23)

0x000000d0: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_call_value (DW_OP_lit14)

0x00000066: DW_TAG_subprogram DW_AT_name ("fn2")

0x00000076: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg5 RDI) DW_AT_name ("a")

0x00000081: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg4 RSI)) DW_AT_name ("b")

0x0000008c: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg1 RDX)) DW_AT_name ("c")

Yeah - well, I think it needs a DW_OP_stack_value on the locations, but otherwise that's what I've prototyped in the attached patch. GDB still seems to think these are "optimized out" but maybe it doesn't understand the DWARFv5 entry_value forms? I thought it did, though... so I might've broken the DWARF in some way.

dwblaikie commented 4 years ago

Yes, the problem seems to be where lldb stops when stepping into the function. With the tail call stepping into fn2 shows this output for "di", which already populated the stack:

(lldb) di a.out`fn2: 0x401120 <+0>: subq $0x28, %rsp 0x401124 <+4>: movq %rdi, 0x8(%rsp) 0x401129 <+9>: movq %rsi, 0x20(%rsp) 0x40112e <+14>: movq %rdx, 0x18(%rsp) -> 0x401133 <+19>: movq 0x8(%rsp), %rax

Without the tail call, the stack is not yet populated, leading to incorrect values during debugging as the debug info refers to the stack values:

(lldb) di a.out`fn2: -> 0x401120 <+0>: subq $0x28, %rsp 0x401124 <+4>: movq %rdi, 0x8(%rsp) 0x401129 <+9>: movq %rsi, 0x20(%rsp) 0x40112e <+14>: movq %rdx, 0x18(%rsp) 0x401133 <+19>: movq 0x8(%rsp), %rax

yeah, guess gdb has the same issue - be nice to fix at least lldb, but at least that's an orthogonal issue to worry about elsewhere, perhaps.

Working up an example with hand-crafted assembly for the possible fix to the original issue in this bug.

tbosch commented 4 years ago

Yes, the problem seems to be where lldb stops when stepping into the function. With the tail call stepping into fn2 shows this output for "di", which already populated the stack:

(lldb) di a.out`fn2: 0x401120 <+0>: subq $0x28, %rsp 0x401124 <+4>: movq %rdi, 0x8(%rsp) 0x401129 <+9>: movq %rsi, 0x20(%rsp) 0x40112e <+14>: movq %rdx, 0x18(%rsp) -> 0x401133 <+19>: movq 0x8(%rsp), %rax

Without the tail call, the stack is not yet populated, leading to incorrect values during debugging as the debug info refers to the stack values:

(lldb) di a.out`fn2: -> 0x401120 <+0>: subq $0x28, %rsp 0x401124 <+4>: movq %rdi, 0x8(%rsp) 0x401129 <+9>: movq %rsi, 0x20(%rsp) 0x40112e <+14>: movq %rdx, 0x18(%rsp) 0x401133 <+19>: movq 0x8(%rsp), %rax

tbosch commented 4 years ago

I used lldb, not gdb here.

This seems to be somehow related to a tail call:

int main(void) { return fn2(14, 23, 34) + 1; }

This will print the parameters correctly on entry.

dwblaikie commented 4 years ago

Ah, good point. However, when I add attribute((optnone)) to fn2 and run lldb, I am getting invalid values for the parameters:

frame #0: 0x0000000000401120 a.out`fn2(a=0, b=4198752, c=4295491584) at test.cc:10 7 8 attribute((optnone)) attribute((noinline)) long int 9 fn2 (long int a, long int b, long int c) -> 10 { 11 long int q = 2 * a; 12 return fn1 (5, 6, 7); 13 }

Looking at the DWARF of fn2, I see the following DW_TAG_formal_parameters now:

0x00000066: DW_TAG_subprogram DW_AT_name ("fn2")

0x00000076: DW_TAG_formal_parameter DW_AT_location (DW_OP_fbreg +8) DW_AT_name ("a")

0x00000081: DW_TAG_formal_parameter DW_AT_location (DW_OP_fbreg +32) DW_AT_name ("b")

0x0000008c: DW_TAG_formal_parameter DW_AT_location (DW_OP_fbreg +24) DW_AT_name ("c")

Referencing the stack here is probably expected for non optimized code, but not in this case as main is optimized and doesn't use the stack to push arguments.

Hmm - I don't see the problematic behavior in gdb... oh, I guess I do:

fn2 (a=72704, b=32767, c=-10488) at call_site.c:10 10 int fn2(int a, int b, int c) {

I think that's probably a different bug - if you step into the function and print the parameters, they are correct. gdb might be breaking too early, before the values have been pulled from the ABI registers and put into the local stack (nothing to do with the caller putting things on the stack - the caller's still meant to justn put them in the ABI registers as it does correctly with optnone on the function, preventing the interprocedural dead arg behavior)

Maybe the prologue isn't being marked correctly, or marked too eraly, or gdb is ignoring it in some way - such that it stops before the stack's been populated.

Hmm, prologue seems to be in the right place.

"b fn2 r"

Does get you to a place where the parameters are printed correctly.

Wonder what logic is causing "step" to stop earlier than the usual function breakpoint?

tbosch commented 4 years ago

David, so we should produce something like this? What would be the best place in llvm sources to create hack that emits this?

0x000000b0: DW_TAG_subprogram DW_AT_name ("main")

0x000000bf: DW_TAG_call_site

0x000000c5: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg1 RDX) DW_AT_call_value (DW_OP_constu 0x22)

0x000000cb: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_call_value (DW_OP_lit23)

0x000000d0: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_call_value (DW_OP_lit14)

0x00000066: DW_TAG_subprogram DW_AT_name ("fn2")

0x00000076: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg5 RDI) DW_AT_name ("a")

0x00000081: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg4 RSI)) DW_AT_name ("b")

0x0000008c: DW_TAG_formal_parameter DW_AT_location (DW_OP_entry_value(DW_OP_reg1 RDX)) DW_AT_name ("c")

dwblaikie commented 4 years ago

Well, I guess LLVM only does this (or, evidence currently doesn't show otherwise) when it does the interprocedural analysis - in which case incorrect call_value is never referenced, so I maybe can't get the debugger to misbehave in this case.

But we should look into what codepath even causes that DWARF to be created - it seems dangerous/problematic that that can be produced, because it's quite wrong, if benignly so.

And to fix this bug, yeah - we should produce the same call_values as we would if the interprocedural analysis hadn't happened - lit values.

The register is never populated nor referenced, but I think that's OK - it's just used as the mapping between call_value and OP_entry_value I /think/.

tbosch commented 4 years ago

Ah, good point. However, when I add attribute((optnone)) to fn2 and run lldb, I am getting invalid values for the parameters:

frame #0: 0x0000000000401120 a.out`fn2(a=0, b=4198752, c=4295491584) at test.cc:10 7 8 attribute((optnone)) attribute((noinline)) long int 9 fn2 (long int a, long int b, long int c) -> 10 { 11 long int q = 2 * a; 12 return fn1 (5, 6, 7); 13 }

Looking at the DWARF of fn2, I see the following DW_TAG_formal_parameters now:

0x00000066: DW_TAG_subprogram DW_AT_name ("fn2")

0x00000076: DW_TAG_formal_parameter DW_AT_location (DW_OP_fbreg +8) DW_AT_name ("a")

0x00000081: DW_TAG_formal_parameter DW_AT_location (DW_OP_fbreg +32) DW_AT_name ("b")

0x0000008c: DW_TAG_formal_parameter DW_AT_location (DW_OP_fbreg +24) DW_AT_name ("c")

Referencing the stack here is probably expected for non optimized code, but not in this case as main is optimized and doesn't use the stack to push arguments.

dwblaikie commented 4 years ago

The "main" function does contain a DW_TAG_call_site entry, but the DW_AT_call_value refers to registers that are not filled by the assembly.

The DW_TAG_formal_parameter of "fn2" don't contain a DW_AT_Location attribute.

David, what would the correct DW_AT_Location attribute look like in this case? Is there a special expression that refers to the call site values?

Here is the debug info for the "main" function

0x000000ad: DW_TAG_subprogram DW_AT_name ("main")

0x000000bc: DW_TAG_call_site DW_AT_call_origin (0x00000066) DW_AT_call_tail_call (true) DW_AT_call_pc (0x0000000000401140)

0x000000c2: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_call_value (DW_OP_entry_value(DW_OP_reg5 RDI))

0x000000c9: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_call_value (DW_OP_entry_value(DW_OP_reg4 RSI))

0x000000d0: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg1 RDX) DW_AT_call_value (DW_OP_entry_value(DW_OP_reg1 RDX))

If you add attribute((optnone)) to fn2, then main doesn't lose those 3 register stores and it gets this DWARF:

0x000000a9: DW_TAG_subprogram DW_AT_name ("main")

0x000000b8: DW_TAG_call_site DW_AT_call_origin (0x00000066) DW_AT_call_tail_call (true) DW_AT_call_pc (0x000000000000005f)

0x000000be: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg1 RDX) DW_AT_call_value (DW_OP_constu 0x22)

0x000000c4: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_call_value (DW_OP_lit23)

0x000000c9: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_call_value (DW_OP_lit14)

I think that DWARF could be emitted even in the case where the register stores are optimized away.

Actually, I'd go further and say the current DWARF (DW_OP_entry_value(RDI/RSI/RDX)) is incorrect. That could lead to quite incorrect conclusions about those 3 registers - if this wasn't main, but some other caller that did have call_values for those registers, that'd have nothing to do with the 14/23/34 constants.

tries to construct a test case where this gives incorrect answers

tbosch commented 4 years ago

The "main" function does contain a DW_TAG_call_site entry, but the DW_AT_call_value refers to registers that are not filled by the assembly.

The DW_TAG_formal_parameter of "fn2" don't contain a DW_AT_Location attribute.

David, what would the correct DW_AT_Location attribute look like in this case? Is there a special expression that refers to the call site values?

Here is the debug info for the "main" function

0x000000ad: DW_TAG_subprogram DW_AT_name ("main")

0x000000bc: DW_TAG_call_site DW_AT_call_origin (0x00000066) DW_AT_call_tail_call (true) DW_AT_call_pc (0x0000000000401140)

0x000000c2: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_call_value (DW_OP_entry_value(DW_OP_reg5 RDI))

0x000000c9: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_call_value (DW_OP_entry_value(DW_OP_reg4 RSI))

0x000000d0: DW_TAG_call_site_parameter DW_AT_location (DW_OP_reg1 RDX) DW_AT_call_value (DW_OP_entry_value(DW_OP_reg1 RDX))

tbosch commented 4 years ago

Compiling using gcc -O3 (instead of -O2) also makes gcc to remove the unused parameter values form the call site (see https://godbolt.org/z/d3jzqo).

Interestingly, in this case, gcc also shows the parameters as "optimized out" when debugging with gdb:

fn2 (c=, b=, a=) at test.cc:12 12 return fn1 (5, 6, 7);

dwblaikie commented 4 years ago

Looks like a case where call_site_value could be used & isn't being?

tbosch commented 4 years ago

I found the following difference between gcc and clang while analyzing the example:

gcc puts all parameter values for fn2 into registers, even though fn2 is not using them. Assembly from gcc -g -O2 (see https://godbolt.org/z/coK95P)

fn1(long, long, long): imul esi, edx lea eax, [rsi+rdi*2] ret fn2(long, long, long): mov edx, 7 mov esi, 6 mov edi, 5 call fn1(long, long, long) cdqe ret main: mov edx, 34 mov esi, 23 mov edi, 14 call fn2(long, long, long) add eax, 1 ret

However, clang is a bit smarter here (see https://godbolt.org/z/cEG9vY)

fn1(long, long, long): # @fn1(long, long, long) imul edx, esi lea eax, [rdx + 2*rdi] ret fn2(long, long, long): # @fn2(long, long, long) push rax mov edi, 5 mov esi, 6 mov edx, 7 call fn1(long, long, long) cdqe pop rcx ret main: # @main push rax call fn2(long, long, long) add eax, 1 pop rcx ret

Looking at the DWARF5 debug info, we can see that gcc is using these registers in the DW_TAG_format_parameters too:

0x00000071: DW_TAG_subprogram DW_AT_name ("fn2")

0x00000097: DW_TAG_formal_parameter DW_AT_name ("a") DW_AT_location (0x00000004: [0x0000000000001160, 0x000000000000116f): DW_OP_reg5 RDI [0x000000000000116f, 0x0000000000001177): DW_OP_GNU_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)

0x000000a9: DW_TAG_formal_parameter DW_AT_name ("b") DW_AT_location (0x00000041: [0x0000000000001160, 0x000000000000116a): DW_OP_reg4 RSI [0x000000000000116a, 0x0000000000001177): DW_OP_GNU_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value)

0x000000bb: DW_TAG_formal_parameter DW_AT_name ("c") DW_AT_location (0x0000007e: [0x0000000000001160, 0x0000000000001165): DW_OP_reg1 RDX [0x0000000000001165, 0x0000000000001177): DW_OP_GNU_entry_value(DW_OP_reg1 RDX), DW_OP_stack_value)

Looking at the clang DWARF5 debug info, these registers are missing:

0x00000066: DW_TAG_subprogram DW_AT_name ("fn2")

0x00000076: DW_TAG_formal_parameter DW_AT_name ("a")

0x0000007e: DW_TAG_formal_parameter DW_AT_name ("b")

0x00000086: DW_TAG_formal_parameter DW_AT_name ("c")

tbosch commented 4 years ago

As the last comment here is 6 months old: This bug still reproduces at HEAD (1d9b860fb6a85df33fd52fcacc6a5efb421621bd)

llvmbot commented 5 years ago

Hi Tom,

I think the DAE pass does the right thing in this case.

This case is typical example for using the debug entry values feature. There is already a portion of the feature implemented in LLVM, but it is controlled by an CC1 option "-femit-debug-entry-values".

There are two parts of the implementation actually: 1) call_site and call_site_parameters debugging info (please see the collectCallSiteParameters() from the DwarfDebug)

2) Production of DBG_VALUEs implemented within the LiveDebugValues. Currently we are changing the approach of the implementation for this part, but it is almost done. Please see the https://reviews.llvm.org/D68209.

The plan is to enable the feature by default, and get rid of the experimental option.

Best regards, Djordje

llvmbot commented 5 years ago

I'm unassigning from myself, I feel there's not much can be done right now in regards to DeadArgElimination's role in this bug.

llvmbot commented 5 years ago

On further reflection, I would surmise that the current behaviour is correct.

If we know that there's the potential for "undeffed" arguments to be passed into the function, we should not see those values. If we're passing arguments from an external CU then those args will never be undeffed and so we should be able to see them.

In the case of when we have both types of calls (internal and external) then the fact that there's the potential to see undeffed arguments when debugging should take priority over seeing the external arguments.

If there was some way to see the call site arguments and make a decision based on what is passed or whether it's a different CU can be made, then perhaps a debugger could recover the argument values from the caller (hence DW_OP_entry_value discussion).

I've cc'd DJTODORO into the discussion has he made a change in the DEADARGELIM area in September of 2018.

llvmbot commented 5 years ago

The Behaviour of DAE in this case is to undef the arguments passed to fn2 via the caller site. Once it also proceeds to undef any 'metadata' references to the argument too.

However, there is a problem. The function ABI has not changed. Any function that has external linkage (aka not static) that is called from the CU will have it's debug intrinsics undeffed.

External CU callers to the function will not have had the same treatment and will still be passing arguments through.

So we have three cases. If the function is called from the same CU and has dead args, the debug intrinsics are undeffed. If the function is called from an external CU and not the same CU, then the debug data is left as is and finally, if the function is called internally from the same CU and externally, then the debug data is undeffed.

Now, for all intents and purposes the arguments are dead. They're unused. I don't see why we should be generating debug data for them? Is there a use case for wanting to see an argument value that is never used in a function?

If we want to preserve debug info for a dead arg in a function that has the potential to be called externally, then anything that has the potential to be externally linked with should be left alone.

Or

The args are dead and we should drop the debug info irregardless of an internal or external caller.

llvmbot commented 5 years ago

Taking for myself.

dwblaikie commented 5 years ago

Running with llvm/clang r354569, "clang input.c -o a.out -g -O2 -mllvm -print-after-all", it looks like all the dbg.values for fn2's arguments are set to 'undef' after the "Dead Argument Elimination" pass. This doesn't happen if I run "opt -O2" on the extracted IR for 'fn2' though.

My understanding is that while eliminating dead arguments is good, we can still keep the debugging information around so that there's a location when we first enter the function, even if the argument is immediately discarded.

DAE actually removes arguments from functions (changes their signature) when possible (internal functions/all callers are known), so there would be no debug location to describe even at entry.

But perhaps this is a different mode for DAE (perhaps it does something even when the signature cannot be changed?) that you're looking at?

jmorse commented 5 years ago

Running with llvm/clang r354569, "clang input.c -o a.out -g -O2 -mllvm -print-after-all", it looks like all the dbg.values for fn2's arguments are set to 'undef' after the "Dead Argument Elimination" pass. This doesn't happen if I run "opt -O2" on the extracted IR for 'fn2' though.

My understanding is that while eliminating dead arguments is good, we can still keep the debugging information around so that there's a location when we first enter the function, even if the argument is immediately discarded.

dwblaikie commented 6 years ago

While the initial measurements for the call_site attributes implemented so far were that they didn't increase the size of the DWARF by a substantial amount, I'd be similarly concerned that adding attributes for every parameter (not sure exactly what heuristics GCC uses to decide which attributes to annotate - guess it might do something a bit more subtle than "every parameter at every call") might be problematically large.

dwblaikie commented 6 years ago

Might be able to use DW_OP_entry_value here if we have an abi location that the value was passed in (not much we can do if we end up removing the argument during LTO though).

I think DW_OP_GNU_parameter_ref is what can be used here (and my understanding is that this is what GCC does).

Looks like parameter_ref is for cases where the function ABI has been modified & there are sort of 'ghost' parameters (parameters that are no longer passed at the machine code level becaues they're unused or synthesized on the other side): https://gcc.gnu.org/ml/gcc-patches/2011-06/msg00649.html

Looking at the DWARF for this sample when compiled with GCC, it looks like it's using the DW_TAG_GNU_call_site and DW_TAG_GNU_call_site_parameter tags to describe the values of these variables at the call site in the calling frame:

0x0000004b: DW_TAG_GNU_call_site DW_AT_low_pc (0x0000000000401025) DW_AT_abstract_origin (0x0000013e "_Z3fn2lll")

0x00000058: DW_TAG_GNU_call_site_parameter DW_AT_abstract_origin (0x0000008c "a") DW_AT_GNU_call_site_value (DW_OP_lit14)

0x0000005f: DW_TAG_GNU_call_site_parameter DW_AT_abstract_origin (0x00000096 "b") DW_AT_GNU_call_site_value (DW_OP_lit23)

0x00000066: DW_TAG_GNU_call_site_parameter DW_AT_abstract_origin (0x000000a0 "c") DW_AT_GNU_call_site_value (DW_OP_const1u 0x22)

There's been some work to implement the now standardized (in DWARFv5) call_site attributes in Clang/LLVM, though I don't think the parameter attributes have been implemented (either when compiling for DWARFv5 or when targetting LLDB (as an extension))

llvmbot commented 6 years ago

Might be able to use DW_OP_entry_value here if we have an abi location that the value was passed in (not much we can do if we end up removing the argument during LTO though).

I think DW_OP_GNU_parameter_ref is what can be used here (and my understanding is that this is what GCC does).

echristo commented 6 years ago

Might be able to use DW_OP_entry_value here if we have an abi location that the value was passed in (not much we can do if we end up removing the argument during LTO though).

llvmbot commented 6 years ago

assigned to @tbosch

llvm / llvm-project

[DebugInfo] Unused arguments aren't visible in optimized code. #39064

Extended Description

Machine code for function main: NoPHIs, TracksLiveness, NoVRegs, TiedOpsRewritten

End machine code for function main.