Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

[DebugInfo] Unused arguments aren't visible in optimized code. #38689

Open Quuxplusone opened 5 years ago

Quuxplusone commented 5 years ago
Bugzilla Link PR39716
Status CONFIRMED
Importance P enhancement
Reported by Davide Italiano (ditaliano@apple.com)
Reported on 2018-11-19 16:36:35 -0800
Last modified on 2020-08-04 00:07:01 -0700
Version trunk
Hardware PC All
CC anton@korobeynikov.info, aprantl@apple.com, asowda@cisco.com, chackz0x12@gmail.com, david.stenberg@ericsson.com, dblaikie@gmail.com, djordje.todorovic@rt-rk.com, djordje.todorovic@syrmia.com, echristo@gmail.com, friss@apple.com, jdevlieghere@apple.com, jeremy.morse.llvm@gmail.com, keith.walker@arm.com, llvm-bugs@lists.llvm.org, paul_robinson@playstation.sony.com, rnk@google.com, tbosch1009@gmail.com, vk@vedantk.com, vsk@apple.com, zturner@google.com
Fixed by commit(s)
Attachments call_site.s (25767 bytes, text/plain)
Blocks PR38768
Blocked by
See also PR45406
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x0000000100000f84 a.out`fn2(a=<unavailable>, b=<unavailable>, c=<unavailable>) at optout.c:12:10 [opt]
   9    fn2 (long int a, long int b, long int c)
   10   {
   11     long int q = 2 * a;
-> 12     return fn1 (5, 6, 7);
   13   }
   14
   15   int main(void) {
(lldb) frame var
(long) a = <no location, value may have been optimized out>

(long) b = <no location, value may have been optimized out>

(long) c = <no location, value may have been optimized out>

(long) q = <no location, value may have been optimized out>

A gcc-built binary debugged with gdb yields:

Breakpoint 1, main () at opt.c:16
16                      return fn2(14, 23, 34);
(gdb) s
fn2 (c=34, b=23, a=14) at opt.c:12
12                  return fn1 (5, 6, 7);
(gdb) info args
c = 34
b = 23
a = 14

Testcase:

__attribute__((noinline))
int fn1 (long int x, long int y, long int z) {
        int l = x * 2;
        int q = y * z;
        return l + q;
}

__attribute__((noinline)) long int
fn2 (long int a, long int b, long int c)
{
  long int q = 2 * a;
  return fn1 (5, 6, 7);
}

int main(void) {
        return fn2(14, 23, 34);
}
Quuxplusone commented 5 years ago

Might be able to use DW_OP_entry_value here if we have an abi location that the value was passed in (not much we can do if we end up removing the argument during LTO though).

Quuxplusone commented 5 years ago
(In reply to Eric Christopher from comment #1)
> Might be able to use DW_OP_entry_value here if we have an abi location that
> the value was passed in (not much we can do if we end up removing the
> argument during LTO though).

I think DW_OP_GNU_parameter_ref is what can be used here (and my understanding
is that this is what GCC does).
Quuxplusone commented 5 years ago
(In reply to Davide Italiano from comment #2)
> (In reply to Eric Christopher from comment #1)
> > Might be able to use DW_OP_entry_value here if we have an abi location that
> > the value was passed in (not much we can do if we end up removing the
> > argument during LTO though).
>
> I think DW_OP_GNU_parameter_ref is what can be used here (and my
> understanding is that this is what GCC does).

Looks like parameter_ref is for cases where the function ABI has been modified
& there are sort of 'ghost' parameters (parameters that are no longer passed at
the machine code level becaues they're unused or synthesized on the other
side): https://gcc.gnu.org/ml/gcc-patches/2011-06/msg00649.html

Looking at the DWARF for this sample when compiled with GCC, it looks like it's
using the DW_TAG_GNU_call_site and DW_TAG_GNU_call_site_parameter tags to
describe the values of these variables at the call site in the calling frame:

0x0000004b:     DW_TAG_GNU_call_site
                  DW_AT_low_pc  (0x0000000000401025)
                  DW_AT_abstract_origin (0x0000013e "_Z3fn2lll")

0x00000058:       DW_TAG_GNU_call_site_parameter
                    DW_AT_abstract_origin       (0x0000008c "a")
                    DW_AT_GNU_call_site_value   (DW_OP_lit14)

0x0000005f:       DW_TAG_GNU_call_site_parameter
                    DW_AT_abstract_origin       (0x00000096 "b")
                    DW_AT_GNU_call_site_value   (DW_OP_lit23)

0x00000066:       DW_TAG_GNU_call_site_parameter
                    DW_AT_abstract_origin       (0x000000a0 "c")
                    DW_AT_GNU_call_site_value   (DW_OP_const1u 0x22)

There's been some work to implement the now standardized (in DWARFv5) call_site
attributes in Clang/LLVM, though I don't think the parameter attributes have
been implemented (either when compiling for DWARFv5 or when targetting LLDB (as
an extension))
Quuxplusone commented 5 years ago

While the initial measurements for the call_site attributes implemented so far were that they didn't increase the size of the DWARF by a substantial amount, I'd be similarly concerned that adding attributes for every parameter (not sure exactly what heuristics GCC uses to decide which attributes to annotate - guess it might do something a bit more subtle than "every parameter at every call") might be problematically large.

Quuxplusone commented 5 years ago

Running with llvm/clang r354569, "clang input.c -o a.out -g -O2 -mllvm -print-after-all", it looks like all the dbg.values for fn2's arguments are set to 'undef' after the "Dead Argument Elimination" pass. This doesn't happen if I run "opt -O2" on the extracted IR for 'fn2' though.

My understanding is that while eliminating dead arguments is good, we can still keep the debugging information around so that there's a location when we first enter the function, even if the argument is immediately discarded.

Quuxplusone commented 5 years ago
(In reply to Jeremy Morse from comment #5)
> Running with llvm/clang r354569, "clang input.c -o a.out -g -O2 -mllvm
> -print-after-all", it looks like all the dbg.values for fn2's arguments are
> set to 'undef' after the "Dead Argument Elimination" pass. This doesn't
> happen if I run "opt -O2" on the extracted IR for 'fn2' though.
>
> My understanding is that while eliminating dead arguments is good, we can
> still keep the debugging information around so that there's a location when
> we first enter the function, even if the argument is immediately discarded.

DAE actually removes arguments from functions (changes their signature) when
possible (internal functions/all callers are known), so there would be no debug
location to describe even at entry.

But perhaps this is a different mode for DAE (perhaps it does something even
when the signature cannot be changed?) that you're looking at?
Quuxplusone commented 4 years ago

Taking for myself.

Quuxplusone commented 4 years ago

The Behaviour of DAE in this case is to undef the arguments passed to fn2 via the caller site. Once it also proceeds to undef any 'metadata' references to the argument too.

However, there is a problem. The function ABI has not changed. Any function that has external linkage (aka not static) that is called from the CU will have it's debug intrinsics undeffed.

External CU callers to the function will not have had the same treatment and will still be passing arguments through.

So we have three cases. If the function is called from the same CU and has dead args, the debug intrinsics are undeffed. If the function is called from an external CU and not the same CU, then the debug data is left as is and finally, if the function is called internally from the same CU and externally, then the debug data is undeffed.

Now, for all intents and purposes the arguments are dead. They're unused. I don't see why we should be generating debug data for them? Is there a use case for wanting to see an argument value that is never used in a function?

If we want to preserve debug info for a dead arg in a function that has the potential to be called externally, then anything that has the potential to be externally linked with should be left alone.

Or

The args are dead and we should drop the debug info irregardless of an internal or external caller.

Quuxplusone commented 4 years ago

On further reflection, I would surmise that the current behaviour is correct.

If we know that there's the potential for "undeffed" arguments to be passed into the function, we should not see those values. If we're passing arguments from an external CU then those args will never be undeffed and so we should be able to see them.

In the case of when we have both types of calls (internal and external) then the fact that there's the potential to see undeffed arguments when debugging should take priority over seeing the external arguments.

If there was some way to see the call site arguments and make a decision based on what is passed or whether it's a different CU can be made, then perhaps a debugger could recover the argument values from the caller (hence DW_OP_entry_value discussion).

I've cc'd DJTODORO into the discussion has he made a change in the DEADARGELIM area in September of 2018.

Quuxplusone commented 4 years ago

I'm unassigning from myself, I feel there's not much can be done right now in regards to DeadArgElimination's role in this bug.

Quuxplusone commented 4 years ago
Hi Tom,

I think the DAE pass does the right thing in this case.

This case is typical example for using the debug entry values feature. There is
already a portion of the feature implemented in LLVM, but it is controlled by
an CC1 option "-femit-debug-entry-values".

There are two parts of the implementation actually:
  1) call_site and call_site_parameters debugging info (please see the collectCallSiteParameters() from the DwarfDebug)

  2) Production of DBG_VALUEs implemented within the LiveDebugValues. Currently we are changing the approach of the implementation for this part, but it is almost done. Please see the https://reviews.llvm.org/D68209.

The plan is to enable the feature by default, and get rid of the experimental
option.

Best regards,
Djordje
Quuxplusone commented 4 years ago

As the last comment here is 6 months old: This bug still reproduces at HEAD (1d9b860fb6a85df33fd52fcacc6a5efb421621bd)

Quuxplusone commented 4 years ago
I found the following difference between gcc and clang while analyzing the
example:

gcc puts all parameter values for fn2 into registers, even though fn2 is not
using them.
Assembly from gcc -g -O2 (see https://godbolt.org/z/coK95P)

fn1(long, long, long):
  imul esi, edx
  lea eax, [rsi+rdi*2]
  ret
fn2(long, long, long):
  mov edx, 7
  mov esi, 6
  mov edi, 5
  call fn1(long, long, long)
  cdqe
  ret
main:
  mov edx, 34
  mov esi, 23
  mov edi, 14
  call fn2(long, long, long)
  add eax, 1
  ret

However, clang is a bit smarter here (see https://godbolt.org/z/cEG9vY)

fn1(long, long, long): # @fn1(long, long, long)
  imul edx, esi
  lea eax, [rdx + 2*rdi]
  ret
fn2(long, long, long): # @fn2(long, long, long)
  push rax
  mov edi, 5
  mov esi, 6
  mov edx, 7
  call fn1(long, long, long)
  cdqe
  pop rcx
  ret
main: # @main
  push rax
  call fn2(long, long, long)
  add eax, 1
  pop rcx
  ret

Looking at the DWARF5 debug info, we can see that gcc is using these registers
in the DW_TAG_format_parameters too:

0x00000071:   DW_TAG_subprogram
                DW_AT_name  ("fn2")

0x00000097:     DW_TAG_formal_parameter
                  DW_AT_name    ("a")
                  DW_AT_location    (0x00000004:
                     [0x0000000000001160, 0x000000000000116f): DW_OP_reg5 RDI
                     [0x000000000000116f, 0x0000000000001177): DW_OP_GNU_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)

0x000000a9:     DW_TAG_formal_parameter
                  DW_AT_name    ("b")
                  DW_AT_location    (0x00000041:
                     [0x0000000000001160, 0x000000000000116a): DW_OP_reg4 RSI
                     [0x000000000000116a, 0x0000000000001177): DW_OP_GNU_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value)

0x000000bb:     DW_TAG_formal_parameter
                  DW_AT_name    ("c")
                  DW_AT_location    (0x0000007e:
                     [0x0000000000001160, 0x0000000000001165): DW_OP_reg1 RDX
                     [0x0000000000001165, 0x0000000000001177): DW_OP_GNU_entry_value(DW_OP_reg1 RDX), DW_OP_stack_value)

Looking at the clang DWARF5 debug info, these registers are missing:

0x00000066:   DW_TAG_subprogram
                DW_AT_name  ("fn2")

0x00000076:     DW_TAG_formal_parameter
                  DW_AT_name    ("a")

0x0000007e:     DW_TAG_formal_parameter
                  DW_AT_name    ("b")

0x00000086:     DW_TAG_formal_parameter
                  DW_AT_name    ("c")
Quuxplusone commented 4 years ago

Looks like a case where call_site_value could be used & isn't being?

Quuxplusone commented 4 years ago
Compiling using gcc -O3 (instead of -O2) also makes gcc to remove the unused
parameter values form the call site (see https://godbolt.org/z/d3jzqo).

Interestingly, in this case, gcc also shows the parameters as "optimized out"
when debugging with gdb:

fn2 (c=<optimized out>, b=<optimized out>, a=<optimized out>) at test.cc:12
12        return fn1 (5, 6, 7);
Quuxplusone commented 4 years ago
The "main" function does contain a DW_TAG_call_site entry, but the
DW_AT_call_value refers to registers that are not filled by the assembly.

The DW_TAG_formal_parameter of "fn2" don't contain a DW_AT_Location attribute.

David, what would the correct DW_AT_Location attribute look like in this case?
Is there a special expression that refers to the call site values?

Here is the debug info for the "main" function

0x000000ad:   DW_TAG_subprogram
                DW_AT_name  ("main")

0x000000bc:     DW_TAG_call_site
                  DW_AT_call_origin (0x00000066)
                  DW_AT_call_tail_call  (true)
                  DW_AT_call_pc (0x0000000000401140)

0x000000c2:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg5 RDI)
                    DW_AT_call_value    (DW_OP_entry_value(DW_OP_reg5 RDI))

0x000000c9:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg4 RSI)
                    DW_AT_call_value    (DW_OP_entry_value(DW_OP_reg4 RSI))

0x000000d0:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg1 RDX)
                    DW_AT_call_value    (DW_OP_entry_value(DW_OP_reg1 RDX))
Quuxplusone commented 4 years ago
(In reply to Tobias Bosch from comment #16)
> The "main" function does contain a DW_TAG_call_site entry, but the
> DW_AT_call_value refers to registers that are not filled by the assembly.
>
> The DW_TAG_formal_parameter of "fn2" don't contain a DW_AT_Location
> attribute.
>
> David, what would the correct DW_AT_Location attribute look like in this
> case? Is there a special expression that refers to the call site values?
>
> Here is the debug info for the "main" function
>
> 0x000000ad:   DW_TAG_subprogram
>                 DW_AT_name    ("main")
>
> 0x000000bc:     DW_TAG_call_site
>                   DW_AT_call_origin   (0x00000066)
>                   DW_AT_call_tail_call    (true)
>                   DW_AT_call_pc   (0x0000000000401140)
>
> 0x000000c2:       DW_TAG_call_site_parameter
>                     DW_AT_location    (DW_OP_reg5 RDI)
>                     DW_AT_call_value  (DW_OP_entry_value(DW_OP_reg5 RDI))
>
> 0x000000c9:       DW_TAG_call_site_parameter
>                     DW_AT_location    (DW_OP_reg4 RSI)
>                     DW_AT_call_value  (DW_OP_entry_value(DW_OP_reg4 RSI))
>
> 0x000000d0:       DW_TAG_call_site_parameter
>                     DW_AT_location    (DW_OP_reg1 RDX)
>                     DW_AT_call_value  (DW_OP_entry_value(DW_OP_reg1 RDX))

If you add __attribute__((optnone)) to fn2, then main doesn't lose those 3
register stores and it gets this DWARF:

0x000000a9:   DW_TAG_subprogram
                DW_AT_name      ("main")

0x000000b8:     DW_TAG_call_site
                  DW_AT_call_origin     (0x00000066)
                  DW_AT_call_tail_call  (true)
                  DW_AT_call_pc (0x000000000000005f)

0x000000be:       DW_TAG_call_site_parameter
                    DW_AT_location      (DW_OP_reg1 RDX)
                    DW_AT_call_value    (DW_OP_constu 0x22)

0x000000c4:       DW_TAG_call_site_parameter
                    DW_AT_location      (DW_OP_reg4 RSI)
                    DW_AT_call_value    (DW_OP_lit23)

0x000000c9:       DW_TAG_call_site_parameter
                    DW_AT_location      (DW_OP_reg5 RDI)
                    DW_AT_call_value    (DW_OP_lit14)

I think that DWARF could be emitted even in the case where the register stores
are optimized away.

Actually, I'd go further and say the current DWARF
(DW_OP_entry_value(RDI/RSI/RDX)) is incorrect. That could lead to quite
incorrect conclusions about those 3 registers - if this wasn't main, but some
other caller that did have call_values for those registers, that'd have nothing
to do with the 14/23/34 constants.

*tries to construct a test case where this gives incorrect answers*
Quuxplusone commented 4 years ago
Ah, good point. However, when I add __attribute__((optnone)) to fn2 and run
lldb, I am getting invalid values for the parameters:

frame #0: 0x0000000000401120 a.out`fn2(a=0, b=4198752, c=4295491584) at
test.cc:10
   7
   8    __attribute__((optnone)) __attribute__((noinline)) long int
   9    fn2 (long int a, long int b, long int c)
-> 10   {
   11       long int q = 2 * a;
   12         return fn1 (5, 6, 7);
   13   }

Looking at the DWARF of fn2, I see the following DW_TAG_formal_parameters now:

0x00000066:   DW_TAG_subprogram
                DW_AT_name  ("fn2")

0x00000076:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_fbreg +8)
                  DW_AT_name    ("a")

0x00000081:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_fbreg +32)
                  DW_AT_name    ("b")

0x0000008c:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_fbreg +24)
                  DW_AT_name    ("c")

Referencing the stack here is probably expected for non optimized code, but not
in this case as main is optimized and doesn't use the stack to push arguments.
Quuxplusone commented 4 years ago

Well, I guess LLVM only does this (or, evidence currently doesn't show otherwise) when it does the interprocedural analysis - in which case incorrect call_value is never referenced, so I maybe can't get the debugger to misbehave in this case.

But we should look into what codepath even causes that DWARF to be created - it seems dangerous/problematic that that can be produced, because it's quite wrong, if benignly so.

And to fix this bug, yeah - we should produce the same call_values as we would if the interprocedural analysis hadn't happened - lit values.

The register is never populated nor referenced, but I think that's OK - it's just used as the mapping between call_value and OP_entry_value I /think/.

Quuxplusone commented 4 years ago
David, so we should produce something like this? What would be the best place
in llvm sources to create hack that emits this?

0x000000b0:   DW_TAG_subprogram
                DW_AT_name  ("main")

0x000000bf:     DW_TAG_call_site

0x000000c5:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg1 RDX)
                    DW_AT_call_value    (DW_OP_constu 0x22)

0x000000cb:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg4 RSI)
                    DW_AT_call_value    (DW_OP_lit23)

0x000000d0:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg5 RDI)
                    DW_AT_call_value    (DW_OP_lit14)

0x00000066:   DW_TAG_subprogram
                DW_AT_name  ("fn2")

0x00000076:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_entry_value(DW_OP_reg5 RDI)
                  DW_AT_name    ("a")

0x00000081:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_entry_value(DW_OP_reg4 RSI))
                  DW_AT_name    ("b")

0x0000008c:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_entry_value(DW_OP_reg1 RDX))
                  DW_AT_name    ("c")
Quuxplusone commented 4 years ago
(In reply to Tobias Bosch from comment #18)
> Ah, good point. However, when I add __attribute__((optnone)) to fn2 and run
> lldb, I am getting invalid values for the parameters:
>
> frame #0: 0x0000000000401120 a.out`fn2(a=0, b=4198752, c=4295491584) at
> test.cc:10
>    7
>    8      __attribute__((optnone)) __attribute__((noinline)) long int
>    9      fn2 (long int a, long int b, long int c)
> -> 10     {
>    11         long int q = 2 * a;
>    12           return fn1 (5, 6, 7);
>    13     }
>
> Looking at the DWARF of fn2, I see the following DW_TAG_formal_parameters
> now:
>
> 0x00000066:   DW_TAG_subprogram
>                 DW_AT_name    ("fn2")
>
> 0x00000076:     DW_TAG_formal_parameter
>                   DW_AT_location  (DW_OP_fbreg +8)
>                   DW_AT_name  ("a")
>
> 0x00000081:     DW_TAG_formal_parameter
>                   DW_AT_location  (DW_OP_fbreg +32)
>                   DW_AT_name  ("b")
>
> 0x0000008c:     DW_TAG_formal_parameter
>                   DW_AT_location  (DW_OP_fbreg +24)
>                   DW_AT_name  ("c")
>
>
> Referencing the stack here is probably expected for non optimized code, but
> not in this case as main is optimized and doesn't use the stack to push
> arguments.

Hmm - I don't see the problematic behavior in gdb... oh, I guess I do:

fn2 (a=72704, b=32767, c=-10488) at call_site.c:10
10      int fn2(int a, int b, int c) {

I think that's probably a different bug - if you step into the function and
print the parameters, they are correct. gdb might be breaking too early, before
the values have been pulled from the ABI registers and put into the local stack
(nothing to do with the caller putting things on the stack - the caller's still
meant to justn put them in the ABI registers as it does correctly with optnone
on the function, preventing the interprocedural dead arg behavior)

Maybe the prologue isn't being marked correctly, or marked too eraly, or gdb is
ignoring it in some way - such that it stops before the stack's been populated.

Hmm, prologue seems to be in the right place.

"b fn2
r"

Does get you to a place where the parameters are printed correctly.

Wonder what logic is causing "step" to stop earlier than the usual function
breakpoint?
Quuxplusone commented 4 years ago
I used lldb, not gdb here.

This seems to be somehow related to a tail call:

int main(void) {
 return fn2(14, 23, 34) + 1;
}

This will print the parameters correctly on entry.
Quuxplusone commented 4 years ago
Yes, the problem seems to be where lldb stops when stepping into the function.
With the tail call stepping into fn2 shows this output for "di", which already
populated the stack:

(lldb) di
a.out`fn2:
    0x401120 <+0>:  subq   $0x28, %rsp
    0x401124 <+4>:  movq   %rdi, 0x8(%rsp)
    0x401129 <+9>:  movq   %rsi, 0x20(%rsp)
    0x40112e <+14>: movq   %rdx, 0x18(%rsp)
->  0x401133 <+19>: movq   0x8(%rsp), %rax

Without the tail call, the stack is not yet populated, leading to incorrect
values during debugging as the debug info refers to the stack values:

(lldb) di
a.out`fn2:
->  0x401120 <+0>:  subq   $0x28, %rsp
    0x401124 <+4>:  movq   %rdi, 0x8(%rsp)
    0x401129 <+9>:  movq   %rsi, 0x20(%rsp)
    0x40112e <+14>: movq   %rdx, 0x18(%rsp)
    0x401133 <+19>: movq   0x8(%rsp), %rax
Quuxplusone commented 4 years ago
(In reply to Tobias Bosch from comment #23)
> Yes, the problem seems to be where lldb stops when stepping into the
> function. With the tail call stepping into fn2 shows this output for "di",
> which already populated the stack:
>
> (lldb) di
> a.out`fn2:
>     0x401120 <+0>:  subq   $0x28, %rsp
>     0x401124 <+4>:  movq   %rdi, 0x8(%rsp)
>     0x401129 <+9>:  movq   %rsi, 0x20(%rsp)
>     0x40112e <+14>: movq   %rdx, 0x18(%rsp)
> ->  0x401133 <+19>: movq   0x8(%rsp), %rax
>
> Without the tail call, the stack is not yet populated, leading to incorrect
> values during debugging as the debug info refers to the stack values:
>
> (lldb) di
> a.out`fn2:
> ->  0x401120 <+0>:  subq   $0x28, %rsp
>     0x401124 <+4>:  movq   %rdi, 0x8(%rsp)
>     0x401129 <+9>:  movq   %rsi, 0x20(%rsp)
>     0x40112e <+14>: movq   %rdx, 0x18(%rsp)
>     0x401133 <+19>: movq   0x8(%rsp), %rax

yeah, guess gdb has the same issue - be nice to fix at least lldb, but at least
that's an orthogonal issue to worry about elsewhere, perhaps.

Working up an example with hand-crafted assembly for the possible fix to the
original issue in this bug.
Quuxplusone commented 4 years ago

Attached call_site.s (25767 bytes, text/plain): hand-crafted DWARF assembly for proposed solution

Quuxplusone commented 4 years ago

Hmm, yeah, gdb does seem to understand some uses of call_value/entry_value - so I probably messed something up in my example.

Quuxplusone commented 4 years ago
This produces the following debug info:

0x0000006a:   DW_TAG_subprogram
                DW_AT_name  ("fn2")

0x0000007a:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)
                  DW_AT_name    ("a")

0x00000087:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value)
                  DW_AT_name    ("b")

0x00000094:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_entry_value(DW_OP_reg1 RDX), DW_OP_stack_value)
                  DW_AT_name    ("c")

0x000000c0:   DW_TAG_subprogram
                DW_AT_name  ("main")

0x000000cf:     DW_TAG_call_site
                  DW_AT_call_origin (0x0000006a)
                  DW_AT_call_tail_call  (true)
                  DW_AT_call_pc (0x0000000000401140)

0x000000d5:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg5 RDI)
                    DW_AT_call_value    (DW_OP_constu 0x22)

0x000000db:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg4 RSI)
                    DW_AT_call_value    (DW_OP_lit23)

0x000000e0:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg1 RDX)
                    DW_AT_call_value    (DW_OP_lit14)

However, when debugging with lldb (from head), it still shows that these
parameters are optimized out:

(lldb) s
Process 58604 stopped
* thread #1, name = 'a.out', stop reason = step in
    frame #0: 0x0000000000401120 a.out`fn2(a=<unavailable>, b=<unavailable>, c=<unavailable>) at call_site.c:11:10
Quuxplusone commented 4 years ago
(In reply to Tobias Bosch from comment #27)
> This produces the following debug info:
>
> 0x0000006a:   DW_TAG_subprogram
>                 DW_AT_name    ("fn2")
>
> 0x0000007a:     DW_TAG_formal_parameter
>                   DW_AT_location  (DW_OP_entry_value(DW_OP_reg5 RDI),
> DW_OP_stack_value)
>                   DW_AT_name  ("a")
>
> 0x00000087:     DW_TAG_formal_parameter
>                   DW_AT_location  (DW_OP_entry_value(DW_OP_reg4 RSI),
> DW_OP_stack_value)
>                   DW_AT_name  ("b")
>
> 0x00000094:     DW_TAG_formal_parameter
>                   DW_AT_location  (DW_OP_entry_value(DW_OP_reg1 RDX),
> DW_OP_stack_value)
>                   DW_AT_name  ("c")
>
>
>
> 0x000000c0:   DW_TAG_subprogram
>                 DW_AT_name    ("main")
>
> 0x000000cf:     DW_TAG_call_site
>                   DW_AT_call_origin   (0x0000006a)
>                   DW_AT_call_tail_call    (true)
>                   DW_AT_call_pc   (0x0000000000401140)
>
> 0x000000d5:       DW_TAG_call_site_parameter
>                     DW_AT_location    (DW_OP_reg5 RDI)
>                     DW_AT_call_value  (DW_OP_constu 0x22)
>
> 0x000000db:       DW_TAG_call_site_parameter
>                     DW_AT_location    (DW_OP_reg4 RSI)
>                     DW_AT_call_value  (DW_OP_lit23)
>
> 0x000000e0:       DW_TAG_call_site_parameter
>                     DW_AT_location    (DW_OP_reg1 RDX)
>                     DW_AT_call_value  (DW_OP_lit14)
>
>
> However, when debugging with lldb (from head), it still shows that these
> parameters are optimized out:
>
> (lldb) s
> Process 58604 stopped
> * thread #1, name = 'a.out', stop reason = step in
>     frame #0: 0x0000000000401120 a.out`fn2(a=<unavailable>, b=<unavailable>,
> c=<unavailable>) at call_site.c:11:10

if you happen to figure out why lldb thinks they're unavailable - I'm all ears.

I'll probably look into why gdb can handle other cases but decides this one
isn't acceptable... but that might be a bit beyond my gdb understanding.
Quuxplusone commented 4 years ago
Hi,

gdb uses the caller info (DW_TAG_call_site/call_site_param) to print @entry
(special value that represents the value of parameter at entry point in the
function) info in backtraces. Furthermore, if we generate a location entry with
DW_OP_entry_value within callee, it will allow us to use the @entry value as
actual value in the callee (typically used for unmodified params, but it can be
used for modified as well, if we can express the modification in terms of its
entry value). So, the debug entry values feature has two purposes/uses indeed.

lldb does not print @entry in backtraces at all, but it uses it in the
situation when DW_OP_entry_value is available (so we need both, caller and
callee info at the time) to retrieve the actual value of a parameter when
possible.

The [0] is the proposal for how to improve this situation within LLVM.
Currently, we generate DW_OP_entry_values inputs within .debug_loc only within
LiveDebugValues, but it seems we can use it in DeadArgElim pass as well for
unused args.

In this particular test case, I see that we have incorrect info at caller
side... (we've jumped into corner case now, the DW_OP_entry_value within
call_value attribute of the corresponding dw-tag-param depends on register-
value which is optimized out, and we need to fix it; (special case:
DW_OP_entry_value can be used at caller side as well; within some cases when
trying to express param value at the call site)).

[0] https://reviews.llvm.org/D85012
Quuxplusone commented 4 years ago

Djordje, I tried your patch and it does produce the correct DW_AT_location in the DW_TAG_formal_parameter of "fn2".

Do you have an idea how to fix the invalid values in the DW_AT_call_value of the DW_TAG_call_site_parameter tags of "main"?

Quuxplusone commented 4 years ago
I looked at this a bit more. Using Djordje's patch that gets the
DW_TAG_formal_parameter right, and a local hack to clang that always uses
DW_AT_call_value (DW_OP_lit23) for all DW_TAG_call_site_parameter I was able to
verify that once we fix the debug info, lldb (-O3 -gdwarf-5) and gdb (-O3 -g)
will do the right thing and show the parameters as no longer "optimized out"
but with the correct value from debug info.

Debug info:

0x00000072:   DW_TAG_subprogram
                DW_AT_name  ("fn2")

0x00000082:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)
                  DW_AT_name    ("a")

0x0000008f:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value)
                  DW_AT_name    ("b")

0x0000009c:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_entry_value(DW_OP_reg1 RDX), DW_OP_stack_value)
                  DW_AT_name    ("c")

0x000000aa:   DW_TAG_subprogram
                DW_AT_name  ("main")

0x000000c2:     DW_TAG_call_site
                  DW_AT_call_origin (0x00000072)
                  DW_AT_call_return_pc  (0x0000000000401136)

0x000000c8:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg5 RDI)
                    DW_AT_call_value    (DW_OP_lit23)

0x000000cd:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg4 RSI)
                    DW_AT_call_value    (DW_OP_lit23)

0x000000d2:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg1 RDX)
                    DW_AT_call_value    (DW_OP_lit23)

0x00000095:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value)
                  DW_AT_name    ("b")

In the corresponding DW_TAG_call_site in main:
0x000000d3:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg4 RSI)
                    DW_AT_call_value    (DW_OP_lit23)

Debugger output when stepping into f1:

(lldb) s
Process 571769 stopped
* thread #1, name = 'a.out', stop reason = step in
    frame #0: 0x0000000000401120 a.out`fn2(a=23, b=23, c=23) at test.cc:11:10

(gdb) s
fn2 (a=23, b=23, c=23) at test.cc:11
Quuxplusone commented 4 years ago
Sorry, this part of the previous comment should be ignored:

> In the corresponding DW_TAG_call_site in main:
> 0x000000d3:       DW_TAG_call_site_parameter
>                     DW_AT_location    (DW_OP_reg4 RSI)
>                     DW_AT_call_value  (DW_OP_lit23)
Quuxplusone commented 4 years ago
Creating the DW_TAG_call_site_parameter tags seems to have more than the bug
above (that the DW_AT_call_value is incorrect):

If the last function of a statement has unused parameters, then the
corresponding DW_TAG_call_site only contains DW_TAG_call_site_parameter for the
non optimized out parameters. I.e. the number of DW_TAG_call_site_parameter
does not match the number of DW_TAG_formal_parameter in this case.

Example:

__attribute__((noinline))
int fn2(int a, int b, int c) {
  return a*a;
}

int main(void) {
  int x = fn2(1,2,3) * fn2(4,5,6);
  return x*x;
}

Debug info:

0x00000051:   DW_TAG_subprogram
                DW_AT_name  ("main")

0x00000069:     DW_TAG_call_site
                  DW_AT_call_origin (0x00000027)
                  DW_AT_call_return_pc  (0x000000000040112b)

0x0000006f:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg5 RDI)
                    DW_AT_call_value    (DW_OP_lit1)

0x00000074:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg4 RSI)
                    DW_AT_call_value    (DW_OP_entry_value(DW_OP_reg4 RSI))

0x0000007b:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg1 RDX)
                    DW_AT_call_value    (DW_OP_entry_value(DW_OP_reg1 RDX))

0x00000083:     DW_TAG_call_site
                  DW_AT_call_origin (0x00000027)
                  DW_AT_call_return_pc  (0x0000000000401137)

0x00000089:       DW_TAG_call_site_parameter
                    DW_AT_location  (DW_OP_reg5 RDI)
                    DW_AT_call_value    (DW_OP_lit4)
Quuxplusone commented 4 years ago
>Creating the DW_TAG_call_site_parameter tags seems to have more than the bug
above (that the DW_AT_call_value is incorrect):

>If the last function of a statement has unused parameters, then the
corresponding DW_TAG_call_site only contains DW_TAG_call_site_parameter for the
non optimized out parameters. I.e. the number of DW_TAG_call_site_parameter
does not match the number of DW_TAG_formal_parameter in this case.

I don't think its a bug. There are cases where we do not express all the call-
site params values and it is OK. We ended up describing target dependent
instructions that load param value, very late in the pipeline, in terms of
DWARF and it ends up in call_val (or call_site_param_val) DWARF attribute.
There is support for some instructions, but there is place for improvements. If
we don't handle some instructions, we won't generate corresponding call_val.
Here is an experimental project that helps in finding these opportunities for
improvements: https://github.com/djolertrk/call-site-params-stats.

== About the case from this thread

The conclusion is that the call value is wrong in this case.
The call value (within main()) depends on entry_val operation, but the register
value (from the entry_val expr) is not alive at the entry point of the fn2().
Maybe we need something like this: https://reviews.llvm.org/D85111 (quick-made-
patch; I need to think a bit about it...)

Anyhow, why didn't we generate call_values with corresponding constants? The
main() function before Dead Arg Eliminitaion looks as:

define dso_local i32 @main() local_unnamed_addr #1 !dbg !36 {
entry:
  %call = call i64 @fn2(i64 14, i64 23, i64 34), !dbg !39
  %conv = trunc i64 %call to i32, !dbg !39
  ret i32 %conv, !dbg !40
}

After the pass:
define dso_local i32 @main() local_unnamed_addr #1 !dbg !36 {
entry:
  %call = call i64 @fn2(i64 undef, i64 undef, i64 undef), !dbg !39
  %conv = trunc i64 %call to i32, !dbg !39
  ret i32 %conv, !dbg !40
}

so, that is the last spot the call_values are available. Furthermore, at the
end, we ended up having machine function as:

# Machine code for function main: NoPHIs, TracksLiveness, NoVRegs,
TiedOpsRewritten

bb.0.entry:
  TAILJMPd64 @fn2, <regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...>, implicit $rsp, implicit $ssp, implicit $rsp, implicit $ssp, implicit undef $rdi, implicit undef $rsi, implicit undef $rdx, debug-location !38; test.c:16:10

# End machine code for function main.

and there is no instruction that can help us with describing the call value in
terms of DWARF, at the end of pipeline (so with the current approach, we can't
describe it).
Quuxplusone commented 4 years ago
Djordje, is it possible that your patch that fills the DW_AT_location of unused
DW_TAG_formal_parameter doesn't work for inlined functions?

Example (f1 contains multiple statements so we can step at least 2x inside of
it):

int x = 12;

int f1(int i) {
  int y = x;
  y = y*y;
  return y+1;
}

int main(void) {
  return f1(23)+1;
}

DWARF:
0x00000053:   DW_TAG_subprogram
                DW_AT_name  ("f1")
                DW_AT_inline    (DW_INL_inlined)

0x0000005d:     DW_TAG_formal_parameter
                  DW_AT_name    ("i")
Quuxplusone commented 4 years ago
(In reply to Tobias Bosch from comment #35)
> Djordje, is it possible that your patch that fills the DW_AT_location of
> unused DW_TAG_formal_parameter doesn't work for inlined functions?
>
> Example (f1 contains multiple statements so we can step at least 2x inside
> of it):
>
4> int x = 12;
>
> int f1(int i) {
>   int y = x;
>   y = y*y;
>   return y+1;
> }
>
> int main(void) {
>   return f1(23)+1;
> }
>
> DWARF:
> 0x00000053:   DW_TAG_subprogram
>                 DW_AT_name    ("f1")
>                 DW_AT_inline  (DW_INL_inlined)
>
> 0x0000005d:     DW_TAG_formal_parameter
>                   DW_AT_name  ("i")

inlined variables are quite different to real parameters - in this case the
inlined_subroutine's formal_parameter should have a location with the constant
value 23 (. No need for call_site_value/entry_value stuff.

For the non-inlined copy of 'f1'. you might not be looking at the right DIEs

You've quoted the abstract subprogram (this contains the common attributes used
across all inline and non-inline copies of the function (this distinction only
exists for functions that have at least one inlining)) - it has no locations
for the function (high/low/ranges) or its variables.

If you look for the subprogram that refers to the one you've quoted (0x53) as
its abstract_origin, you'll find something like this:

0x00000046:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000000000)
                DW_AT_high_pc   (0x000000000000000d)
                DW_AT_frame_base        (DW_OP_reg7 RSP)
                DW_AT_GNU_all_call_sites        (true)
                DW_AT_abstract_origin   (0x00000068 "_Z2f1i")

0x00000059:     DW_TAG_formal_parameter
                  DW_AT_abstract_origin (0x00000078 "i")

0x0000005e:     DW_TAG_variable
                  DW_AT_location        (0x00000000:
                     [0x0000000000000006, 0x000000000000000c): DW_OP_reg0 RAX)
                  DW_AT_abstract_origin (0x00000083 "y")

(this is without Djordje's patch applied - perhaps with that patch applied the
formal_parameter DIE here does have a location using entry_value?)
Quuxplusone commented 4 years ago
Ah, thanks.

With Djordje's patch the DIEs looks like this:

0x00000036:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000401110)
                DW_AT_high_pc   (0x000000000040111d)
                DW_AT_abstract_origin   (0x00000053 "_Z2f1i")

0x00000042:     DW_TAG_formal_parameter
                  DW_AT_location    (DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)
                  DW_AT_abstract_origin (0x0000005d "i")

0x0000004c:     DW_TAG_variable
                  DW_AT_location    (indexed (0x0) loclist = 0x00000014:
                     [0x0000000000401116, 0x000000000040111c): DW_OP_reg0 RAX)
                  DW_AT_abstract_origin (0x00000065 "y")

0x00000052:     NULL

0x00000053:   DW_TAG_subprogram
                DW_AT_linkage_name  ("_Z2f1i")
                DW_AT_name  ("f1")
                DW_AT_inline    (DW_INL_inlined)

0x0000005d:     DW_TAG_formal_parameter
                  DW_AT_name    ("i")

0x00000065:     DW_TAG_variable
                  DW_AT_name    ("y")

0x0000006e:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000401120)
                DW_AT_high_pc   (0x000000000040112d)
                DW_AT_name  ("main")

0x0000007d:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin (0x00000053 "_Z2f1i")
                  DW_AT_low_pc  (0x0000000000401120)
                  DW_AT_high_pc (0x0000000000401129)

0x0000008a:       DW_TAG_formal_parameter
                    DW_AT_abstract_origin   (0x0000005d "i")

So the DW_TAG_formal_parameter seems fine even with inlining, but there is no
DW_TAG_call_site_parameter. I.e. debugging looks like this when breaking on
main:

(lldb) r
Process 2211681 launched: '/usr/local/google/home/tbosch/llvm-
project/build/a.out' (x86_64)
Process 2211681 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000000000401120 a.out`main [inlined] f1(i=<unavailable>) at test2.cc:4:11
   1    int x = 12;
   2
   3    int f1(int i) {
-> 4      int y = x;
   5      y = y*y;
   6      return y+1;
   7    }
Quuxplusone commented 4 years ago
(In reply to Tobias Bosch from comment #37)
> Ah, thanks.
>
> With Djordje's patch the DIEs looks like this:
>
> 0x00000036:   DW_TAG_subprogram
>                 DW_AT_low_pc  (0x0000000000401110)
>                 DW_AT_high_pc (0x000000000040111d)
>                 DW_AT_abstract_origin (0x00000053 "_Z2f1i")
>
> 0x00000042:     DW_TAG_formal_parameter
>                   DW_AT_location  (DW_OP_entry_value(DW_OP_reg5 RDI),
> DW_OP_stack_value)
>                   DW_AT_abstract_origin   (0x0000005d "i")
>
> 0x0000004c:     DW_TAG_variable
>                   DW_AT_location  (indexed (0x0) loclist = 0x00000014:
>                      [0x0000000000401116, 0x000000000040111c): DW_OP_reg0
> RAX)
>                   DW_AT_abstract_origin   (0x00000065 "y")
>
> 0x00000052:     NULL
>
> 0x00000053:   DW_TAG_subprogram
>                 DW_AT_linkage_name    ("_Z2f1i")
>                 DW_AT_name    ("f1")
>                 DW_AT_inline  (DW_INL_inlined)
>
> 0x0000005d:     DW_TAG_formal_parameter
>                   DW_AT_name  ("i")
>
> 0x00000065:     DW_TAG_variable
>                   DW_AT_name  ("y")
>
> 0x0000006e:   DW_TAG_subprogram
>                 DW_AT_low_pc  (0x0000000000401120)
>                 DW_AT_high_pc (0x000000000040112d)
>                 DW_AT_name    ("main")
>
> 0x0000007d:     DW_TAG_inlined_subroutine
>                   DW_AT_abstract_origin   (0x00000053 "_Z2f1i")
>                   DW_AT_low_pc    (0x0000000000401120)
>                   DW_AT_high_pc   (0x0000000000401129)
>
> 0x0000008a:       DW_TAG_formal_parameter
>                     DW_AT_abstract_origin (0x0000005d "i")
>
>
> So the DW_TAG_formal_parameter seems fine even with inlining, but there is
> no DW_TAG_call_site_parameter.

Yeah, there shouldn't be a call_site in this case - because the call has been
inlined. The DW_AT_location on the the 0x8a DW_TAG_formal_parameter should have
a DW_AT_location with the literal 23.

If the Dead Argument Elimination happens before inlining - then the only place
the 23 would remain alive would be in is whatever the IR Metadata is for the
call site - and would have to be transformed into the inlined parameter's
location (assuming the location isn't clobbered) - essentially collapsing the
entry_value + call_site_parameter down into a location. (inlining the parameter
value, essentially)

> I.e. debugging looks like this when breaking
> on main:
>
> (lldb) r
> Process 2211681 launched:
> '/usr/local/google/home/tbosch/llvm-project/build/a.out' (x86_64)
> Process 2211681 stopped
> * thread #1, name = 'a.out', stop reason = breakpoint 1.1
>     frame #0: 0x0000000000401120 a.out`main [inlined] f1(i=<unavailable>) at
> test2.cc:4:11
>    1      int x = 12;
>    2
>    3      int f1(int i) {
> -> 4        int y = x;
>    5        y = y*y;
>    6        return y+1;
>    7      }
Quuxplusone commented 4 years ago

The call site info (and entry values as well, since it depends on it) should be avoided for inlined calls. We should support only normal, tail and indirect calls (according to DWARF standard).