Quuxplusone / LLVMBugzillaTest

Wrong debug info for step at -O1 #45010

Open Quuxplusone opened 4 years ago

Quuxplusone commented 4 years ago
Bugzilla Link PR46040
Status NEW
Importance P enhancement
Reported by Yibiao Yang (yangyibiao@nju.edu.cn)
Reported on 2020-05-22 07:56:30 -0700
Last modified on 2020-06-01 07:21:10 -0700
Version trunk
Hardware PC Linux
CC dblaikie@gmail.com, jdevlieghere@apple.com, keith.walker@arm.com, llvm-bugs@lists.llvm.org, orlando.hyams@sony.com, paul_robinson@playstation.sony.com
Fixed by commit(s)
Attachments a.out (18024 bytes, application/x-executable)
Blocks
Blocked by
See also
Created attachment 23522
the binary

$ clang --version
clang version 11.0.0 (/home/yibiao/.cache/yay/llvm-git/llvm-project
871beba234a83a2a02da9dedbd59b91a1bfbd7af)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ lldb --version
lldb version 11.0.0
  clang revision 871beba234a83a2a02da9dedbd59b91a1bfbd7af
  llvm revision 871beba234a83a2a02da9dedbd59b91a1bfbd7af

$ clang -O1 -g small.c

$ lldb a.out
(lldb) target create "a.out"
Current executable set to '/home/yibiao/Debugger/a.out' (x86_64).
(lldb) b main
Breakpoint 1: where = a.out`main + 1 at small.c:19:3, address =
0x00000000004011d1
(lldb) r
Process 35712 launched: '/home/yibiao/Debugger/a.out' (x86_64)
Process 35712 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x00000000004011d1 a.out`main at small.c:19:3
   16   }
   17
   18   int main() {
-> 19     f(1);
   20   }
   21
(lldb) step
Process 35712 stopped
* thread #1, name = 'a.out', stop reason = step in
    frame #0: 0x0000000000401170 a.out`f(n=<unavailable>) at small.c:8:18
   5      char *end;
   6      int i;
   7
-> 8      for(i=0; i<2; i++) {
   9        va_start(ap, n);
   10       while (1) {
   11         end = va_arg(ap, char *);
(lldb) step
Process 35712 stopped
* thread #1, name = 'a.out', stop reason = step in
    frame #0: 0x0000000000401179 a.out`f(n=<unavailable>) at small.c:9:5
   6      int i;
   7
   8      for(i=0; i<2; i++) {
-> 9        va_start(ap, n);
   10       while (1) {
   11         end = va_arg(ap, char *);
   12         if (!end) break;
(lldb) step
Process 35712 stopped
* thread #1, name = 'a.out', stop reason = step in
    frame #0: 0x00000000004011a4 a.out`f(n=<unavailable>) at small.c:11:13
   8      for(i=0; i<2; i++) {
   9        va_start(ap, n);
   10       while (1) {
-> 11         end = va_arg(ap, char *);
   12         if (!end) break;
   13       }
   14       va_end(ap);
(lldb) fr var i
(int) i = 1
(lldb)

/************************************
As shown, when using "step" to reach line 11 (first hit), the value of "i" is 1,
whereas when using step-i ("si") to reach line 11 (first hit), the value of "i"
is 0, which is as expected.
************************************/

$ clang -O1 -g small.c;lldb a.out
(lldb) target create "a.out"
Current executable set to '/home/yibiao/Debugger/a.out' (x86_64).
(lldb) b main
Breakpoint 1: where = a.out`main + 1 at small.c:19:3, address =
0x00000000004011d1
(lldb) r
Process 35938 launched: '/home/yibiao/Debugger/a.out' (x86_64)
Process 35938 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x00000000004011d1 a.out`main at small.c:19:3
   16   }
   17
   18   int main() {
-> 19     f(1);
   20   }
   21
(lldb) si -c 19
Process 35938 stopped
* thread #1, name = 'a.out', stop reason = instruction step into
    frame #0: 0x00000000004011a4 a.out`f(n=<unavailable>) at small.c:11:13
   8      for(i=0; i<2; i++) {
   9        va_start(ap, n);
   10       while (1) {
-> 11         end = va_arg(ap, char *);
   12         if (!end) break;
   13       }
   14       va_end(ap);
(lldb) fr var i
(int) i = 0
(lldb)
Quuxplusone commented 4 years ago

Attached a.out (18024 bytes, application/x-executable): the binary

Quuxplusone commented 4 years ago
Sorry. Forgot to attach the code.

$ cat small.c
#include <stdarg.h>

void f(int n, ...){
  va_list ap;
  char *end;
  int i;

  for(i=0; i<2; i++) {
    va_start(ap, n);
    while (1) {
      end = va_arg(ap, char *);
      if(!end) break;
    }
    va_end(ap);
  }
}

int main() {
  f(1);
}
Quuxplusone commented 4 years ago
Nice find! I get the feeling that the variable locations for 'i' are correct,
but the line table is messed up. It looks like the prologue_end flag is on a
misleading line, and we may have a misleading line number for the first
instruction in the final for loop block.

$ cat -n test.c
 1  #include <stdarg.h>
 2
 3  void f(int n, ...){
 4    va_list ap;
 5    char *end;
 6    int i;
 7
 8    for(i=0; i<2; i++) {
 9      va_start(ap, n);
10      while (1) {
11        end = va_arg(ap, char *);
12        if(!end) break;
13      }
14      va_end(ap);
15    }
16  }
17
18  int main() {
19    f(1);
20  }

Using clang from the 29th May 2020.
$ clang --version
clang version 11.0.0 (92063228f85bfe22a6dfe20bf01c99ffe6ff3130)
Target: x86_64-unknown-linux-gnu

$ clang test.c -O1 -g
$ llvm-dwarfdump a.out -name i
0x00000059: DW_TAG_variable
              DW_AT_location    (0x00000000:
                 [0x00000000004004ca, 0x00000000004004e0): DW_OP_consts +0, DW_OP_stack_value
                 [0x00000000004004e0, 0x00000000004004e3): DW_OP_reg2 RCX
                 [0x00000000004004e3, 0x00000000004004e7): DW_OP_reg1 RDX
                 [0x00000000004004e7, 0x0000000000400539): DW_OP_reg2 RCX)
              DW_AT_name    ("i")

$ llvm-dwarfdump --debug-line a.out
Address            Line   Column File   ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x0000000000400480      3      0      1   0             0  is_stmt
0x00000000004004e0      8     18      1   0             0  is_stmt prologue_end
0x00000000004004e3      8     13      1   0             0
0x00000000004004e7      8      3      1   0             0
0x00000000004004e9      9      5      1   0             0  is_stmt
...

Using gdb, if you step with si (step to next machine instruction) into 'f' and
keep going until you hit a line which is part of the for loop, you'll hit the
following instruction.

*------------------------*---------------------------------------------------------*------------------------------------*
| line table             | disassembly                                             | location for "i" (+ current value) |
*------------------------*---------------------------------------------------------*------------------------------------*
| ...                    |  ...                                                    | undef                              |
| 9 is_stmt              |  0x4004e9 <f+105>   mov    QWORD PTR [rsp-0x70],rax     | RCX (0)                            |
*------------------------*---------------------------------------------------------*------------------------------------*

Then continuing round the loop, variable 'i' eventually increments as you'd
expect.

If you instead step into 'f' with step (step to next source line), you start at
the end of the prologue, according to the line table.

*------------------------*---------------------------------------------------------*------------------------------------*
| line table             | disassembly                                             | location for "i" (+ current value) |
*------------------------*---------------------------------------------------------*------------------------------------*
| 8 is_stmt prologue_end |  0x4004e0 <f+96>    lea    edx,[rcx+0x1]                | RCX (0)                            |
| 8                      |  0x4004e3 <f+99>    test   ecx,ecx                      | RCX (0)                            |
| 8                      |  0x4004e5 <f+101>   mov    ecx,edx                      | RCX (1)                            |
| 8                      |  0x4004e7 <f+103>   jne    0x400534 <f+180>             | RCX (1)                            |
*==== step =============================================================================================================*
| 9                      |  0x4004e9 <f+105>   mov    QWORD PTR [rsp-0x70],rax     | RCX (1)                            |
*------------------------*---------------------------------------------------------*------------------------------------*

From my initial look, I think there are two problems at play:

1) Looking at the source, you'd expect line 8 to be roughly where the prologue
ends. However, AFAICT the instruction at 0x4004e0 comes from the final block of
the outer loop. This means we essentially skip the first iteration of the loop
when stepping through with 'step'.

2) After the MIR pass "Branch Probability Basic Block Placement"
(-block-placement), the final for loop block is moved to near the top of the
function. Before this block there are 3 others including entry. None of the
instructions in those other blocks have a DebugLoc, so the first line number we
encounter comes from the final while block. I don't know how the prologue_end is
calculated but this set of circumstances looks suspicious.
Quuxplusone commented 4 years ago
(In reply to Orlando Cazalet-Hyams from comment #2)
> 2) After the MIR pass "Branch Probability Basic Block Placement"
> (-block-placement), the final for loop block is moved to near the top of the
> function. Before this block there are 3 others including entry. None of the
> instructions in those other blocks have a DebugLoc, so the first line number
> we encounter comes from the final while block. I don't know how the
> prologue_end is calculated but this set of circumstances looks suspicious.

prologue_end is the first real instruction that is not marked as FrameSetup
and also has a DebugLoc.  If there are instructions that are incorrectly
missing a DebugLoc, fixing that should fix prologue_end placement.
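
The rule described above can be sketched roughly as follows. This is only an illustrative model, not LLVM's actual code: the `Instr` class, the `find_prologue_end` helper, and the example layout (including which addresses are FrameSetup) are assumptions made up for this sketch, and only "real" instructions are modeled.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Instr:
    address: int
    frame_setup: bool      # models the FrameSetup flag (assumed values below)
    line: Optional[int]    # models a DebugLoc; None means "no DebugLoc"


def find_prologue_end(instrs: Sequence[Instr]) -> Optional[Instr]:
    """Return the first instruction, in layout order, that is not
    FrameSetup and carries a DebugLoc."""
    for instr in instrs:
        if not instr.frame_setup and instr.line is not None:
            return instr
    return None


# Hypothetical layout loosely modeled on the report: the code placed before
# 0x4004e0 has no DebugLoc, so the first candidate is the loop code on line 8.
layout = [
    Instr(0x400480, frame_setup=True, line=3),
    Instr(0x4004CA, frame_setup=False, line=None),
    Instr(0x4004E0, frame_setup=False, line=8),
    Instr(0x4004E9, frame_setup=False, line=9),
]

end = find_prologue_end(layout)
print(hex(end.address), end.line)
```

Under these assumptions the flag lands on 0x4004e0, which matches the line table in the report. If the instruction at 0x4004ca carried a DebugLoc, it would be picked instead.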
Quuxplusone commented 4 years ago
Please ignore comment #2, Bugzilla formatted my reply in unexpected ways and I
don't seem to be able to edit or delete it. Here it is again:

Nice find! I get the feeling that the variable locations for 'i' are correct,
but the line table is messed up. It looks like the prologue_end flag is on a
misleading line, and we may have a misleading line number for the first
instruction in the final for loop block.

$ cat -n test.c
 1  #include <stdarg.h>
 2
 3  void f(int n, ...){
 4    va_list ap;
 5    char *end;
 6    int i;
 7
 8    for(i=0; i<2; i++) {
 9      va_start(ap, n);
10      while (1) {
11        end = va_arg(ap, char *);
12        if(!end) break;
13      }
14      va_end(ap);
15    }
16  }
17
18  int main() {
19    f(1);
20  }

Using clang from the 29th May 2020.
$ clang --version
clang version 11.0.0 (92063228f85bfe22a6dfe20bf01c99ffe6ff3130)
Target: x86_64-unknown-linux-gnu

$ clang test.c -O1 -g
$ llvm-dwarfdump a.out -name i
0x00000059: DW_TAG_variable
              DW_AT_location    (0x00000000:
                 [0x00000000004004ca, 0x00000000004004e0): DW_OP_consts +0, DW_OP_stack_value
                 [0x00000000004004e0, 0x00000000004004e3): DW_OP_reg2 RCX
                 [0x00000000004004e3, 0x00000000004004e7): DW_OP_reg1 RDX
                 [0x00000000004004e7, 0x0000000000400539): DW_OP_reg2 RCX)
              DW_AT_name    ("i")

$ llvm-dwarfdump --debug-line a.out
Address            Line   Column File   ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x0000000000400480      3      0      1   0             0  is_stmt
0x00000000004004e0      8     18      1   0             0  is_stmt prologue_end
0x00000000004004e3      8     13      1   0             0
0x00000000004004e7      8      3      1   0             0
0x00000000004004e9      9      5      1   0             0  is_stmt
...
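
To cross-check the two dumps by hand, a small lookup like the one below may help. It is only a sketch: the tables are copied from the output above (the line table is truncated at the "...", so lookups past 0x4004e9 fall back to the last row shown), and the helper names `location_at` and `line_row_at` are invented for this example.

```python
import bisect

# Location list for "i", copied from the llvm-dwarfdump output above.
# Each entry covers the half-open address range [low, high).
LOCLIST_I = [
    (0x4004CA, 0x4004E0, "DW_OP_consts +0, DW_OP_stack_value"),
    (0x4004E0, 0x4004E3, "DW_OP_reg2 RCX"),
    (0x4004E3, 0x4004E7, "DW_OP_reg1 RDX"),
    (0x4004E7, 0x400539, "DW_OP_reg2 RCX"),
]

# Line table rows shown above: (address, line, column, is_stmt).
LINE_TABLE = [
    (0x400480, 3, 0, True),
    (0x4004E0, 8, 18, True),
    (0x4004E3, 8, 13, False),
    (0x4004E7, 8, 3, False),
    (0x4004E9, 9, 5, True),
]


def location_at(pc):
    """Return the location expression covering pc, or None if "i" has no
    location there."""
    for low, high, expr in LOCLIST_I:
        if low <= pc < high:
            return expr
    return None


def line_row_at(pc):
    """Return the last line table row whose address is <= pc, or None."""
    addresses = [row[0] for row in LINE_TABLE]
    index = bisect.bisect_right(addresses, pc) - 1
    return LINE_TABLE[index] if index >= 0 else None


for pc in (0x4004E0, 0x4004E9):
    print(hex(pc), line_row_at(pc)[1], location_at(pc))
```
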

Using gdb, if you step with si (step to next machine instruction) into 'f' and
keep going until you hit a line which is part of the for loop, you'll hit the
following instruction.
*------------------------*----------------------------------------*---------*
| line table             | disassembly                            | "i"     |
*------------------------*----------------------------------------*---------*
| ...                    |  ...                                   | undef   |
| 9 is_stmt              | 0x4004e9 mov  QWORD PTR [rsp-0x70],rax | RCX (0) |
*------------------------*----------------------------------------*---------*

Then continuing round the loop, variable 'i' eventually increments as you'd
expect.

If you instead step into 'f' with step (step to next source line), you start at
the end of the prologue, according to the line table.

*------------------------*----------------------------------------*---------*
| line table             | disassembly                            | "i"     |
*------------------------*----------------------------------------*---------*
| 8 is_stmt prologue_end | 0x4004e0 lea  edx,[rcx+0x1]            | RCX (0) |
| 8                      | 0x4004e3 test ecx,ecx                  | RCX (0) |
| 8                      | 0x4004e5 mov  ecx,edx                  | RCX (1) |
| 8                      | 0x4004e7 jne  0x400534 <f+180>         | RCX (1) |
===== step ==================================================================
| 9 is_stmt              | 0x4004e9 mov  QWORD PTR [rsp-0x70],rax | RCX (1) |
*------------------------*----------------------------------------*---------*
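
The register values in the table above can be reproduced with a tiny model of the four line-8 instructions. This is a sketch under stated assumptions: `run_line8_block` is a made-up helper, only the registers and the zero flag used by these instructions are modeled, and reading the taken `jne` as the loop exit is my interpretation of the disassembly.

```python
MASK32 = 0xFFFFFFFF


def run_line8_block(rcx):
    """Model the instructions at 0x4004e0..0x4004e7.

    Returns (rcx after the block, whether `jne 0x400534` is taken).
    """
    edx = (rcx + 1) & MASK32            # lea  edx,[rcx+0x1]
    zero_flag = (rcx & MASK32) == 0     # test ecx,ecx
    rcx = edx                           # mov  ecx,edx
    taken = not zero_flag               # jne  0x400534 <f+180>
    return rcx, taken


# Stopping at prologue_end (0x4004e0) with i == 0 and then using `step`
# runs the whole block before the next is_stmt row at 0x4004e9.
rcx, taken = run_line8_block(0)
print(rcx, taken)
```

With `rcx` starting at 0 the branch is not taken and execution falls through to 0x4004e9 with `rcx` already holding 1, which is the value lldb and gdb report for `i` after `step`.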

From my initial look, I think there are two problems at play:

1) Looking at the source, you'd expect line 8 to be roughly where the prologue
ends. However, AFAICT the instruction at 0x4004e0 comes from the final block of
the outer loop. This means we essentially skip the first iteration of the loop
when stepping through with 'step'.

2) After the MIR pass "Branch Probability Basic Block Placement"
(-block-placement), the final for loop block is moved to near the top of the
function. Before this block there are 3 others including entry. None of the
instructions in those other blocks have a DebugLoc, so the first line number we
encounter comes from the final while block. I don't know how the prologue_end is
calculated but this set of circumstances looks suspicious.

Adding in Paul's reply so it is not hidden by this re-post.
(In reply to Paul Robinson from comment #3)
> (In reply to Orlando Cazalet-Hyams from comment #2)
> > 2) After the MIR pass "Branch Probability Basic Block Placement"
> > (-block-placement), the final for loop block is moved to near the top of
> > the function. Before this block there are 3 others including entry. None
> > of the instructions in those other blocks have a DebugLoc, so the first
> > line number we encounter comes from the final while block. I don't know
> > how the prologue_end is calculated but this set of circumstances looks
> > suspicious.
>
> prologue_end is the first real instruction that is not marked as FrameSetup
> and also has a DebugLoc.  If there are instructions that are incorrectly
> missing a DebugLoc, fixing that should fix prologue_end placement.