llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.96k stars 11.94k forks source link

Wrong Debug Information for Oz. Stepping on dead code. #45279

Open 31fbadee-9e60-4c99-b908-8380e6a40c85 opened 4 years ago

31fbadee-9e60-4c99-b908-8380e6a40c85 commented 4 years ago
Bugzilla Link 45934
Version trunk
OS Linux
CC @adrian-prantl,@dwblaikie,@DougGregor,@JDevlieghere,@jmorse,@pogo59,@pogo59,@zygoloid

Extended Description

The debugger shows that line 2 is executed while it shouldn't be.

$ cat a.c int add(int ui1, int ui2) { return ui1 + ui2; } int g_8 ; int a = 4 ; int c ; int main() { for (g_8 = 0; (g_8 < 49); g_8 = add(g_8, 9)) { for (; c ; c++) ; if (a) break; } }

$ clang -v clang version 11.0.0 (https://github.com/llvm/llvm-project.git c25b20c0f6c13d68dbc2e185764082d61ae4a132) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/local/bin Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8 Found candidate GCC installation: /usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/11.0.0 Selected GCC installation: /usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/11.0.0 Candidate multilib: .;@m64 Selected multilib: .;@m64

$ lldb -v lldb version 11.0.0 clang revision c25b20c0f6c13d68dbc2e185764082d61ae4a132 llvm revision c25b20c0f6c13d68dbc2e185764082d61ae4a132

$ gdb -v GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git

$ clang -Oz -g -o opt a.c

$ lldb opt (lldb) target create "opt" Current executable set to 'opt' (x86_64). (lldb) b -l 2 Breakpoint 1: 2 locations. (lldb) r Process 145 launched: 'opt' (x86_64) Process 145 stopped

$ gdb opt (gdb) b -l 2 Breakpoint 1 at 0x400474: -source a.c -line 2. (2 locations) (gdb) r Starting program: opt

Breakpoint 1, main () at a.c:9 9 for (g_8 = 0; (g_8 < 49); g_8 = add(g_8, 9)) { (gdb) s add (ui1=0, ui2=9) at a.c:2 2 return ui1 + ui2; (gdb)

pogo59 commented 4 years ago

I think that's just a quirk/bug in the LLVM integrated assembler - I don't think of this as a violation of the DWARF spec, but as a degenerate output. It describes an empty range, (perhaps contradictory, but if you think of the range [N, N) as being empty, which C++ does, and I think mathematics in general does (how else would you describe an empty range using a half-open range?)) rather than describing two locations for the same address.

It's not a violation of the letter of the spec, but it clearly violates the intent. As such I doubt debuggers would proactively look for these cases and do something intelligent with them. I don't think it's something we can really condone as a useful bit of output.

But that's a side point, the primary question in this bug is about the mysterious stopping on line 2. My line table didn't show any line-2 instructions inlined into the body of main, although maybe others do.

dwblaikie commented 4 years ago

Address Line Column File ISA Discriminator Flags


0x40047c 8 0 1 0 0 is_stmt 0x40047e 0 0 1 0 0 is_stmt prologue_end 0x400487 9 22 1 0 0 is_stmt 0x40048a 9 3 1 0 0

Huh. In mine I see

0x40047c 8 0 1 0 0 is_stmt 0x40047c 0 0 1 0 0 is_stmt prologue_end

that is, two entries for the same instruction. Which seems wrong-er than just having prologue_end on a line-0 entry; the DWARF line table is modeled on having one row per instruction, so this effect violated the intent.

I think that's just a quirk/bug in the LLVM integrated assembler - I don't think of this as a violation of the DWARF spec, but as a degenerate output. It describes an empty range, (perhaps contradictory, but if you think of the range [N, N) as being empty, which C++ does, and I think mathematics in general does (how else would you describe an empty range using a half-open range?)) rather than describing two locations for the same address.

jmorse commented 4 years ago

Does this vary by target, maybe? I'm doing my test on Ubuntu, and apparently without -ffunction-sections.

Most curious; I'm using Ubuntu 18.04 and don't see that, clang as specified and GNU LD 2.30. I do see it for the preceeding function however, here's a less truncated -debug-line output:

0x400478 2 0 1 0 0 is_stmt 0x400478 3 14 1 0 0 is_stmt prologue_end 0x40047b 3 3 1 0 0
0x40047c 9 0 1 0 0 is_stmt 0x40047e 0 0 1 0 0 is_stmt prologue_end 0x400487 10 22 1 0 0 is_stmt

Where the first four bytes are the "add" function, and anything from 7c onwards is "main".

This sounds like Toms stripped-prologue situation: https://reviews.llvm.org/D49426

pogo59 commented 4 years ago

Address Line Column File ISA Discriminator Flags


0x40047c 8 0 1 0 0 is_stmt 0x40047e 0 0 1 0 0 is_stmt prologue_end 0x400487 9 22 1 0 0 is_stmt 0x40048a 9 3 1 0 0

Huh. In mine I see

0x40047c 8 0 1 0 0 is_stmt 0x40047c 0 0 1 0 0 is_stmt prologue_end

that is, two entries for the same instruction. Which seems wrong-er than just having prologue_end on a line-0 entry; the DWARF line table is modeled on having one row per instruction, so this effect violated the intent.

Does this vary by target, maybe? I'm doing my test on Ubuntu, and apparently without -ffunction-sections.

jmorse commented 4 years ago

Some exciting things are going on in this test -- firstly, I think the unconditional presence of line two in the trace is due to loop rotation. The loop increment statement does get unconditionally executed, but at that point it's speculative. I suppose we could delete the DebugLocs from the speculative part, but then the developer may enter the loop body without seeing the first increment.

I'd enjoy hearing other opinions on what to do about that. I think my preference is providing incomplete information to the developer, instead of what seems like incorrect information.

The second exciting thing is that, with clang/llvm e2b134b01a6b1, options "-Oz -g", and gdb, the program starts without a line number:

--------8<-------- Reading symbols from a.out...done. (gdb) start Temporary breakpoint 1 at 0x40047e Starting program: /tmp/a.out

Temporary breakpoint 1, 0x000000000040047e in main () (gdb) -------->8--------

Stepping with 's' steps out of main, without stopping in the rest of the function. Looking at the line table, I see that prologue_end is set on a line-zero entry:

Address Line Column File ISA Discriminator Flags


0x40047c 8 0 1 0 0 is_stmt 0x40047e 0 0 1 0 0 is_stmt prologue_end 0x400487 9 22 1 0 0 is_stmt 0x40048a 9 3 1 0 0

This possibly falls into the category of "something weird that gdb does", because lldb with the same file and 'break main; r' stops on 0x400487 and has a line number. (If this strikes anyone as being illegitimate dwarf, lets spin it out into a different ticket).

dwblaikie commented 4 years ago

Gah. So I'm wrong on multiple counts. Apologies for the noise.

As penance, I built the test program using clang 7ee479a7 and it produced:

000000000040047c

: 40047c: 8b 05 a6 0b 20 00 mov 0x200ba6(%rip),%eax # 601028 400482: 8b 0d ac 0b 20 00 mov 0x200bac(%rip),%ecx # 601034

400488: ff c1 inc %ecx 40048a: 83 f9 01 cmp $0x1,%ecx 40048d: 74 08 je 400497 40048f: 89 0d 9f 0b 20 00 mov %ecx,0x200b9f(%rip) # 601034 400495: eb f1 jmp 400488 400497: 85 c0 test %eax,%eax 400499: 74 e7 je 400482 40049b: 83 25 8e 0b 20 00 00 andl $0x0,0x200b8e(%rip) # 601030 <__TMC_END__> 4004a2: 31 c0 xor %eax,%eax 4004a4: c3 retq 4004a5: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 4004ac: 00 00 00 4004af: 90 nop There's no call to add() so no idea why the debugger would show a stop there. gdb doesn't repro stopping at line 2. c25b20c0 is from 11 May; my compiler is from last night. Did it get fixed in the past week?

Looks like it gets inlined https://godbolt.org/z/z-m4GH - but line 65 of the assembly does have the only line attributed to line 2 of the source - and it doesn't look like there's any accidental flow-on debug locations happening or anything.

I haven't acutally tried running it in a debugger, though.

pogo59 commented 4 years ago

Gah. So I'm wrong on multiple counts. Apologies for the noise.

As penance, I built the test program using clang 7ee479a7 and it produced:

000000000040047c

: 40047c: 8b 05 a6 0b 20 00 mov 0x200ba6(%rip),%eax # 601028 400482: 8b 0d ac 0b 20 00 mov 0x200bac(%rip),%ecx # 601034 400488: ff c1 inc %ecx 40048a: 83 f9 01 cmp $0x1,%ecx 40048d: 74 08 je 400497 <main+0x1b> 40048f: 89 0d 9f 0b 20 00 mov %ecx,0x200b9f(%rip) # 601034 400495: eb f1 jmp 400488 <main+0xc> 400497: 85 c0 test %eax,%eax 400499: 74 e7 je 400482 <main+0x6> 40049b: 83 25 8e 0b 20 00 00 andl $0x0,0x200b8e(%rip) # 601030 <__TMC_END__> 4004a2: 31 c0 xor %eax,%eax 4004a4: c3 retq
4004a5: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 4004ac: 00 00 00 4004af: 90 nop

There's no call to add() so no idea why the debugger would show a stop there. gdb doesn't repro stopping at line 2.

c25b20c0 is from 11 May; my compiler is from last night. Did it get fixed in the past week?

llvmbot commented 4 years ago

What's the evaluation order? There's no init, and I expect the condition to be evaluated before the increment.

c = 0, the condition is c, which is equivalent to c != 0, so it's immediately false and you exit the loop without executing any iterations. I'm reading this wrong?

pogo59 commented 4 years ago

How is for (; c; c++) ; fully legal? Either it's infinite, or it causes signed overflow, which is also UB no?

llvmbot commented 4 years ago

Paul, 1) No infinite loop 2) c is a global, hence guaranteed to be initialized to zero.

I would kindly ask you to not be so aggressive marking bugs as INVALID, and if you do so, at least try to provide a convincing explanation so people can avoid making the mistakes in the future (which, in this case, didn't exist). It demotivates the people who're reporting them and it's not good for an inclusive project in general.

dwblaikie commented 4 years ago
  for (; c ; c++)
    ;

'c' read without being initialized infinite loop (although that might be legal in C?)

At least in C++ globals are guaranteed to be initialized to zero. So it's neither a read of uninitialized, nor an infinite loop (to the best of my understanding). Not sure about C, but I'm /guessing/ it's OK with this.

pogo59 commented 4 years ago
  for (; c ; c++)
    ;

'c' read without being initialized infinite loop (although that might be legal in C?)

llvmbot commented 4 years ago

I'm probably really wrong here, but I passed this code through ccomp, and it comes out cleanly. What's the undefined behaviour here?

pogo59 commented 4 years ago

The debugger shows that line 2 is executed while it shouldn't be.

I expect you're saying that because of the "if (a) break;" however the inner for-loop has two forms of undefined behavior, so how the compiler chooses to handle the program is not specified. We can't make proclamations about "correct" debugger behavior in this case.

Resolving as INVALID.

By the way, it would be helpful if you use cat -n a.c which will number the lines, and it will be easier to talk about your example programs.