lldb wrongly stopped at a return statement within a for loop

llvmbot commented 4 years ago


Bugzilla Link	45603
Version	9.0
OS	Linux
Reporter	LLVM Bugzilla Contributor
CC	@dwblaikie,@JDevlieghere,@jimingham,@labath

Extended Description

$ lldb -v lldb version 9.0.1

$ cat small.c int main() { int j = 0, d = 0; if (j>=0) for (; d; d++) return 0; return 0; }

$ clang -g small.c; lldb ./a.out (lldb) target create "./a.out" Current executable set to './a.out' (x86_64). (lldb) b small.c:5 Breakpoint 1: where = a.out`main + 35 at small.c:5:5, address = 0x0000000000001143 (lldb) run Process 877280 launched: '/home/yibiao/cv-gcov/a.out' (x86_64) Process 877280 stopped

thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x0000555555555143 a.out`main at small.c:5:5 2 { 3 int j = 0, d = 0; 4 if (j>=0) -> 5 for (; d; d++) 6 return 0; 7 return 0; 8 } (lldb) fr v (int) j = 0 (int) d = 0 (lldb) step Process 877280 stopped
thread #1, name = 'a.out', stop reason = step in frame #0: 0x000055555555515e a.out`main at small.c:6:14 3 int j = 0, d = 0; 4 if (j>=0) 5 for (; d; d++) -> 6 return 0; 7 return 0; 8 } (lldb) step Process 877280 stopped
thread #1, name = 'a.out', stop reason = step in frame #0: 0x0000555555555163 a.out`main at small.c:7:3 4 if (j>=0) 5 for (; d; d++) 6 return 0; -> 7 return 0; 8 }

lldb is wrongly stopped at line 6. Before that, d is 0. Thus, the 6th statement should be not executed.

jimingham commented 4 years ago

A lot of the "don't lie to users" uses for line 0 would be obviated if the line tables didn't demand a one to one mapping from address to line number. If you can't figure out whether a common subexpression was triggered by executing line 10, 20 or 30, it is better to say 0 to keep from asserting something you don't know. But it would be even better if we could say "either 10, 20 or 30".

Sadly, that's not how DWARF line tables work...

In the direct case you cite, however, I'm not sure saying line 0 is better than giving the real line number. For instance, I might very well want to put a breakpoint on the assignment that you've hoisted out of the if loop. By giving it line 0 you make that impossible. I'm not sure the help you would give by not making people deal with the fact that the compiler decided to run that initialization regardless of whether the surrounding test was true or not is worth the downside of preventing the debugger from meaningfully breaking on it.

But I think we're getting a little far afield...

dwblaikie commented 4 years ago

Yeah, I couldn't describe Clang's is_stmt guarantees off-hand, which is probably enough evidence it's not a /great/ platform right now. I thought LLVM generally put is_stmt whenever switching lines, but evidently not, given this case. The IR itself doesn't carry any stmt flag on it, so the is_stmt flag is purely constructed in the backend during DWARF generation, not based on some higher semantic notion the front end might have.

(though even in the presence of is_stmt, we'd still have a lot of line 0 - for profile fidelity, for instance & generally not "lying" to the user (even in the absence of is_stmt, those non-stmt locations show up in backtraces, etc, and you wouldn't want to tell the user you're at a line 5 call to their function when the code never (logically) reached line 5 at all because the condition is false - just because you hoisted or commoned the call out of the if condition))

jimingham commented 4 years ago

As I understand it is_stmt is there to make stepping and breakpoint setting more natural, so it's proof is maybe more in the eating than any formal specification...

What lldb would like to do with is_stmt is:

1) only set breakpoints on line entries marked with is_stmt=1 2) only step from line entry to line entry so marked, stepping transparently over other line entries (even from different lines) when is_stmt = 0.

That would allow a lot fuller fidelity in the line table without having to resort to tricks like marking things line number 0 so we know not to show them to the debugger user, and make the stepping seem more natural. This would be particularly useful in optimized code.

But my impression was that is_stmt is not in general set with the rigor required for the debugger to actually rely on it. And IMO it is much better to stop too often than to pass over code you wanted to take a look at. So I've always worked the stepping algorithms pretty conservatively.

If clang is taking is_stmt more seriously than my probably outdated general impression, it might be worth adding an "obey is_stmt" mode to breakpoint and stepping. That wouldn't be super hard.

dwblaikie commented 4 years ago

Dave's comments match with what the line table says as we are stepping through the program. Execution definitely passes through line 7 on the way out of the function.

I didn't see a component for debug info generation. Are we using lldb or clang for that? This isn't a problem in lldb, it's just telling you what the line table tells us.

Arguably it's something lldb could improve - by not stopping at places that aren't marked "is_stmt" but that feature's sufficiently vague I wouldn't push especially strongly there.

I'd probably put a bug like this under the "clang" component for the clang side of the fix here. The other bug I was thinking of that Chandler filed I did find eventually - and has a more in depth discussion about the complications of where to assign instructions related to loop constructs: llvm/llvm-project#20238

jimingham commented 4 years ago

Dave's comments match with what the line table says as we are stepping through the program. Execution definitely passes through line 7 on the way out of the function.

I didn't see a component for debug info generation. Are we using lldb or clang for that? This isn't a problem in lldb, it's just telling you what the line table tells us.

dwblaikie commented 4 years ago

Hmm, I remember a bug @chandlerc filing ages ago about something like this, but I couldn't find it internally or externally.

GDB doesn't hit this because the location has is_stmt false.

But the general issue is that the tail of the loop (where the header jumps to when it's time to exit the loop) is attributed to the end of the loop body - if there were braces on the loop, that would be the "}", but since there aren't, it's the ";". That's confusing/not ideal & maybe we should just attribute it to the loop header in that case.

llvmbot commented 4 months ago

@llvm/issue-subscribers-debuginfo

Author: None (llvmbot)

| | | | --- | --- | | Bugzilla Link | [45603](https://llvm.org/bz45603) | | Version | 9.0 | | OS | Linux | | Reporter | LLVM Bugzilla Contributor | | CC | @dwblaikie,@JDevlieghere,@jimingham,@labath | ## Extended Description $ lldb -v lldb version 9.0.1 $ cat small.c int main() { int j = 0, d = 0; if (j>=0) for (; d; d++) return 0; return 0; } $ clang -g small.c; lldb ./a.out (lldb) target create "./a.out" Current executable set to './a.out' (x86_64). (lldb) b small.c:5 Breakpoint 1: where = a.out`main + 35 at small.c:5:5, address = 0x0000000000001143 (lldb) run Process 877280 launched: '/home/yibiao/cv-gcov/a.out' (x86_64) Process 877280 stopped * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x0000555555555143 a.out`main at small.c:5:5 2 { 3 int j = 0, d = 0; 4 if (j>=0) -> 5 for (; d; d++) 6 return 0; 7 return 0; 8 } (lldb) fr v (int) j = 0 (int) d = 0 (lldb) step Process 877280 stopped * thread #1, name = 'a.out', stop reason = step in frame #0: 0x000055555555515e a.out`main at small.c:6:14 3 int j = 0, d = 0; 4 if (j>=0) 5 for (; d; d++) -> 6 return 0; 7 return 0; 8 } (lldb) step Process 877280 stopped * thread #1, name = 'a.out', stop reason = step in frame #0: 0x0000555555555163 a.out`main at small.c:7:3 4 if (j>=0) 5 for (; d; d++) 6 return 0; -> 7 return 0; 8 } ***** lldb is wrongly stopped at line 6. Before that, d is 0. Thus, the 6th statement should be not executed.

llvm / llvm-project

lldb wrongly stopped at a return statement within a for loop #44948

Extended Description