lldb cannot single step over an Armv8.0-a atomic sequence

DavidSpickett commented 1 year ago

Compiling the following source:

#include <atomic>

int main() {
    std::atomic<int> i(0);
    return ++i;
}

With:

$ g++ /tmp/test.cpp -o /tmp/test.o -g -march=armv8-a -O3

Produces an atomic sequence to do the ++i. This sequence looks like:

 704:   885ffc20        ldaxr   w0, [x1]
 708:   11000400        add     w0, w0, #0x1
 70c:   8803fc20        stlxr   w3, w0, [x1]
 710:   35ffffa3        cbnz    w3, 704 <main+0x24>

As it happens, if you set a software breakpoint on main here, it lands on the first ldaxr. What's supposed to happen is that you load some value, do the add, then attempt to store the result back to memory. If anything has changed that memory in the meantime (in general, I think, not sure how specific it gets), the store returns a failed flag.

That failure is caught by the cbnz and you loop again to make another attempt.

What actually happens when you've put a breakpoint on one of the atomic instructions is that the state tracking memory modification (whatever that flag is) gets reset every time you stop, so the store always fails and you never leave the sequence.
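For readers who think in source terms, the retry structure of that loop can be sketched in C++ as a compare-exchange loop. This is only an analogy and not what the compiler emits for `++i`: `incrementWithRetry` is a hypothetical name, and `compare_exchange_weak` stands in for the stlxr status check because it is also allowed to fail and ask the caller to try again.

```cpp
#include <atomic>

// Hypothetical helper (not from the issue) mirroring the shape of the
// ldaxr / add / stlxr / cbnz loop: load, compute, try to store, retry.
int incrementWithRetry(std::atomic<int> &value) {
  int observed = value.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak writes the current value back into
  // `observed`, so each retry recomputes observed + 1 from fresh data.
  while (!value.compare_exchange_weak(observed, observed + 1,
                                      std::memory_order_acq_rel,
                                      std::memory_order_relaxed)) {
  }
  // On success `observed` still holds the old value.
  return observed + 1;
}
```

If every attempt were made to fail, as happens when the debugger interrupts the hardware sequence each time, this loop would spin forever in the same way.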

Output from lldb:

$ ./bin/lldb /tmp/test.o
(lldb) target create "/tmp/test.o"
Current executable set to '/tmp/test.o' (aarch64).
(lldb) b main
Breakpoint 1: where = test.o`main + 36 [inlined] std::__atomic_base<int>::operator++() at atomic_base.h:319:34, address = 0x0000000000000704
(lldb) c
error: Command requires a current process.
(lldb) run
Process 3673703 launched: '/tmp/test.o' (aarch64)
Process 3673703 stopped
* thread #1, name = 'test.o', stop reason = breakpoint 1.1
    frame #0: 0x0000aaaaaaaaa704 test.o`main [inlined] std::__atomic_base<int>::operator++(this=0x0000fffffffff2a0) at 
<...>
(lldb) c
Process 3673703 resuming
Process 3673703 stopped
* thread #1, name = 'test.o', stop reason = breakpoint 1.1
    frame #0: 0x0000aaaaaaaaa704 test.o`main [inlined] std::__atomic_base<int>::operator++(this=0x0000fffffffff2a0) at
<...>
(lldb) c
Process 3673703 resuming
Process 3673703 stopped
* thread #1, name = 'test.o', stop reason = breakpoint 1.1
    frame #0: 0x0000aaaaaaaaa704 test.o`main [inlined] std::__atomic_base<int>::operator++(this=0x0000fffffffff2a0) at
<...>
(lldb) dis
test.o`main:
<...>
->  0xaaaaaaaaa704 <+36>: ldaxr  w0, [x1]

This will go on forever as the atomic store never succeeds.

GDB is able to stop here and continue normally; my understanding is that it detects these sequences.

llvmbot commented 1 year ago

@llvm/issue-subscribers-lldb

DavidSpickett commented 1 year ago

DynamoRIO's docs also have a good breakdown of this: https://dynamorio.org/page_ldstex.html#autotoc_md192

If you compile for armv8.1-a you instead get a single instruction atomic which lldb can step over normally.

    0xaaaaaaaaa704 <+36>: mov    x3, #0x0
->  0xaaaaaaaaa708 <+40>: ldaddal w0, w0, [x1]
    0xaaaaaaaaa70c <+44>: ldr    x1, [sp, #0x18]

ldaddal = atomic add on word or double word in memory: https://developer.arm.com/documentation/dui0801/g/A64-Data-Transfer-Instructions/LDADDA--LDADDAL--LDADD--LDADDL--LDADDAL--LDADD--LDADDL

I don't have plans to fix this, filing this mostly to document the issue.

Your workarounds are:

Use v8.1 atomics, if you have capable hardware.
Avoid breaking on or single stepping through atomics (a source level step will be OK a lot of the time, I think).

For future reference, this was tested on lldb compiled from 728d817becaa53cc9db1a440becc5c269a55ace2.

llvmbot commented 1 year ago

@llvm/issue-subscribers-backend-aarch64

jimingham commented 1 year ago

stepi has always been a bit of a problem. PowerPC used to have atomic instruction pairs that would refuse to execute if the processor is in single step mode... That ends up being less of a problem IRL because these instructions aren't branches, so lldb's stepping will generally pass over them by setting a breakpoint on the next branch and continuing over them. In general, it's a good idea to minimize the use of hardware single stepping, since that's always a funny mode in the processor.

In your case, though, the "continue" is what is causing the problem. The only things we do on continue are replace the trap with the instruction, single step, then continue. If single stepping was the problem, then we could just detect these instructions, and since we know they aren't branches, we can just break on the next instruction/continue rather than single step. If replacing the trap were the problem, we would have to implement out of place execution of instructions, since we don't have another way to stop the process (short of hardware breakpoints, but we try not to rely on them).

Jim


DavidSpickett commented 1 year ago

> If single stepping was the problem, then we could just detect these instructions, and since we know they aren't branches, we can just break on the next instruction/continue rather than single step. If replacing the trap were the problem, we would have to implement out of place execution of instructions, since we don't have another way to stop the process (short of hardware breakpoints, but we try not to rely on them).

Yes, single stepping is the problem, since every single step fails the sequence. We could treat the sequence as a block (which is basically what the later single instruction atomics do).

It also occurred to me that watchpoints are an issue too, but maybe it's just one extra step to handle them. If you can go from the watchpoint to the instruction that caused the hit, you can handle it from there the same as a code break at the same location.

Handling roughly being:

Check if in a sequence (by code reading?)
If you are, disable the breakpoint that was hit.
Set PC back to the start and break at the end of the sequence.
Run until it finishes, hope that you got all the exit points of the sequence (yes, you can have many).
Enable the break again.
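
The first step above ("check if in a sequence by code reading") could look roughly like the following. This is a minimal sketch and not lldb code: `isLoadExclusive`, `isStoreExclusive`, `SequenceRange` and `findExclusiveSequence` are hypothetical names, the bit masks are my reading of the A64 load/store exclusive encoding class, and it assumes the sequence is laid out linearly between one load exclusive and one store exclusive, which real code does not have to be.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: A64 load/store exclusive words have bits [29:24] == 0b001000
// and bit 23 (o2) == 0. The second test excludes CASP (bit 31 clear, bit 21 set).
constexpr bool isExclusiveClass(std::uint32_t insn) {
  return (insn & 0x3F800000u) == 0x08000000u &&
         (insn & 0x80200000u) != 0x00200000u;
}

// Bit 22 (L) separates loads (ldxr, ldaxr, ...) from stores (stxr, stlxr, ...).
constexpr bool isLoadExclusive(std::uint32_t insn) {
  return isExclusiveClass(insn) && (insn & (1u << 22)) != 0;
}
constexpr bool isStoreExclusive(std::uint32_t insn) {
  return isExclusiveClass(insn) && (insn & (1u << 22)) == 0;
}

// Self-checks against the instruction words shown in the issue.
static_assert(isLoadExclusive(0x885ffc20u), "ldaxr w0, [x1]");
static_assert(isStoreExclusive(0x8803fc20u), "stlxr w3, w0, [x1]");
static_assert(!isExclusiveClass(0x11000400u), "add w0, w0, #0x1");

// Hypothetical result type: indices of the first and last instruction.
struct SequenceRange {
  bool found;
  std::size_t first;
  std::size_t last;
};

// Returns the load-exclusive..store-exclusive range containing code[idx], if any.
SequenceRange findExclusiveSequence(const std::vector<std::uint32_t> &code,
                                    std::size_t idx) {
  assert(idx < code.size());
  const SequenceRange none{false, 0, 0};
  // Walk backwards to the nearest load exclusive.
  std::size_t first = idx;
  while (!isLoadExclusive(code[first])) {
    // Crossing an earlier store exclusive means that sequence already ended.
    if (first != idx && isStoreExclusive(code[first])) return none;
    if (first == 0) return none;
    --first;
  }
  // Walk forwards to the matching store exclusive.
  for (std::size_t last = idx; last < code.size(); ++last) {
    if (isStoreExclusive(code[last])) return SequenceRange{true, first, last};
    // Another load exclusive before any store starts a different sequence.
    if (last != idx && isLoadExclusive(code[last])) return none;
  }
  return none;
}
```

For the four instruction words in the original report, an index on the ldaxr, add or stlxr maps to the range 0..2, while the trailing cbnz is reported as outside the sequence. A linear scan like this does not find the extra exit points mentioned above.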

Though reporting to the user what the state was during a failed sequence probably doesn't have much value. Maybe we refuse to place the break, or warn that it will be skipped automatically.

Watchpoints you'd just have to handle as they come. Maybe warn that this was hit during a sequence and as such, the values you see now might not be what is used when we re-run it.

DavidSpickett commented 8 months ago

jimingham commented 8 months ago

If we can detect sequences, we can simplify matters by always shifting the breakpoint to the beginning of the sequence. That way we wouldn't have to deal with somehow restarting the sequence. Presumably you can't jump into the middle of one of these atomic sequences? We move breakpoints around for other reasons (skipping prologues, dealing with source lines with no code, etc) so there's precedent for doing that.

The branch prediction seems tricky. It's not terribly hard to do "I am at a branch instruction, I wonder where it will go?" but it's harder to do "I wonder where an instruction 5 instructions downstream will go". But if the exit instruction isn't the first one of a sequence that's what we'd have to do. I don't think we do that anywhere else in lldb.
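
To make the easy case concrete ("I am at a branch instruction, where can it go?"), here is a minimal sketch that decodes the two possible successors of a cbz/cbnz like the one ending the sequence in the report. The names `isCompareAndBranch`, `branchByteOffset`, `BranchSuccessors` and `compareAndBranchSuccessors` are hypothetical and the masks are my reading of the A64 compare-and-branch encoding. It covers only that one instruction class, not the other conditional or indirect branches a real implementation would need.

```cpp
#include <cstdint>

// Assumption: cbz/cbnz have bits [30:25] == 0b011010 (bit 24 picks cbz or cbnz).
constexpr bool isCompareAndBranch(std::uint32_t insn) {
  return (insn & 0x7E000000u) == 0x34000000u;
}

// imm19 lives in bits [23:5]; sign-extend it and scale by the 4-byte
// instruction size to get a byte offset relative to the branch itself.
constexpr std::int64_t branchByteOffset(std::uint32_t insn) {
  return (static_cast<std::int64_t>((insn >> 5) & 0x7FFFFu) -
          (((insn >> 23) & 1u) ? 0x80000 : 0)) * 4;
}

// Hypothetical result type: both places execution can go next.
struct BranchSuccessors {
  std::uint64_t fallThrough;
  std::uint64_t taken;
};

// Caller is expected to have checked isCompareAndBranch(insn) first.
BranchSuccessors compareAndBranchSuccessors(std::uint64_t address,
                                            std::uint32_t insn) {
  return {address + 4,
          address + static_cast<std::uint64_t>(branchByteOffset(insn))};
}
```

For the cbnz at 0x710 (word 0x35ffffa3) this yields 0x704 as the taken target and 0x714 as the fall through, matching the disassembly in the report. The harder case described above is running this kind of analysis for an instruction several steps downstream of the current PC.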

llvm / llvm-project

lldb cannot single step over an Armv8.0-a atomic sequence #60259