rust-lang / libs-team


ACP: `core::arch::breakpoint` #491

Open joshtriplett opened 2 days ago

joshtriplett commented 2 days ago

Proposal

Problem statement

Sometimes, for debugging, users want to have a software breakpoint instruction to use with their debugger, or to generate a core dump for subsequent analysis.

core::intrinsics::breakpoint() exists, but intrinsics are perma-unstable.

Users can manually emit a breakpoint instruction using inline assembly, such as core::arch::asm!("int3") on x86, or core::arch::asm!("brk #0xf000") on AArch64. However, this isn't portable.
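To make the portability problem concrete, here is a minimal sketch of what users have to write today on stable Rust. The function name manual_breakpoint and the abort fallback for other architectures are assumptions for illustration only; they are not part of this proposal. Only the x86 and AArch64 instructions quoted above are covered.

```rust
// Hypothetical helper, not part of the proposal: one cfg-gated definition
// per architecture, because each target needs its own instruction.

#[cfg(any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64"))]
use std::arch::asm;

/// x86 / x86_64: emit `int3`.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[inline(always)]
pub fn manual_breakpoint() {
    unsafe { asm!("int3") };
}

/// AArch64: emit `brk #0xf000`.
#[cfg(target_arch = "aarch64")]
#[inline(always)]
pub fn manual_breakpoint() {
    unsafe { asm!("brk #0xf000") };
}

/// Any other architecture: this sketch has no instruction to offer, so it
/// falls back to aborting the process (an assumption made for illustration).
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64")))]
#[inline(always)]
pub fn manual_breakpoint() {
    std::process::abort();
}
```

Every additional target needs another cfg arm, and the caller has to know the right instruction for each one.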

Solution sketch

In core::arch:

/// Compiles to a target-specific software breakpoint instruction.
///
/// This will typically abort the program. It may result in a core dump. Additional
/// target-specific capabilities may be possible depending on debuggers or other tooling.
#[inline(always)]
pub fn breakpoint() {
    core::intrinsics::breakpoint();
}

Note that this should not be noreturn (-> !), because on some targets and environments, the user may be able to continue execution from the breakpoint in a debugger.

Links and related work

The unbug crate provides macros that emit breakpoints (e.g. for assertions), but it depends on nightly Rust.

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

Second, if there's a concrete solution:

programmerjake commented 1 day ago

what happens if an ISA doesn't have a breakpoint instruction? (e.g. wasm iirc)

Amanieu commented 1 day ago

I'm concerned about portability: this works on x86 because, after handling an int3, the program counter will end up pointing to the instruction after the breakpoint. However this is not the case for other architectures, for example an AArch64 BRK will keep the program counter pointing at the BRK after it is handled, and you have to manually skip the instruction.

joshtriplett commented 1 day ago

@Amanieu wrote:

> I'm concerned about portability: this works on x86 because, after handling an int3, the program counter will end up pointing to the instruction after the breakpoint. However this is not the case for other architectures, for example an AArch64 BRK will keep the program counter pointing at the BRK after it is handled, and you have to manually skip the instruction.

This is entirely the problem of the debugger to deal with; it doesn't make the Rust program non-portable. (Also, I've recently learned that one standard workaround for that in x86 is to use int3; nop.)

joshtriplett commented 1 day ago

@programmerjake wrote:

> what happens if an ISA doesn't have a breakpoint instruction? (e.g. wasm iirc)

WebAssembly appears to implement the LLVM llvm.debugtrap intrinsic, and maps it to the instruction unreachable.

More generally: I would expect that there's always some way to trap, or failing that to abort, and that worst case it'll map to whatever unreachable! or assert! uses to bail out. If an architecture truly didn't have anything for that, the Rust target for it would have bigger problems.

Amanieu commented 1 day ago

@joshtriplett I think you misunderstand, if you try to continue after a breakpoint instruction:

Given that this only produces the expected behavior on x86, I don't think we can reasonably expose this as a platform-independent intrinsic.

joshtriplett commented 1 day ago

> You have to manually modify the PC in the debugger to skip past the instruction.

I understood that; my point is, dealing with that kind of variation is the job of a debugger. Some debuggers do recognize, for instance, the specific aarch64 brk produced by LLVM's __builtin_debugtrap() and automatically skip over it when hit.

> Given that this only produces the expected behavior on x86, I don't think we can reasonably expose this as a platform-independent intrinsic.

LLVM and C++ both have a platform-independent intrinsic for this.

The platform-independent behavior is "this will trap, stopping execution; it may result in a core dump; a debugger may treat this as a breakpoint".

BrainBacon commented 1 day ago

From my experiments in the Unbug crate running in VSCode I've noticed that brk #1 is insufficient on Apple silicon. That resulted in getting stuck on the breakpoint. However, it looks like __builtin_debugtrap() in LLVM uses brk #0xF000 which will allow the debugger to continue. I've also noticed that a similar nop trick was necessary to get the debugger to land on the correct statement, in my case brk #0xF000 \n nop. The newline (not a semicolon) was necessary using core::arch::asm!.
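As a rough sketch of the trick described above: asm! accepts several template strings and joins them with newlines, so the breakpoint and the trailing nop can be passed as separate arguments instead of embedding \n by hand. The function name breakpoint_then_nop and the abort fallback are hypothetical, and the x86 variant follows the int3; nop workaround mentioned earlier in the thread. Whether a given debugger resumes cleanly is something to verify on your own setup.

```rust
#[cfg(any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64"))]
use std::arch::asm;

/// AArch64: the immediate 0xF000 is the one reported in this thread as
/// being used by LLVM's __builtin_debugtrap(); the nop gives the debugger
/// an instruction to land on after the breakpoint.
#[cfg(target_arch = "aarch64")]
#[inline(always)]
pub fn breakpoint_then_nop() {
    // Two template strings are joined with a newline by asm!.
    unsafe { asm!("brk #0xF000", "nop") };
}

/// x86 / x86_64: the analogous `int3` followed by `nop`.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[inline(always)]
pub fn breakpoint_then_nop() {
    unsafe { asm!("int3", "nop") };
}

/// Other architectures: no instruction known to this sketch, so abort
/// (an assumption made for illustration).
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64")))]
#[inline(always)]
pub fn breakpoint_then_nop() {
    std::process::abort();
}
```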

programmerjake commented 1 day ago

> @programmerjake wrote:
>
> > what happens if an ISA doesn't have a breakpoint instruction? (e.g. wasm iirc)
>
> WebAssembly appears to implement the LLVM llvm.debugtrap intrinsic, and maps it to the instruction unreachable.

ok, I had assumed unreachable wasn't usable since I don't expect a debugger to be able to continue after hitting it...

> More generally: I would expect that there's always some way to trap, or failing that to abort,

Yeah I assumed the desired semantics were that a debugger would always be able to continue after hitting the breakpoint, however lowering it to an abort-like thing means that isn't really possible.

joshtriplett commented 1 day ago

> Yeah I assumed the desired semantics were that a debugger would always be able to continue after hitting the breakpoint, however lowering it to an abort-like thing means that isn't really possible.

The desired semantics are that it traps, in a target-specific way, which might dump core and might be possible to continue if you have a debugger attached, but the details will be target-specific and debugger-specific.

Amanieu commented 8 hours ago

> The desired semantics are that it traps, in a target-specific way, which might dump core and might be possible to continue if you have a debugger attached, but the details will be target-specific and debugger-specific.

You can achieve this portably (at least on UNIX), in a way that also works with resuming execution in the debugger, by calling raise(SIGTRAP).
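For reference, a minimal sketch of that approach from Rust without any external crate, assuming a UNIX target where the C library is linked (as std does by default). The helper names raise_sigtrap and ignore_sigtrap are hypothetical, and the numeric values of SIGTRAP and SIG_IGN are assumptions that hold on Linux, macOS, and the BSDs; in real code the libc crate's constants would be the safer source.

```rust
use std::ffi::c_int;

/// Assumed value: SIGTRAP is 5 on Linux, macOS, and the BSDs.
const SIGTRAP: c_int = 5;
/// Assumed value of C's SIG_IGN on those platforms, passed as an integer
/// the same way the libc crate models sighandler_t.
const SIG_IGN: usize = 1;

unsafe extern "C" {
    // Standard C library functions.
    fn raise(sig: c_int) -> c_int;
    fn signal(signum: c_int, handler: usize) -> usize;
}

/// Hypothetical helper: sends SIGTRAP to the calling thread.
/// Returns true if raise() reported success. With the default signal
/// disposition and no debugger attached, the process is terminated
/// instead (possibly with a core dump) and this never returns.
pub fn raise_sigtrap() -> bool {
    unsafe { raise(SIGTRAP) == 0 }
}

/// Hypothetical helper: ignore SIGTRAP so that raise_sigtrap() becomes a
/// no-op when nothing is tracing the process.
pub fn ignore_sigtrap() {
    unsafe {
        signal(SIGTRAP, SIG_IGN);
    }
}
```

Calling ignore_sigtrap() once at startup lets the program run to completion outside a debugger. On Linux, a ptrace-based debugger is expected to still stop on the signal even when it is ignored, but that is worth confirming for your debugger and platform.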