knurling-rs / probe-run

Run embedded programs just like native ones
Apache License 2.0
646 stars 75 forks source link

augment backtraces with "stack depth" #361

Open japaric opened 1 year ago

japaric commented 1 year ago

the --measure-stack feature reports the maximum stack usage of an entire program. it would be useful to have a more fine-grained report of stack usage per function. there are static analysis tools like cargo-call-stack that provide that kind of information but they rely on a nightly Rust toolchain and break often. as a complement to that tool, probe-run could report the stack usage of each function -- or more precisely the "stack depth" -- when it computes / prints a backtrace with relatively low implementation effort.

more details on the idea from a conversation:


when probe-run does (host-side) unwinding of the target's call stack it's also indirectly reading the value the SP register had at the start of each call frame. this is because it had to read the value of the Link Register (LR) that was pushed onto the stack in the prologue of each called function (+) to figure out the call site (PC value) of the function is currently unwinding.

in theory, it would be possible to include this information in the backtrace so each backtrace frame includes the "stack depth".

you can glimpse at the SP information right now by looking at probe-run logs but it's going to non-obvious to figure out which frame the SP value refers to.

for example, given this program

#[cortex_m_rt::entry]
fn main() -> ! {
    a();
    loop {}
}

#[inline(never)]
fn a() {
    let mut bytes = [MaybeUninit::<u8>::uninit(); 4];
    // "use" the stack
    black_box(&mut bytes.as_mut_ptr());
    b();
}

#[inline(never)]
fn b() {
    let mut bytes = [MaybeUninit::<u8>::uninit(); 12];
    black_box(&mut bytes.as_mut_ptr());
    c();
}

#[inline(never)]
fn c() {
    let mut bytes = [MaybeUninit::<u8>::uninit(); 20];
    black_box(&mut bytes.as_mut_ptr());
    unsafe { asm!("BKPT") } // exit
}

fn black_box<T>(x: &mut T) {
    unsafe { asm!("// {0}", in(reg) x as *mut T as usize) }
}

the probe-run logs and backtrace look like this:

$ probe-run $other_flags --backtrace=always -v my-program
(HOST) TRACE update reg=Register(14), rule=Offset(-4), abs=0x2003fb80 -> value=0x00000485
(HOST) TRACE update reg=Register(14), rule=Offset(-4), abs=0x2003fb98 -> value=0x00000471
(HOST) TRACE update reg=Register(14), rule=Offset(-4), abs=0x2003fba8 -> value=0x000004ad
(HOST) TRACE update reg=Register(14), rule=Offset(-4), abs=0x2003fbb0 -> value=0x000004a3
(HOST) TRACE update reg=Register(14), rule=Offset(-4), abs=0x2003fbb8 -> value=0x00000445
(HOST) TRACE update reg=Register(14), rule=Offset(0), abs=0x2003fbbc -> value=0xffffffff
stack backtrace:
   0: 0x00000494 @ cg::c
        at src/bin/cg.rs:33:14
   1: 0x00000484 @ cg::b
        at src/bin/cg.rs:27:2
   2: 0x00000470 @ cg::a
        at src/bin/cg.rs:20:2
   3: 0x000004ac @ cg::__cortex_m_rt_main
        at src/bin/cg.rs:11:5
   4: 0x000004a2 @ main
        at src/bin/cg.rs:8:1
   5: 0x00000444 @ Reset
(HOST) INFO  device halted without error
└─ probe_run::backtrace @ src/backtrace/mod.rs:108

register # 14 is the Linker Register (LR) I mentioned. abs= in the logs is the stack address from which LR was read; that's actually the SP value of each frame in the backtrace (++)

so roughly:

this (the per-frame "changes") matches the stack usage values reported by cargo-call-stack:

cg

not sure how much this SP in backtrace would help you but it doesn't require instrumenting the program so it's both runtime cost free and requires minimal changes to the program (introduce an artificial panic to print the backtrace at the desired spot)

(+) most functions start like this:

00000460 <cg::a::h1368587aa2735475>:
     460: 80 b5         push    {r7, lr}

which in pseudo-rust is roughly *sp = lr where sp: *mut u32 and lr: u32

(++) well, not always; exceptions (interrupts) need special handling but it's still possible to determine some SP value in that case, I think. also, the backtrace sometime includes inlined functions; there's no actual function call there so the SP stays the same in those frames.

(+++) this missing value can be obtained by reading the value of the SP when unwinding starts. I think the unwinding algorithm already does this but I don't think the value is printed in the logs.


TBD: the format in which to print the stack depth. one option is to include the SP value in each backtrace frame, like this:

stack backtrace:
   0: 0x00000494 @ cg::c (SP=0x2003fb60)
        at src/bin/cg.rs:33:14
   1: 0x00000484 @ cg::b (SP=0x2003fb80)
        at src/bin/cg.rs:27:2
   2: 0x00000470 @ cg::a (SP=0x2003fb98)
        at src/bin/cg.rs:20:2
   3: 0x000004ac @ cg::__cortex_m_rt_main (SP=0x2003fba8)
        at src/bin/cg.rs:11:5
   4: 0x000004a2 @ main (SP=0x2003fbb0)
        at src/bin/cg.rs:8:1
   5: 0x00000444 @ Reset (SP=0x2003fbb8)