knurling-rs / probe-run

Run embedded programs just like native ones
Apache License 2.0

backtrace can infinite-loop. #127

Closed Dirbaio closed 3 years ago

Dirbaio commented 3 years ago

I'm seeing this behavior with -C force-frame-pointers=no.

I'd expect backtracing not to work correctly with that flag, but this should at least be detected and fail with "error: the stack appears to be corrupted beyond this point" instead of looping forever.

If there's interest I can try cooking a binary that reproduces this.

stack backtrace:
   0: HardFaultTrampoline
      <exception entry>
   1: tester_gwc::sys::__cortex_m_rt_WDT
        at ak/src/bin/../
   2: WDT
        at ak/src/bin/../
      <exception entry>
   3: <futures_util::future::select::Select<A,B> as core::future::future::Future>::poll
        at /home/dirbaio/.cargo/registry/src/
   4: tester_gwc::common::abort_on_keypress::{{closure}}
        at ak/src/bin/../
   5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
        at /home/dirbaio/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/
   6: tester_gwc::test_network::{{closure}}
        at ak/src/bin/
   7: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
        at /home/dirbaio/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/
   8: tester_gwc::main::{{closure}}
        at ak/src/bin/
   9: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
  10: tester_gwc::sys::main_task::task::{{closure}}
        at ak/src/bin/../
  11: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
        at /home/dirbaio/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/
  12: embassy::executor::Task<F>::poll
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
  13: core::cell::Cell<T>::get
        at /home/dirbaio/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/
  14: embassy::executor::timer_queue::TimerQueue::update
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
  15: embassy::executor::Executor::run::{{closure}}
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
  16: embassy::executor::run_queue::RunQueue::dequeue_all
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
  17: embassy::executor::Executor::run
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
  18: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  19: real_main
        at ak/src/bin/../
  20: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  21: real_main
        at ak/src/bin/../
  22: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  23: real_main
        at ak/src/bin/../
  24: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  25: real_main
        at ak/src/bin/../
  26: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  27: real_main
        at ak/src/bin/../
  28: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  29: real_main
        at ak/src/bin/../
  30: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  31: real_main
        at ak/src/bin/../
  32: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  33: real_main
        at ak/src/bin/../
  34: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  35: real_main
        at ak/src/bin/../
  36: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  37: real_main
        at ak/src/bin/../
  38: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  39: real_main
        at ak/src/bin/../
  40: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  41: real_main
        at ak/src/bin/../
  42: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  43: real_main
        at ak/src/bin/../
  44: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  45: real_main
        at ak/src/bin/../
  46: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  47: real_main
        at ak/src/bin/../
... this goes on forever
japaric commented 3 years ago

I would have expected infinite loops like these to be caught by this check. Would be interesting to log LR, PC and other registers as probe-run unwinds the stack.

I'm seeing this behavior with -C force-frame-pointers=no.

Did you compile all of the Rust code with -C force-frame-pointers=no, or was only the assembly compiled without frame pointers? The backtrace looks fine for the first 17 frames or so.

Urhengulas commented 3 years ago

Hi @Dirbaio, could you please provide us with some code to reproduce this error? Thanks in advance! 😄

Dirbaio commented 3 years ago

Attached ELF: repro.tar.gz Built from this commit

[dirbaio@mars embassy-nrf-examples]$ probe-run --version
supported defmt version: 0.2
[dirbaio@mars embassy-nrf-examples]$ rustc --version
rustc 1.52.0-nightly (a15f484b9 2021-02-22)
[dirbaio@mars embassy-nrf-examples]$ cargo build --bin rtc_async
    Finished dev [optimized + debuginfo] target(s) in 0.05s
rustc 1.52.0-nightly (a15f484b9 2021-02-22)
[dirbaio@mars embassy-nrf-examples]$ probe-run --chip nRF52840_xxAA ../target/thumbv7em-none-eabi/debug/rtc_async
  (HOST) INFO  flashing program (11.52 KiB)
  (HOST) INFO  success!
       0 INFO  Hello World!
└─ rtc_async::__cortex_m_rt_main @ src/bin/
       1 INFO  tick
└─ rtc_async::run2::task::{{closure}} @ src/bin/
└─ rtc_async::run1::task::{{closure}} @ src/bin/
       3 INFO  tick
└─ rtc_async::run2::task::{{closure}} @ src/bin/
       4 INFO  tick
└─ rtc_async::run2::task::{{closure}} @ src/bin/
       5 INFO  tick
└─ rtc_async::run2::task::{{closure}} @ src/bin/
       6 INFO  tick
└─ rtc_async::run2::task::{{closure}} @ src/bin/
stack backtrace:
   0: HardFaultTrampoline
      <exception entry>
   1: rtc_async::run1::task::{{closure}}
        at src/bin/
   2: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
        at /home/dirbaio/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/
   3: embassy::executor::Task<F>::poll
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
   4: embassy::executor::timer_queue::TimerQueue::update
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
   5: embassy::executor::raw::Executor::run_queued::{{closure}}
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
   6: embassy::executor::run_queue::RunQueue::dequeue_all
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
   7: embassy::executor::raw::Executor::run_queued
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
   8: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
   9: embassy::executor::Executor::run
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
  10: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  11: embassy::executor::Executor::run
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
  12: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  13: embassy::executor::Executor::run
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
  14: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
  15: embassy::executor::Executor::run
        at /home/dirbaio/akiles/embassy/embassy/src/executor/
  16: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/
.... infinite loop
Urhengulas commented 3 years ago

Thank you @Dirbaio for providing the reproducer!

Jonas mentioned in #163 that the following condition is likely the cause of this bug:

Lotterleben commented 3 years ago

Reporting back for the record, and as a note to myself for when I pick this back up: the problem is indeed that stack_corrupted doesn't paint the full picture, but this one is tricky to detect. After frame 9, the program counter doesn't change (checked by lr & !THUMB_BIT == pc & !THUMB_BIT), but the CFA (Canonical Frame Address) keeps shifting, as the HardFault occurs in a loop. We can't just turn the && into a ||, because in other cases, for example several recursive function calls, this is a perfectly legal state that should not interrupt the backtrace printing. We've checked whether gdb does some detection magic, but it seems to run into the same problem:

Screenshot 2021-03-30 at 16 27 33

(paging prevents the endless scrolling here though, which is nice)
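
To make the && versus || trade-off concrete, here is a minimal sketch of the two variants of the check. It is not probe-run's actual code: the Frame struct, its fields, and the function names are hypothetical, and the register values are made up for illustration. Only the lr/pc comparison with the Thumb bit masked out is taken from the comment above.

```rust
/// Bit 0 of a Cortex-M code address marks Thumb state and is not part of the address.
const THUMB_BIT: u32 = 1;

/// Hypothetical snapshot of what an unwinder knows after stepping one frame.
#[derive(Clone, Copy, Debug)]
struct Frame {
    pc: u32,
    lr: u32,
    /// Canonical Frame Address computed for this frame.
    cfa: u32,
    /// Canonical Frame Address of the frame unwound just before it.
    prev_cfa: u32,
}

/// True when unwinding would land on the same code address again.
fn pc_unchanged(f: &Frame) -> bool {
    (f.lr & !THUMB_BIT) == (f.pc & !THUMB_BIT)
}

/// True when the frame address did not move either.
fn cfa_unchanged(f: &Frame) -> bool {
    f.cfa == f.prev_cfa
}

/// Strict variant: stop only if neither the code address nor the CFA moves.
fn stuck_strict(f: &Frame) -> bool {
    pc_unchanged(f) && cfa_unchanged(f)
}

/// Loose variant: stop as soon as either one stops moving.
fn stuck_loose(f: &Frame) -> bool {
    pc_unchanged(f) || cfa_unchanged(f)
}

fn main() {
    // The situation described above: same code address, but the CFA keeps shifting.
    let looping = Frame { pc: 0x0000_1234, lr: 0x0000_1235, cfa: 0x2000_0f00, prev_cfa: 0x2000_0ef8 };
    println!("strict check fires: {}", stuck_strict(&looping));
    println!("loose check fires:  {}", stuck_loose(&looping));
}
```

The strict variant misses the looping frame because the CFA still moves. The loose variant catches it, but, as noted above, it would also cut off backtraces in states that are legal.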

As a band-aid, I'll implement a backtrace line limit after which the backtrace is cut off (length configurable by flag) to prevent it from scrolling forever. If we're feeling fancy, we could additionally cut out the middle and, by default, print only, say, the top 10 and bottom 10 lines separated by a [...] for better glanceability.
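
A rough sketch of that band-aid, assuming the unwinder can be driven as an iterator of already formatted frame lines. The function names, the limit of 50, and the placeholder wording are all hypothetical and are not taken from the change that was eventually merged.

```rust
/// Pulls at most `max_frames` lines from a possibly endless backtrace.
/// Returns the kept lines and whether anything was cut off.
fn capped_backtrace<I>(frames: I, max_frames: usize) -> (Vec<String>, bool)
where
    I: IntoIterator<Item = String>,
{
    let mut iter = frames.into_iter();
    let kept: Vec<String> = iter.by_ref().take(max_frames).collect();
    // If one more frame is available, the backtrace was truncated.
    let truncated = iter.next().is_some();
    (kept, truncated)
}

/// Keeps the first `head` and last `tail` lines and replaces the middle with a marker.
fn elide_middle(lines: &[String], head: usize, tail: usize) -> Vec<String> {
    if lines.len() <= head + tail {
        return lines.to_vec();
    }
    let skipped = lines.len() - head - tail;
    let mut out = Vec::with_capacity(head + tail + 1);
    out.extend_from_slice(&lines[..head]);
    out.push(format!("[... {} frames omitted ...]", skipped));
    out.extend_from_slice(&lines[lines.len() - tail..]);
    out
}

fn main() {
    // Stand-in for an unwinder that never terminates.
    let endless = (0..).map(|i| format!("{:>4}: frame", i));
    let (kept, truncated) = capped_backtrace(endless, 50);
    for line in elide_middle(&kept, 10, 10) {
        println!("{}", line);
    }
    if truncated {
        println!("note: backtrace was cut off at {} frames", kept.len());
    }
}
```

The cap has to be applied first: an endless backtrace has no bottom to keep, so the "bottom 10" here are the last 10 frames before the cut-off.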

Lotterleben commented 3 years ago

Closing this now that #179 has been merged. @Dirbaio, feel free to re-open if you disagree or have new input.