knurling-rs / probe-run

Run embedded programs just like native ones
Apache License 2.0
646 stars 75 forks source link

backtrace can infinte-loop. #127

Closed Dirbaio closed 3 years ago

Dirbaio commented 3 years ago

I'm seeing this behavior with -C force-frame-pointers=no.

I think it's to be expected that backtracing doesn't work correctly with it, but I think at least this should be detected and fail with error: the stack appears to be corrupted beyond this point instead of looping forever.

If there's interest I can try cooking a binary that reproduces this.

stack backtrace:
   0: HardFaultTrampoline
      <exception entry>
   1: tester_gwc::sys::__cortex_m_rt_WDT
        at ak/src/bin/../sys.rs:556
   2: WDT
        at ak/src/bin/../sys.rs:553
      <exception entry>
   3: <futures_util::future::select::Select<A,B> as core::future::future::Future>::poll
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.8/src/future/select.rs:95
   4: tester_gwc::common::abort_on_keypress::{{closure}}
        at ak/src/bin/../tester_common.rs:26
   5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
        at /home/dirbaio/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80
   6: tester_gwc::test_network::{{closure}}
        at ak/src/bin/tester_gwc.rs:61
   7: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
        at /home/dirbaio/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80
   8: tester_gwc::main::{{closure}}
        at ak/src/bin/tester_gwc.rs:43
   9: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
  10: tester_gwc::sys::main_task::task::{{closure}}
        at ak/src/bin/../sys.rs:196
  11: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
        at /home/dirbaio/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80
  12: embassy::executor::Task<F>::poll
        at /home/dirbaio/akiles/embassy/embassy/src/executor/mod.rs:132
  13: core::cell::Cell<T>::get
        at /home/dirbaio/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/cell.rs:432
  14: embassy::executor::timer_queue::TimerQueue::update
        at /home/dirbaio/akiles/embassy/embassy/src/executor/timer_queue.rs:34
  15: embassy::executor::Executor::run::{{closure}}
        at /home/dirbaio/akiles/embassy/embassy/src/executor/mod.rs:241
  16: embassy::executor::run_queue::RunQueue::dequeue_all
        at /home/dirbaio/akiles/embassy/embassy/src/executor/run_queue.rs:65
  17: embassy::executor::Executor::run
        at /home/dirbaio/akiles/embassy/embassy/src/executor/mod.rs:223
  18: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  19: real_main
        at ak/src/bin/../sys.rs:478
  20: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  21: real_main
        at ak/src/bin/../sys.rs:478
  22: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  23: real_main
        at ak/src/bin/../sys.rs:478
  24: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  25: real_main
        at ak/src/bin/../sys.rs:478
  26: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  27: real_main
        at ak/src/bin/../sys.rs:478
  28: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  29: real_main
        at ak/src/bin/../sys.rs:478
  30: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  31: real_main
        at ak/src/bin/../sys.rs:478
  32: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  33: real_main
        at ak/src/bin/../sys.rs:478
  34: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  35: real_main
        at ak/src/bin/../sys.rs:478
  36: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  37: real_main
        at ak/src/bin/../sys.rs:478
  38: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  39: real_main
        at ak/src/bin/../sys.rs:478
  40: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  41: real_main
        at ak/src/bin/../sys.rs:478
  42: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  43: real_main
        at ak/src/bin/../sys.rs:478
  44: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  45: real_main
        at ak/src/bin/../sys.rs:478
  46: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.6.4/src/asm.rs:117
  47: real_main
        at ak/src/bin/../sys.rs:478
... this goes on forever
japaric commented 3 years ago

I would have expected infinite loops like these to be caught by this check. Would be interesting to log LR, PC and other registers as probe-run unwinds the stack.

I'm seeing this behavior with -C force-frame-pointers=no.

did you compile the whole Rust code with -C force-frame-pointers=no or is it just the assembly (asm.rs?) what was compiled w/o frame pointers? The backtrace looks fine for the first 17 frames or so.

Urhengulas commented 3 years ago

Hi @Dirbaio, Could you please provide us with some code to reproduce this error? Thanks in advance! πŸ˜„

Dirbaio commented 3 years ago

Attached ELF: repro.tar.gz Built from this commit https://github.com/akiles/embassy/commit/f8c434400393d74fb51acb130bbaaf835e7edf38

[dirbaio@mars embassy-nrf-examples]$ probe-run --version
0.2.1
supported defmt version: 0.2
[dirbaio@mars embassy-nrf-examples]$ rustc --version
rustc 1.52.0-nightly (a15f484b9 2021-02-22)
[dirbaio@mars embassy-nrf-examples]$ cargo build --bin rtc_async
    Finished dev [optimized + debuginfo] target(s) in 0.05s
rustc 1.52.0-nightly (a15f484b9 2021-02-22)
[dirbaio@mars embassy-nrf-examples]$ probe-run --chip nRF52840_xxAA ../target/thumbv7em-none-eabi/debug/rtc_async
  (HOST) INFO  flashing program (11.52 KiB)
  (HOST) INFO  success!
────────────────────────────────────────────────────────────────────────────────
       0 INFO  Hello World!
└─ rtc_async::__cortex_m_rt_main @ src/bin/rtc_async.rs:42
       1 INFO  tick
└─ rtc_async::run2::task::{{closure}} @ src/bin/rtc_async.rs:31
       2 INFO  BIG INFREQUENT TICK
└─ rtc_async::run1::task::{{closure}} @ src/bin/rtc_async.rs:22
       3 INFO  tick
└─ rtc_async::run2::task::{{closure}} @ src/bin/rtc_async.rs:31
       4 INFO  tick
└─ rtc_async::run2::task::{{closure}} @ src/bin/rtc_async.rs:31
       5 INFO  tick
└─ rtc_async::run2::task::{{closure}} @ src/bin/rtc_async.rs:31
       6 INFO  tick
└─ rtc_async::run2::task::{{closure}} @ src/bin/rtc_async.rs:31
stack backtrace:
   0: HardFaultTrampoline
      <exception entry>
   1: rtc_async::run1::task::{{closure}}
        at src/bin/rtc_async.rs:24
   2: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
        at /home/dirbaio/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80
   3: embassy::executor::Task<F>::poll
        at /home/dirbaio/akiles/embassy/embassy/src/executor/mod.rs:71
   4: embassy::executor::timer_queue::TimerQueue::update
        at /home/dirbaio/akiles/embassy/embassy/src/executor/timer_queue.rs:35
   5: embassy::executor::raw::Executor::run_queued::{{closure}}
        at /home/dirbaio/akiles/embassy/embassy/src/executor/raw.rs:135
   6: embassy::executor::run_queue::RunQueue::dequeue_all
        at /home/dirbaio/akiles/embassy/embassy/src/executor/run_queue.rs:66
   7: embassy::executor::raw::Executor::run_queued
        at /home/dirbaio/akiles/embassy/embassy/src/executor/raw.rs:117
   8: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.7.1/src/asm.rs:49
   9: embassy::executor::Executor::run
        at /home/dirbaio/akiles/embassy/embassy/src/executor/mod.rs:202
  10: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.7.1/src/asm.rs:49
  11: embassy::executor::Executor::run
        at /home/dirbaio/akiles/embassy/embassy/src/executor/mod.rs:202
  12: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.7.1/src/asm.rs:49
  13: embassy::executor::Executor::run
        at /home/dirbaio/akiles/embassy/embassy/src/executor/mod.rs:202
  14: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.7.1/src/asm.rs:49
  15: embassy::executor::Executor::run
        at /home/dirbaio/akiles/embassy/embassy/src/executor/mod.rs:202
  16: cortex_m::asm::wfe
        at /home/dirbaio/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-0.7.1/src/asm.rs:49
.... infinite loop
Urhengulas commented 3 years ago

Thank you @Dirbaio for providing the reproducable!


Jonas mentioned in #163 that following condition is likely the cause for this bug: https://github.com/knurling-rs/probe-run/blob/d7029443bc91860d316efc39d547a5313264cbdc/src/main.rs#L801

Lotterleben commented 3 years ago

reporting back for the record and also as a note for myself for when I pick this back up– the problem is indeed that stack_corrupted doesan't paint the full picture, but this one is tricky to detect – after 9:, the program counter doesn't change (checked by lr & !THUMB_BIT == pc & !THUMB_BIT), but the CFA (Canonical Frame Address) keep shifting, as the HardFault occurs in a loop https://github.com/embassy-rs/embassy/blob/f8c434400393d74fb51acb130bbaaf835e7edf38/embassy/src/executor/mod.rs#L202 . We can't just make the && a || because in other cases like for example.... several recursive function calls, this is a perfectly legal state that should not interrupt the backtrace printing. We've checked if gdb does some detection magic, but it seems to run into the same problem:

Screenshot 2021-03-30 at 16 27 33

(paging prevents the endless scrolling here though, which is nice)

As a band-aid, I'll implement a backtrace line limit after which the backtrace is cut off (length reconfigurable by flag) to prevent it from scrolling forever. If we're feeling fancy we could additionally cut out the middle and only print, say, the top 10 and bottom 10 lines separated by a [...] by default for better glance-ability.

Lotterleben commented 3 years ago

Closing this now due to #179 being merged– @Dirbaio , feel free to re-open if you disagree or have new input