knurling-rs / probe-run

Run embedded programs just like native ones
Apache License 2.0

Running app on STM32F4 crashes core with unrecoverable exception #420

Closed ThadHouse closed 11 months ago

ThadHouse commented 11 months ago

Describe the bug: When attempting to use probe-run to deploy an app to an STM32F4, the deploy causes the core to lock up.

To Reproduce: Clone https://github.com/ThadHouse/Stm32F4RustCrash, then cargo run to an STM32F4.

Expected and observed behavior: I expect the process to start and print "Hello, world!" over RTT. Instead, the command eventually finishes with

Error: An ARM specific error occurred.

Caused by:
    Timeout occurred during operation.

Looking further up in the log, when attempting to start the core, the following error is printed repeatedly for 5 seconds, which looks to be the timeout for setting up the core.

(HOST) ERROR The core is in locked up status as a result of an unrecoverable exception
└─ probe_rs::architecture::arm::core::armv7m @ C:\Users\thadh\.cargo\registry\src\index.crates.io-6f17d22bba15001f\probe-rs-0.20.0\src\architecture\arm\core\armv7m.rs:674

The whole log is in the repo at err.txt.
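For readers unfamiliar with the "locked up" message: on ARMv7-M cores, lockup state is reported through the S_LOCKUP bit of the Debug Halting Control and Status Register (DHCSR). The sketch below decodes a raw DHCSR value into its main status flags. It is an illustration only: the struct and function names are hypothetical, and it assumes (without having checked the probe-rs source) that the error above is derived from this bit.

```rust
/// Address of the Debug Halting Control and Status Register on ARMv7-M.
pub const DHCSR_ADDR: u32 = 0xE000_EDF0;

/// Status flags decoded from a raw DHCSR read (hypothetical helper type).
#[derive(Debug, PartialEq, Eq)]
pub struct CoreStatus {
    pub halted: bool,     // S_HALT, bit 17
    pub sleeping: bool,   // S_SLEEP, bit 18
    pub locked_up: bool,  // S_LOCKUP, bit 19
    pub reset_seen: bool, // S_RESET_ST, bit 25
}

/// Decode the status bits of a DHCSR value read over the debug port.
pub fn decode_dhcsr(raw: u32) -> CoreStatus {
    CoreStatus {
        halted: (raw & (1 << 17)) != 0,
        sleeping: (raw & (1 << 18)) != 0,
        locked_up: (raw & (1 << 19)) != 0,
        reset_seen: (raw & (1 << 25)) != 0,
    }
}
```

A core that stays in lockup across repeated reads would show locked_up set every time, which matches the repeated error lines until the timeout.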

Probe details

Operating System: Seen on both Windows and macOS.

ELF file (attachment) stmapp.txt

(Replace .txt with .elf)

ThadHouse commented 11 months ago

Doing some debugging, it seems like something during flashing is causing this.

If I do the run listed above and dump the result of analyze_vector_table, I get Stack: 0xe00abe00 Reset: 0x62d780d, which are completely wrong. If I then run again with --no-flash, I get Stack: 0x2001fbc0 Reset: 0x80001a9, which is correct. Something happening during flash is completely bringing down the debug interface in a way reset_and_halt can't get out of.
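Why the first pair of values is "completely wrong" can be checked mechanically: the initial stack pointer should land in RAM, and the reset vector should be a Thumb address (low bit set) inside flash. The sketch below is a minimal plausibility check, not probe-run's own logic. The memory ranges are assumptions for an STM32F405RG (1 MiB flash at 0x0800_0000, 128 KiB main SRAM at 0x2000_0000), and the function name is hypothetical.

```rust
use std::ops::Range;

// Assumed memory map for an STM32F405RG; adjust for other parts.
const FLASH: Range<u32> = 0x0800_0000..0x0810_0000; // 1 MiB main flash
const SRAM: Range<u32> = 0x2000_0000..0x2002_0000; // 128 KiB main SRAM

/// Check whether an (initial SP, reset vector) pair looks sane.
pub fn check_vector_table(initial_sp: u32, reset: u32) -> Result<(), &'static str> {
    // The stack is full-descending, so the initial SP may equal the end of RAM.
    if initial_sp <= SRAM.start || initial_sp > SRAM.end {
        return Err("initial SP is outside SRAM");
    }
    if initial_sp % 4 != 0 {
        return Err("initial SP is not word-aligned");
    }
    // Cortex-M only executes Thumb code, so bit 0 of the vector must be set.
    if reset & 1 == 0 {
        return Err("reset vector does not have the Thumb bit set");
    }
    if !FLASH.contains(&(reset & !1)) {
        return Err("reset vector is outside flash");
    }
    Ok(())
}
```

With these assumed ranges, the --no-flash values (0x2001fbc0, 0x80001a9) pass, while the values seen after flashing (0xe00abe00, 0x62d780d) fail on the stack pointer check.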

burrbull commented 11 months ago

Could you try flashing with another probe, one that is not a J-Link?

ThadHouse commented 11 months ago

I don't have a non-J-Link probe. But I do have one set to CMSIS-DAP mode, and using that also resulted in the same issue.

ThadHouse commented 11 months ago

I updated the post because it's not actually related to RTT. The program without RTT does the same thing; however, because of how probe-run is written, that never actually results in an error, as it's sitting there waiting forever for a halt.

Urhengulas commented 11 months ago

@ThadHouse Can you please try what happens if you replace probe-run with probe-rs run? The flags are almost the same.

You can install the probe-rs cli with:

$ cargo install probe-rs --features cli

And then change your .cargo/config.toml like this:

[target.'cfg(all(target_arch = "arm", target_os = "none"))']
-runner = "probe-run --chip nRF52840_xxAA"
+runner = "probe-rs run --chip nRF52840_xxAA"

ThadHouse commented 11 months ago

probe-rs run results in the same behavior as running without RTT. The core doesn't actually start, and the logs show the locked-up error message:

    Finished release [optimized] target(s) in 0.04s
     Running `probe-rs run --chip STM32F405RGTx target\thumbv7em-none-eabihf\release\stmapp`
     Erasing sectors ✔ [00:00:00] [########################################] 16.00 KiB/16.00 KiB @ 37.51 KiB/s (eta 0s )
 Programming pages   ✔ [00:00:00] [########################################] 3.00 KiB/3.00 KiB @ 12.95 KiB/s (eta 0s )
    Finished in 0.689s
ERROR probe_rs::architecture::arm::core::armv7m: The core is in locked up status as a result of an unrecoverable exception
ERROR probe_rs::architecture::arm::core::armv7m: The core is in locked up status as a result of an unrecoverable exception
ERROR probe_rs::architecture::arm::core::armv7m: The core is in locked up status as a result of an unrecoverable exception
ERROR probe_rs::architecture::arm::core::armv7m: The core is in locked up status as a result of an unrecoverable exception
ERROR probe_rs::architecture::arm::core::armv7m: The core is in locked up status as a result of an unrecoverable exception
ERROR probe_rs::architecture::arm::core::armv7m: The core is in locked up status as a result of an unrecoverable exception

ThadHouse commented 11 months ago

So some more debugging info, which I got by running a modified probe-run.

Before flash(), I can reset_and_halt the core as often as I'd like. The SP and PC registers get set to the expected values. However, after flash(), reset_and_halt results in the locked-up behavior, with SP and PC being wildly incorrect. What's interesting is that when SP and PC are wrong, they're the same values every time I try to run probe-run. They don't change with each iteration. If I grab SP and PC after flash() but before core.reset_and_halt(), SP and PC point to the start of RAM, which makes sense because the flasher app is still ready to run at that point.

Is there an easier way to find the flasher app that gets deployed, other than manually disassembling the YAML? I'm surprised the raw assembly isn't stored somewhere.

I'm trying to dig through the errata to see if there's something for the F405 that results in a bad reset when running code from RAM. I'd be surprised though.

ThadHouse commented 11 months ago

Turns out this is our fault. Whoever designed the boards left Boot0 floating, and for some reason programming the flash does something that sets Boot0 high, which causes the reset to boot from RAM. The first two words in RAM are 0xe00abe00 and 0x62d780d, which are the SP and PC after boot.
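For context, STM32F4 parts sample the BOOT0 and BOOT1 pins at reset to choose which memory is aliased at address 0, and the core then loads its initial SP and reset vector from the first two words of that memory. The sketch below models that selection. The names are hypothetical, the base addresses are assumptions taken from the usual STM32F4 memory map, and the thread does not say what state BOOT1 was in (booting from RAM implies it also read high).

```rust
/// Memory region the chip boots from (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BootSource {
    MainFlash,
    SystemMemory, // ST's built-in bootloader
    EmbeddedSram,
}

/// Boot source selected by the pin levels sampled at reset.
pub fn boot_source(boot0_high: bool, boot1_high: bool) -> BootSource {
    match (boot0_high, boot1_high) {
        (false, _) => BootSource::MainFlash, // BOOT1 is ignored when BOOT0 is low
        (true, false) => BootSource::SystemMemory,
        (true, true) => BootSource::EmbeddedSram,
    }
}

/// Assumed base address of each region on an STM32F405.
pub fn base_address(source: BootSource) -> u32 {
    match source {
        BootSource::MainFlash => 0x0800_0000,
        BootSource::SystemMemory => 0x1FFF_0000,
        BootSource::EmbeddedSram => 0x2000_0000,
    }
}

/// Split the first two words of the boot region into (initial SP, reset vector).
pub fn initial_sp_and_reset(first_words: [u32; 2]) -> (u32, u32) {
    (first_words[0], first_words[1])
}
```

With Boot0 floating high, the core would fetch its vectors from whatever happens to sit at the start of RAM, here the leftover words 0xe00abe00 and 0x62d780d, instead of from the freshly flashed image at 0x0800_0000.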