knurling-rs / probe-run

Run embedded programs just like native ones
Apache License 2.0

Undefined behaviour if probe-run initializes RTT before target #351

Open sakian opened 1 year ago

sakian commented 1 year ago

Describe the bug

The target writes the RTT control block to memory at init (defmt-rtt does this automatically; rtt-target requires calling an init macro). probe-run loads this control block from memory after flashing and starting the application.

From what I can see, the only thing stopping probe-run from reading the control block before the target has written it is a check for the RTT ID at the start of the control block, which the target writes last. However, if a control block from a previous run is still in memory, probe-run might read that old control block before or while the target writes the new one.
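The race described above can be modelled without any hardware. The sketch below is only an illustration and not probe-run's actual code: it assumes a simplified control block header (a 16-byte ID followed by two little-endian `u32` channel counts), and the names `HostView`, `host_read`, `target_init_step`, `stale_ram` and the `MAX_CHANNELS` limit are all hypothetical.

```rust
/// Simplified model of the head of an RTT control block (an assumption for
/// illustration): a 16-byte ID followed by two little-endian u32 counts.
const RTT_ID: [u8; 16] = *b"SEGGER RTT\0\0\0\0\0\0";
const HEADER_LEN: usize = 24;
/// Hypothetical sanity limit on the channel counts.
const MAX_CHANNELS: u32 = 255;

#[derive(Debug, PartialEq)]
pub enum HostView {
    /// ID not present: the host keeps waiting.
    NotReady,
    /// ID present and counts plausible: the host attaches.
    Attached { max_up: u32, max_down: u32 },
    /// ID present but counts implausible: the host reports corruption.
    Corrupted { max_up: u32, max_down: u32 },
}

fn le_u32(bytes: &[u8]) -> u32 {
    u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]])
}

/// What a host sees if it samples `ram` at this instant.
pub fn host_read(ram: &[u8]) -> HostView {
    if ram.len() < HEADER_LEN || ram[..16] != RTT_ID[..] {
        return HostView::NotReady;
    }
    let max_up = le_u32(&ram[16..20]);
    let max_down = le_u32(&ram[20..24]);
    if max_up > MAX_CHANNELS || max_down > MAX_CHANNELS {
        HostView::Corrupted { max_up, max_down }
    } else {
        HostView::Attached { max_up, max_down }
    }
}

/// Target-side init, split into steps so the host can be sampled in between.
/// Step 0 writes the counts; any later step writes the ID (last).
pub fn target_init_step(ram: &mut [u8], step: u32) {
    match step {
        0 => {
            ram[16..20].copy_from_slice(&1u32.to_le_bytes());
            ram[20..24].copy_from_slice(&1u32.to_le_bytes());
        }
        _ => ram[..16].copy_from_slice(&RTT_ID),
    }
}

/// RAM as a previous run might leave it: the ID is still valid, but the
/// fields behind it hold garbage (values chosen to match the first log below).
pub fn stale_ram() -> [u8; HEADER_LEN] {
    let mut ram = [0u8; HEADER_LEN];
    ram[..16].copy_from_slice(&RTT_ID);
    ram[20..24].copy_from_slice(&15852u32.to_le_bytes());
    ram
}
```

With zeroed RAM, `host_read` returns `NotReady` until the target has finished both steps, so the "ID written last" ordering is enough. With `stale_ram()`, the ID check passes immediately and the host sees `Corrupted { max_up: 0, max_down: 15852 }`, because the ID no longer says anything about whether the current run has written the rest of the block.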

This can cause unexpected behaviour, such as a "Control block corrupted" error during init or when reading from the RTT up channel, or corrupted output (only seen when using defmt). Some examples are below.

    Finished dev [unoptimized + debuginfo] target(s) in 0.09s
     Running `probe-run --chip ATSAMDA1G16B target\thumbv6m-none-eabi\debug\atsamda1_test`
(HOST) INFO  flashing program (293 pages / 18.31 KiB)
(HOST) INFO  success!
Error: Control block corrupted: Nonsensical array sizes at 20000000: max_up_channels=0 max_down_channels=15852
error: process didn't exit successfully: `probe-run --chip ATSAMDA1G16B target\thumbv6m-none-eabi\debug\atsamda1_test` (exit code: 1)

    Finished dev [unoptimized + debuginfo] target(s) in 0.09s
     Running `probe-run --chip ATSAMDA1G16B target\thumbv6m-none-eabi\debug\atsamda1_test`
(HOST) INFO  flashing program (293 pages / 18.31 KiB)
(HOST) INFO  success!
────────────────────────────────────────────────────────────────────────────────
RTT error: Control block corrupted: write pointer is 4 while buffer size is 1 for "up" channel 0 ()
────────────────────────────────────────────────────────────────────────────────
Error: An error with the usage of the probe occurred

Caused by:
    Operation timed out
error: process didn't exit successfully: `probe-run --chip ATSAMDA1G16B target\thumbv6m-none-eabi\debug\atsamda1_test` (exit code: 1)

(HOST) DEBUG Programmed page of size 64 bytes in 25 ms
└─ probe_run @ src\main.rs:114
(HOST) INFO  success!
└─ probe_run @ src\main.rs:132
(HOST) DEBUG 7108 bytes of stack available (0x20000438 ..= 0x20001FFC), using 712 byte canary
└─ probe_run::canary @ src\canary.rs:87
(HOST) TRACE setting up canary took 0.030s (23.28 KiB/s)
└─ probe_run::canary @ src\canary.rs:101
(HOST) DEBUG starting device
└─ probe_run @ src\main.rs:190
(HOST) DEBUG Successfully attached RTT
└─ probe_run @ src\main.rs:412
────────────────────────────────────────────────────────────────────────────────
♥☺☺j♥☺☻j♥☺♥j♥☺♦j♥☺♣j♥☺♠j♥☺j♥j♥☺ j♥☺
j♥☺
j♥☺
j♥☺j♥☺j♥☺►j♥☺◄j♥☺↕j♥☺‼j♥☺¶j♥☺§j♥☺▬j♥☺↨j♥☺↑j

To Reproduce

Steps to reproduce the behaviour:

  1. Clone this test repo: https://github.com/sakian/atsamda1_test
  2. Connect to a SAMD21 or SAMDA1 chip through a J-Link
  3. Run cargo run

Expected and observed behaviour

Expected: the program outputs Started after flashing. Observed: one of the errors or the corrupted output shown above.

config.toml

The contents of your project's .cargo/config.toml file:

runner = "probe-run --chip ATSAMD21J16B" # Almost identical to the ATSAMDA1G16B
#runner = "probe-run --chip ATSAMDA1G16B --verbose"
rustflags = [
  "-C",
  "link-arg=-Tlink.x",
  # This is needed if your flash or ram addresses are not aligned to 0x10000 in memory.x
  # See https://github.com/rust-embedded/cortex-m-quickstart/pull/95
  "-C",
  "link-arg=--nmagic",
]

[build]
target = "thumbv6m-none-eabi"

# [env]
# DEFMT_LOG = "info"
# DEFMT_RTT_BUFFER_SIZE = "64"

Probe details

[0]: J-Link (J-Link) (VID: 1366, PID: 0101, Serial: 000821011390, JLink)

Operating System: Windows 10

Urhengulas commented 1 year ago

Thank you for reporting. Do you have any solutions in mind? Would it help if we reset the RTT ID before running?
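One way to picture the "reset the RTT ID before running" idea is the sketch below. It is a hedged illustration only: target RAM is modelled as a byte slice, the control block offset is assumed to be known already, and `id_present` and `clear_stale_id` are hypothetical helpers rather than existing probe-run or probe-rs functions. In a real host tool the zeroing would be a memory write through the probe, done after flashing and before the core is started.

```rust
/// The RTT ID as it appears at the start of a control block.
const RTT_ID: [u8; 16] = *b"SEGGER RTT\0\0\0\0\0\0";

/// Returns true if a (possibly stale) RTT ID sits at `offset` in `ram`.
pub fn id_present(ram: &[u8], offset: usize) -> bool {
    ram.get(offset..offset + RTT_ID.len())
        .map_or(false, |id| id == &RTT_ID[..])
}

/// Host-side step to run before the core starts: zero the ID so that an old
/// control block can no longer pass the ID check.
/// Returns true if a stale ID was found and cleared.
pub fn clear_stale_id(ram: &mut [u8], offset: usize) -> bool {
    if !id_present(ram, offset) {
        return false;
    }
    ram[offset..offset + RTT_ID.len()].fill(0);
    true
}
```

After `clear_stale_id` has run, the ID can only reappear once the target's own init has written it, which restores the guarantee that the ID is written last.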

sakian commented 1 year ago

I haven't had much time to look at this recently. I've added a delay to probe-run which has significantly reduced how often this happens, but is obviously not a great solution. I'm hopeful that I can return to this issue at some point and give it more thought.
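For comparison, a fixed delay can be generalised into a bounded retry loop. The sketch below is an assumption about what such a workaround might look like, not the change actually made to probe-run: `attach_with_retry` is a hypothetical helper, and `try_attach` stands in for whatever call reads and validates the control block.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `try_attach` until it succeeds or `max_attempts` is reached,
/// sleeping for `delay` between attempts. At least one attempt is always made.
/// On failure, the error from the last attempt is returned.
pub fn attach_with_retry<T, E>(
    mut try_attach: impl FnMut() -> Result<T, E>,
    max_attempts: u32,
    delay: Duration,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match try_attach() {
            Ok(value) => return Ok(value),
            // Out of attempts: give up and surface the last error.
            Err(err) if attempt >= max_attempts => return Err(err),
            // Otherwise wait a little and try again.
            Err(_) => {
                attempt += 1;
                sleep(delay);
            }
        }
    }
}
```

Like the plain delay, this only makes the failure less likely. A stale control block that happens to look valid would still be accepted on the first attempt, so it does not remove the underlying race.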