blackmagic-debug / blackmagic

In application debugger for ARM Cortex microcontrollers.
GNU General Public License v3.0

Read memory while running #1532

Open pavel-kirienko opened 1 year ago

pavel-kirienko commented 1 year ago

Is it practically possible to support reading memory while the target is running in the background? This is highly useful for real-time state monitoring.

Currently, the client reports that this is not implemented:

(gdb) cont&
Continuing.
(gdb) x 0x20000010
0x20000010:     Cannot execute this command while the target is running.
Use the "interrupt" command to stop the target
and then try again.
dragonmux commented 1 year ago

This is a subject that we've been looking into off-and-on for a while now and unfortunately it's extremely complicated.

There are two parts at play here: supporting the GDB remote protocol components required to correctly support non-stop mode, and support for this from the targets themselves.

The latter is the complication - not all Cortex-M cores support this mode of operation, in fact quite a few of the supported targets categorically do not, and it requires specific versions of the CoreSight components. RISC-V cores almost universally do not either, nor do AVR cores, and support for both is upcoming.

If we are working with a core that does support this, then we are also left with no way to communicate to GDB that this is the case, due to order of operations. We only get a chance to tell GDB that we support non-stop operation when you initiate the remote protocol connection, and we cannot alter those parameters afterwards, either when you run a scan for your target or when you attach (even though we can do things like tell GDB about the register layout of the target, and the memory maps, at that point).
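
To make the ordering problem concrete, here is a minimal Python sketch of how a stub's feature reply is framed in the GDB remote serial protocol. The framing (`$`, payload, `#`, two hex digits of modulo-256 checksum) and the `QNonStop+` feature token are taken from the GDB documentation; the function name `frame_packet` and the example feature string are illustrative assumptions, not Black Magic Debug's actual code or reply.

```python
def frame_packet(payload: str) -> str:
    """Wrap a payload in GDB remote serial protocol framing: $<payload>#<checksum>."""
    # The checksum is the sum of all payload bytes modulo 256, sent as two hex digits.
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


# Hypothetical feature list a stub could send in reply to GDB's qSupported query.
# The reply goes out when the connection is opened, before any target has been
# scanned or attached, so "QNonStop+" would have to be promised without knowing
# whether the target can actually support it.
ILLUSTRATIVE_FEATURES = "PacketSize=400;QNonStop+"

if __name__ == "__main__":
    print(frame_packet(ILLUSTRATIVE_FEATURES))
```

The point of the sketch is only the timing: the feature string is fixed at connection time, which is earlier than the scan and attach steps where the probe learns what the target is.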

What we're getting at is that it might be possible, but you're hitting on both a GDB protocol limitation and target limitations as to why we haven't tried changing things to attempt to support this for now.

ALTracer commented 1 year ago

real-time state monitoring

Do you want to access memory space while the target is running, like STM32CubeIDE Atollic TrueStudio "Live Expressions" with STLINK or Ozone live watch with JLink? This is possible even on Cortex-M0 with ADIv5.1 via its DAP, although the PE has priority over the DAP on AHB accesses (including the default AHB SRAM at 0x20000000). It doesn't halt the core. Bigger cores have MEM-AP. See the DP SELECT register APSEL field description. Note to self: I haven't tested this with the plethora of CMSIS-DAP implementations, and the three mentioned adapter architectures require a corresponding gdbserver on the PC, unlike Blackmagic Debug.
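
As a rough illustration of the APSEL field mentioned above, the following Python sketch builds a DP SELECT value. The field positions (APSEL in bits [31:24], APBANKSEL in bits [7:4], DPBANKSEL in bits [3:0]) follow my reading of the ADIv5 specification and should be checked against it for the DP version in use; the helper name `dp_select` is hypothetical and is not part of Blackmagic Debug.

```python
def dp_select(apsel: int, apbanksel: int = 0, dpbanksel: int = 0) -> int:
    """Compose an ADIv5 DP SELECT register value (assumed field layout, see lead-in)."""
    if not 0 <= apsel <= 0xFF:
        raise ValueError("APSEL is an 8-bit field")
    if not 0 <= apbanksel <= 0xF:
        raise ValueError("APBANKSEL is a 4-bit field")
    if not 0 <= dpbanksel <= 0xF:
        raise ValueError("DPBANKSEL is a 4-bit field")
    # APSEL picks which access port later AP accesses are routed to.
    return (apsel << 24) | (apbanksel << 4) | dpbanksel


if __name__ == "__main__":
    # Select AP 0, register bank 0 (where a MEM-AP keeps CSW, TAR and DRW).
    print(f"0x{dp_select(0):08x}")
    # Select AP 1, register bank 0xF (where the AP identification register lives).
    print(f"0x{dp_select(1, 0xF):08x}")
```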

However, in BMP's built-in gdbserver only GDB all-stop mode is supported.

I can recommend instead using

Sorry for not answering the question but trying to solve a higher-level (X/Y) problem. Since you're here, would you please mention which target you are working with now? And maybe revision #970?

Reference to GDB all-stop: https://mcuoneclipse.com/2019/01/13/gdb-all-stop-and-non-stop-mode-with-linkserver/

Reference to CubeIDE SWV: https://www.codeinsideout.com/blog/stm32/swv/

pavel-kirienko commented 1 year ago

Denis, thank you for the suggestions. Yes, your understanding of my intentions is correct -- I want to monitor specific memory addresses in real-time without halting code execution. The alternatives are good to know of, but sadly none are applicable in my case, as I can't halt the core (not even briefly, as it interferes with a hard real-time process), and I need to observe the values rather than the fact of data access. I already have a certain intrusive solution similar to RTT, but it is difficult to work with for other reasons.

Thank you for reminding me about #970, I should look into that once I get back to the project where the issue was first detected.

ALTracer commented 1 year ago

I can't halt the core (not even briefly, as it interferes with a hard real-time process), and I need to observe the values rather than the fact of data access

Then allow me to continue on the off-topic.

  1. Of course, you could dedicate some hours (weeks?) to contribute a working solution for non-stop debugging to Blackmagic Debug for src/target/cortexm and src/target/cortexa, as these are the most likely target families to benefit from this, while keeping in mind the impending Atmel JTAG-PDI and RISC-V SDI (also WCH RVSWD) implementation support PRs. This would provide your native BMD probe a link (USB Full Speed 12 Mbit/s CDC-ACM is in the ballpark of 200KiB/s) to methodically poke target memory space without halting the target.

  2. You still didn't mention your target. If it's small LQFP64 (so ETM-less), then I'd set up DMA mem2mem from that variable into an RTT-like circular buffer, which gets dumped by other means: a second MCU (stlink/v3e class, stm32f723), even a BMP with RTT (50 kchars/s mentioned in UsingRTT.md). This is less intrusive but may load the AHB interconnect depending on exact target SRAM banks split.

  3. If it's a giant STM32H7 or STM32F4 in LQFP100/144/176 package, it should have ETM (Parallel, synchronous) Trace TRACED[0:3] pins available on GPIOE which (assuming PCB quality and cable comparable to MII 25MHz 4+1-pin Ethernet length-matched requirements gets 100Mbit/s) will throw a lot of binary traffic directly at a, say, ORBTrace Mini (FPGA with USB High-Speed 480 Mbit/s) for Orbuculum to decode. I think local devs might agree with plugging a different project here (for the different purpose -- tracing vs. interactive halting debugging).

What's the desired sample rate? 1MHz, 10M, 100M? A single uint32_t variable with a white-noise value distribution (every bit may change on the next cycle) changing at 100MHz requires a 3200 Mbit/s link (32 bits x 100 MHz); examples like DWT_CYCCNT or the PC register have much less entropy. So please describe the requirements. With BMP you're not very likely to sample much faster than (rounding down) 10kHz. Recent measurements point to about 4MHz SWD at best, and this is the SWCLK rate, not the incoming payload data rate.

  4. I tried adamgreen/mri on an stm32f411ce board and it seems to provide an (internal to target firmware) monitor (non-stop) debug interface. So the critical interrupts will continue to be serviced by the rest of firmware, but you can halt/preempt the main thread (the major endless loop) by sending GDB RSP commands directly into a special UART and that invokes the MRI handler. This solution is very intrusive but (when integrated) works without SWJ-DP adapters at all.
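
The sample-rate arithmetic in this comment can be sketched in a few lines of Python. The link figures are the nominal rates named in the comment (USB Full Speed, USB High-Speed, 4 MHz SWCLK); the function names are illustrative, and the results are upper bounds that ignore all protocol overhead, so real rates will be far lower (the comment's own estimate for BMP is around 10 kHz).

```python
def required_bitrate(bits_per_sample: int, sample_rate_hz: int) -> int:
    """Raw bit rate needed to stream every sample of a fully random variable."""
    return bits_per_sample * sample_rate_hz


def max_sample_rate(link_bitrate: int, bits_per_sample: int) -> float:
    """Upper bound on samples per second for a link, ignoring protocol overhead."""
    return link_bitrate / bits_per_sample


# Nominal raw rates quoted in the discussion, in bits (or clock cycles) per second.
LINKS = {
    "USB Full Speed": 12_000_000,
    "USB High-Speed": 480_000_000,
    "SWD at 4 MHz SWCLK": 4_000_000,
}

if __name__ == "__main__":
    need = required_bitrate(32, 100_000_000)
    print(f"uint32_t at 100 MHz needs {need / 1e6:.0f} Mbit/s")
    for name, rate in LINKS.items():
        print(f"{name}: at most {max_sample_rate(rate, 32):,.0f} samples/s")
```
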
dragonmux commented 1 year ago

@ALTracer It is not universal that you can read memory on all Cortex-M0 cores when they're running. That's part of the problem we face, as we carefully outlined in our response. It depends on more factors, including what the device is.

pavel: a review of that issue, and of whether it's still a problem, would be fantastic. We suspect that with things like the ADIv5 halt correctness PR we have probably fixed that issue, but we couldn't prove it nor recreate the circumstances.

perigoso commented 1 year ago

I haven't looked into this in particular in the GDB protocol, but one very non ideal option might be to report the functionality always, but with targets that don't support it just don't report changes, like the processor wasn't doing anything, or doing so just when it is halted for some reason (I'm just throwing clay at the wall here, I'm not well contextualized)

koendv commented 10 months ago

It depends upon what is sufficient for your needs. You could

This would allow you to poll variable contents about 50 times per second, without halting the target. HTH

dragonmux commented 10 months ago

That all assumes, mind, a core that has support for that - not all the Cortex-M cores do and none of the Cortex-A or -R cores do by definition of how they read/write memory. Though, yes, those are options if that assumption holds for the user's core.

koendv commented 10 months ago

The question is always: do you have running code? This patch adds a "mon memwatch" command. I've tested this on black pill: make PROBE_HOST=blackpill-f411ce ENABLE_RTT=1

The arguments to "mon memwatch" are memory addresses and formats (/d, /u, /x). You can watch up to 8 memory addresses at the same time. In gdb:

(gdb) p &counter
$1 = (<data variable, no debug info> *) 0x20000224 <counter>
(gdb) mon memwatch /d 0x20000224
0x20000224 
(gdb) r

When the watched variable changes, output is written to the serial port:

0x20000224: 0
0x20000224: 1
0x20000224: 2
0x20000224: 3
0x20000224: 4

To switch off, enter "mon memwatch" without arguments. With both target and bmp on a stm32f411, memory is polled 1000 times per second. Opinion? memwatch.txt (22 oct: command name changed to "memwatch", to avoid confusion with gdb "watch", as per suggestion below)
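
The behaviour described above (poll a small set of addresses, report only when a value changes) can be sketched as follows. This is a host-side Python simulation under stated assumptions, not the patch itself: `read_word` stands in for whatever non-halting memory read the probe performs, `poll_for_changes` is a hypothetical name, and only the 8-address limit and the `address: value` output shape are taken from the comment.

```python
from typing import Callable, Dict, Iterable, List

MAX_WATCHES = 8  # limit stated in the comment above


def poll_for_changes(
    read_word: Callable[[int], int], addresses: Iterable[int], polls: int
) -> List[str]:
    """Poll each address `polls` times and report a line whenever its value changes."""
    watch_list = list(addresses)
    if len(watch_list) > MAX_WATCHES:
        raise ValueError(f"at most {MAX_WATCHES} addresses can be watched")
    last_seen: Dict[int, int] = {}
    report: List[str] = []
    for _ in range(polls):
        for addr in watch_list:
            value = read_word(addr)
            # The first read always reports; later reads report only on change.
            if last_seen.get(addr) != value:
                last_seen[addr] = value
                report.append(f"0x{addr:08x}: {value}")
    return report


if __name__ == "__main__":
    # Simulated target: the counter is read six times and changes twice.
    samples = iter([0, 0, 1, 1, 1, 2])
    for line in poll_for_changes(lambda addr: next(samples), [0x20000224], polls=6):
        print(line)
```

Because changes are detected by comparing successive polls, any value that changes and changes back between two polls is missed; the polling rate bounds what can be observed.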

pavel-kirienko commented 10 months ago

@koendv I presume the watchpoint polling is a non-halting operation that does not interfere with code execution (aside from a possible contention on the bus matrix)?

koendv commented 10 months ago

@pavel-kirienko the watchpoint polling steals bus cycles; the effect is similar to dma.

koendv commented 10 months ago

I've made a small change to lower probe flash use. The patch now costs 424 bytes probe flash, 196 bytes probe ram.

dragonmux commented 10 months ago

We'd prefer to call that a "datawatch" or similar because GDB already has watchpoints and we do not want to confuse the user between the two.

GDB's watchpoints are set using watch *<addr>, awatch *<addr>, and rwatch *<addr>, then inspected using info watch. These halt the target when triggered and display which value changed (from + to) along with the triggering program address in GDB.

We're also very hesitant to duplicate that functionality needlessly, particularly as that uses hardware in the target to function and doesn't have the aforementioned "not all targets support unhalted memory accesses" problem. You can have GDB auto-resume the target when a watchpoint fires, making the interruption minimal.

sidprice commented 10 months ago

I would like to up-vote this PR; this is a feature I use a lot with other tools. Using the GDB watch feature can, and sometimes does, disrupt the target MCU, which may be critical in some applications.

Just my 10-cents :)

pavel-kirienko commented 10 months ago

You can have GDB auto-resume the target when a watchpoint fires, making the interruption minimal.

@dragonmux This is not a viable substitute for the solution proposed by @koendv because there exists a large set of (hard) real-time applications (control systems mostly, I imagine), where even a 1 ms interruption makes the debugged system dysfunctional, and in practice the halt/read/resume sequence takes much more than 1 ms.

dragonmux commented 10 months ago

We understand that point, pavel, however - we cannot ignore that only a portion of the targets supported and none of the new families coming in to v2.0 support this. Our suggestion is a workable option for right now that covers non-hard-realtime systems. It's even a question to ask: should BMD bend over backwards to support hard real-time applications or, given the problems we keep trying to point out, should it be considered outside of the purview of the project? We understand that would be less than ideal for you, but until core fundamental problems with this approach are fixed it would not be mergeable.

We must not only fix the incorrect heuristic governing when stopless operations are attempted, but figure out a way of providing the functionality wanted without it costing a ton of extra Flash space and being made wrongly available for targets which then don't just not work, but actually produce ADIv5 FAULT conditions on their debug interfaces and permanently go undebuggable till power cycled.

Another point towards that is that we need to catalogue which Cortex-M core types you are using that work. We already know of at least 2 that do not support this operation - it's not that they just bus cycle steal or give WAIT, they crash the debug interface, so they must be avoided. If you can tell us which cores you are using that do work, it'll help build this picture up.