blackmagic-debug / blackmagic

In application debugger for ARM Cortex microcontrollers.
GNU General Public License v3.0

Errors using SWD while trying to erase blocks on a Tiva-C target #958

Closed dragonmux closed 2 years ago

dragonmux commented 2 years ago

Using the latest source for BMP, we get the following errors with a native BMP over SWD when trying to erase blocks on a target (TM4C123GH6PM) but not over JTAG:

❯ src/blackmagic -E 0x2000
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x40000ff8
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x40043200
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x40048024
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x4004804c
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x400e0940
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x400e0740
❯ src/blackmagic -j -E 0x2000
❯ 

Additionally, if this is immediately followed by an info dump over SWD the following additional errors are generated:

❯ src/blackmagic -t
INFO: Open USB 046d:082d class ef failed
BMP hosted v1.7.1-243-g9cfc854-dirty
 for ST-Link V2/3, CMSIS_DAP, JLINK and LIBFTDI/MPSSE
Using 1d50:6018 8BB209F2 Black Sphere Technologies
 Black Magic Probe  c50f317-dirty
Running in Test Mode
Target voltage: 3.3V Volt
Speed set to  3.2727 MHz for SWD
Exception: SWDP invalid ACK
Trying old JTAG to SWD sequence
DPIDR 0x2ba01477 (v1 rev2)
RESET_SEQ failed
AP   0: IDR=24770011 CFG=00000000 BASE=e00ff003 CSW=23000040 (AHB-AP var1 rev2
Halt via DHCSR: success 00030003 after 1ms
ROM: Table BASE=0xe00ff000 SYSMEM=0x00000001, designer 43b Partno 4c4
0 0xe000e000: Generic IP component - Cortex-M4 SCS (System Control Space) (PIDR = 0x04000bb00c  DEVTYPE = 0x00 ARCHID = 0x0000)-> cortexm_probe
CPUID 0x410fc241 (M4 var 0 rev 1)
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x40000ff8
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x40043200
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x40048024
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x4004804c
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x400e0940
remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x400e0740
1 0xe0001000: Generic IP component - Cortex-M3 DWT (Data Watchpoint and Trace) (PIDR = 0x04003bb002  DEVTYPE = 0x00 ARCHID = 0x0000)
2 0xe0002000: Generic IP component - Cortex-M3 FBP (Flash Patch and Breakpoint) (PIDR = 0x04002bb003  DEVTYPE = 0x00 ARCHID = 0x0000)
3 0xe0000000: Generic IP component - Cortex-M3 ITM (Instrumentation Trace Module) (PIDR = 0x04003bb001  DEVTYPE = 0x00 ARCHID = 0x0000)
4 0xe0040000: Debug component - Cortex-M4 TPIU (Trace Port Interface Unit) (PIDR = 0x04000bb9a1  DEVTYPE = 0x11 ARCHID = 0x0000)
5 0xe0041000: Debug component - Cortex-M4 ETM (Embedded Trace) (PIDR = 0x04000bb925  DEVTYPE = 0x13 ARCHID = 0x0000)
ROM: Table END
***  1      TI Stellaris/Tiva M4
RAM   Start: 0x20000000 length = 0x10000
Flash Start: 0x00000000 length = 0x80000 blocksize 0x400

Except for "RESET_SEQ failed", the JTAG version of this cleanly runs and succeeds.

UweBonnes commented 2 years ago

DX-MON, can you please check and report whether this really is an error/bug? Does the erase not happen? Is the device locked up afterwards? To me it seems that addresses are (need to be?) probed wildly and access to some of these addresses fails, but in the end the device is (hopefully) detected correctly? JTAG reporting of access problems/errors probably happens in some other way, perhaps not yet evaluated correctly or not yet reported. Access errors are the price we have to pay for BMP autodetecting devices. Perhaps a better classification is needed/helpful.

dragonmux commented 2 years ago

The erase does not happen. This is a bug, and we carefully checked what we found before posting and classifying it. It is classified correctly, as these issues do not happen when using BMP native itself from GDB: this is a bug in hosted.

Additionally, the addresses selected are definitely not random (they are wrong, but not random). We will be getting @esden hardware to test this more fully as part of the CI setup and looking into it once we've finished another feature that we're in the process of writing.

UweBonnes commented 2 years ago

Can you bisect?

UweBonnes commented 2 years ago

Maybe this does not fix the erase problem, but a bad access can upset the core.

Setting a breakpoint on the emitted warnings shows:

#3  0x000000000041d36e in lpc546xx_probe (t=t@entry=0x6130000003c0) at target/lpc546xx.c:105
#4  0x0000000000411da1 in cortexm_probe (ap=ap@entry=0x604000003a10) at target/cortexm.c:458
#5  0x00000000004053f9 in adiv5_component_probe (ap=ap@entry=0x604000003a10, addr=addr@entry=3758153728, recursion=recursion@entry=1, 

So while probing for LPCxx, the TM4C gets upset. We can get around the "remote_ap_mem_read" warnings by probing for it before LPC, but then the LPC gets upset. I just tested. That's a catch-22. How to resolve? Should we give scan a hint like "mon swd lmi"?
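
To make the ordering problem and the hint idea concrete, here is a minimal C sketch. It is not BMP code: `sim_target_s`, `probe_s`, `sim_read32` and `scan` are hypothetical names, and the addresses and ID values used with them are placeholders rather than real LPC or Tiva registers. It assumes each family probe reads one ID register, and that a read of an address the target does not map counts as a fault (the analogue of the REMOTE_RESP_ERR warnings above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simulated target: only `mapped_addr` can be read without a fault. */
typedef struct {
	uint32_t mapped_addr;
	uint32_t id_value;
	unsigned faults; /* number of reads that hit an unmapped address */
} sim_target_s;

/* Hypothetical probe descriptor: family name, ID register address, expected ID. */
typedef struct {
	const char *family;
	uint32_t id_addr;
	uint32_t expected_id;
} probe_s;

/* Simulated memory read: fails and counts a fault for any unmapped address. */
static bool sim_read32(sim_target_s *t, uint32_t addr, uint32_t *value)
{
	if (addr != t->mapped_addr) {
		++t->faults;
		return false;
	}
	*value = t->id_value;
	return true;
}

/*
 * Try each probe in list order and return the first family whose ID matches.
 * When `hint` is non-NULL, probes for every other family are skipped, so
 * their ID registers are never touched.
 */
static const char *scan(sim_target_s *t, const probe_s *probes, size_t count, const char *hint)
{
	assert(t != NULL && probes != NULL);
	for (size_t i = 0; i < count; ++i) {
		if (hint != NULL && strcmp(hint, probes[i].family) != 0)
			continue;
		uint32_t id = 0;
		if (sim_read32(t, probes[i].id_addr, &id) && id == probes[i].expected_id)
			return probes[i].family;
	}
	return NULL;
}
```

With a probe list of "lpc" followed by "lmi" and a simulated target that only maps the "lmi" ID register, an unhinted scan still detects the target but records one faulting read on the way. Passing "lmi" as the hint skips the other probe and records none. Reordering the list only moves the faulting read to whichever family is probed second, which is the catch-22 described above.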

DX-MON can you test the erase with https://github.com/UweBonnes/blackmagic/tree/lmi?