blackmagic-debug / blackmagic

In application debugger for ARM Cortex microcontrollers.
GNU General Public License v3.0

Cannot access global memory with stm32H743 target #701

Closed bobbatcomcastdotnet closed 2 years ago

bobbatcomcastdotnet commented 4 years ago

I am having a problem with my BMP accessing global variables. My hardware is based on the STM32H743 chip. The application I am debugging is rather large (~300K code). I am using a native BMP with upgraded firmware:

Black Magic Probe (Firmware 54ee00b) (Hardware Version 3) Copyright (C) 2015 Black Sphere Technologies Ltd. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html

The same problem occurs with a Bluepill version of the BMP.

The error GDB returns is "access to memory location 0xxxxx is denied". However, if I do a memory dump I can access that memory address without problems.

This problem does not always occur, and it only seems to happen when accessing global variables, particularly structures. I am using the GDB 'print /x variable' command to access the variables. In addition, I have only seen this problem when debugging the STM32H743 code. I have never seen it when debugging my STM32F407-based code, which is functionally the same as the STM32H743 version.

One final note: the jlink probe does not have this problem. It always displays the variables correctly in the same environment.

UweBonnes commented 4 years ago

@sidprice: NUCLEO-H7 boards are easily available.

@bobbatcomcastdotnet It is a pity that Gareth is not around. I remember that he once mentioned MI2. I do not know if that was in the context of RTOS support or with debugging. If you suspect problems with the GDB protocol, enable protocol debugging in gdb with "set debug remote 1" to see the packet exchange.

sidprice commented 4 years ago

@bobbatcomcastdotnet A while ago @UweBonnes asked the following:

When you have that problem via 'print /x variable', resolve the address via "p &variable" and try to dump the memory via "x address".

I have not seen any answer to that ... and I tend to agree with @UweBonnes that this looks like either a GDB issue, or at least GDB having issues with the particular ELF file.

Maybe upload the ELF file and tell us a variable you have issues with.

bobbatcomcastdotnet commented 4 years ago

@UweBonnes I will try adding "set debug remote 1" to my GDB setup and see what happens.

@sidprice The question @UweBonnes asked:

When you have that problem via 'print /x variable', resolve the address via "p &variable" and try to dump the memory via "x address".

was answered in the original post by:

The error GDB returns is "access to memory location 0xxxxx is denied". However, if I do a memory dump I can access that memory address without problems.

and I tend to agree with @UweBonnes that this looks like either a GDB issue, or at least GDB having issues with the particular ELF file.

If this were a GDB issue why would the jlink probe not show the exact same problem? That is exactly why I wanted to try the jlink probe before I reported this issue.

sidprice commented 4 years ago

@bobbatcomcastdotnet It is mysterious that BMP treats these variables differently, just struggling to understand why.

I feel there is something missing information-wise, just don't know what that may be, or what questions to ask to elicit the information, sorry.

bobbatcomcastdotnet commented 4 years ago

@sidprice Nothing to be sorry about at all. At the beginning of my IDE development project I spent a great deal of time studying the RSP protocol and the BMP's implementation of it, and yet I too am baffled by this issue.

sidprice commented 4 years ago

I have ordered nucleo-h743 from Mouser, probably won't get here until Thursday, probably quicker than China :)

bobbatcomcastdotnet commented 4 years ago

@sidprice "probably quicker than China :)"

Anything is quicker than China :)

The only problem I see with this board is the way it is clocked. It does not include a crystal for the HSE oscillator. Instead it uses the 8 MHz MCO output from the on-board ST-LINK. That means it will not work with my code as is :(

I have one of these boards so I will try to produce an example app that demonstrates the issue.

bobbatcomcastdotnet commented 4 years ago

@UweBonnes This is a repeat of a post I sent to Discord. I wanted to post it here to continue with this issue.

"I really don't think the MI interface to GDB in and of itself is the problem I experienced with the H7. But maybe the GDB issues a different set of RSP commands when MI is being used and it is those RSP commands that are not being handled properly. In any case the key to finding the problem lies in determining exactly what RSP packets are being sent to the BMP and what the BMP responds with when the problem occurs. Since it appears that the BMP-hosted application for Windows does not seem to work a different method of collecting this information is necessary. When I was developing my IDE I once had some code that inserted a set of virtual COM ports between GDB and the BMP so I could capture the actual RSP packet communications. Perhaps I can resurrect that code so that we can find out what RSP packets are being passed between GDB and BMP when the problem occurs."

sidprice commented 4 years ago

@bobbatcomcastdotnet My Nucleo is supposed to arrive today, assuming I can believe FedEx. Do you have a small program to reproduce?

UweBonnes commented 4 years ago

I have also H7 boards to test, so send to

bobbatcomcastdotnet commented 4 years ago

@sidprice I have a problem with the default oscillator source for the system clock on the Nucleo-H743ZI2 board. As I mentioned earlier, that board does not have a crystal for HSE; it uses the 8 MHz MCO output from the built-in ST-LINK. I have tried to create a program to run on that board, but so far I have not had any success. My app gets stuck in a tight loop waiting for the HSE oscillator to start. I will keep trying as time permits, and perhaps you will have better luck getting something to work. The goal is to get a clock running at 480 MHz.

bobbatcomcastdotnet commented 4 years ago

I have collected more data about this problem. I added a 'set debug remote 1' command so GDB would return more information.

Here is a transaction for reading the contents of structure pointed to by phost:

Sent by: IDE at 3:05 PM (GetVariableValue): print /x *phost

Sent by: GDB at 3:05 PM &"print /x *phost\n" Sent by: GDB at 3:05 PM &"Sending packet: $m240005a8,200#ef..." Sent by: GDB at 3:05 PM &"Ack\n" Sent by: GDB at 3:05 PM &"Packet received: 00000100000040000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\n" Sent by: GDB at 3:05 PM &"Sending packet: $m240007a8,200#f1..." Sent by: GDB at 3:05 PM &"Ack\n" Sent by: GDB at 3:05 PM &"Packet received: E01\n" Sent by: GDB at 3:05 PM &"Cannot access memory at address 0x240007a8\n" Sent by: GDB at 3:05 PM ^error,msg="Cannot access memory at address 0x240007a8" Sent by: GDB at 3:05 PM (gdb) Sent by: GDB at 3:05 PM

Raw response=> &"print /x *phost\n"&"Sending packet: $m240005a8,200#ef..."&"Ack\n"&"Packet received: 00000100000040000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\n"&"Sending packet: $m240007a8,200#f1..."&"Ack\n"&"Packet received: E01\n"&"Cannot access memory at address 0x240007a8\n"^error,msg="Cannot access memory at address 0x240007a8"(gdb)

Note that the request for the second block of data, at address 0x240007a8, returned an E01 error packet.
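The framing of the two read requests in this trace can be reproduced by hand. The following is a minimal sketch, assuming the standard GDB remote serial protocol framing (`$payload#checksum`, where the checksum is the modulo-256 sum of the payload bytes). The helper names `rsp_checksum`, `frame_read_memory` and `is_error_reply` are made up for illustration and are not part of GDB or the BMP firmware.

```python
def rsp_checksum(payload: str) -> int:
    """Modulo-256 sum of the payload bytes, as used in GDB remote packets."""
    return sum(payload.encode("ascii")) % 256


def frame_read_memory(addr: int, length: int) -> str:
    """Build a framed `m addr,length` packet: $<payload>#<two hex digits>."""
    payload = f"m{addr:x},{length:x}"
    return f"${payload}#{rsp_checksum(payload):02x}"


def is_error_reply(reply: str) -> bool:
    """True for `Exx` replies such as the E01 in the trace.

    Data replies are hex-encoded bytes and therefore always have an even
    length, so a three-character `E` + two hex digits reply is unambiguous.
    """
    return (
        len(reply) == 3
        and reply[0] == "E"
        and all(c in "0123456789abcdefABCDEF" for c in reply[1:])
    )


print(frame_read_memory(0x240005A8, 0x200))  # $m240005a8,200#ef
print(frame_read_memory(0x240007A8, 0x200))  # $m240007a8,200#f1
print(is_error_reply("E01"))                 # True
```

Both generated packets match the ones GDB logged, so the requests themselves look well formed; the E01 is the probe's answer to a valid read.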

Here is another transaction which is a memory dump of address 0x240007a8 length = 512 bytes:

Sent by: IDE at 3:12 PM (ReadMemory): -data-read-memory-bytes 0x240007a8 512

Sent by: GDB at 3:12 PM &"Sending packet: $m240007a8,200#f1..." Sent by: GDB at 3:12 PM &"Ack\n" Sent by: GDB at 3:12 PM &"Packet received: E01\n" Sent by: GDB at 3:12 PM &"Sending packet: $m240007a8,1#90..." Sent by: GDB at 3:12 PM &"Ack\n" Sent by: GDB at 3:12 PM &"Packet received: 00\n" Sent by: GDB at 3:12 PM &"Sending packet: $m240007a9,ff#2c..." Sent by: GDB at 3:12 PM &"Ack\n" Sent by: GDB at 3:12 PM &"Packet received: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\n" Sent by: GDB at 3:12 PM &"Sending packet: $m240008a8,80#c8..." Sent by: GDB at 3:12 PM &"Ack\n" Sent by: GDB at 3:12 PM &"Packet received: 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\n" Sent by: GDB at 3:12 PM &"Sending packet: $m24000928,40#96..." Sent by: GDB at 3:12 PM &"Ack\n" Sent by: GDB at 3:12 PM &"Packet received: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\n" Sent by: GDB at 3:12 PM &"Sending packet: $m24000968,20#98..." Sent by: GDB at 3:12 PM &"Ack\n" Sent by: GDB at 3:12 PM &"Packet received: E01\n" Sent by: GDB at 3:12 PM &"Sending packet: $m24000968,10#97..." Sent by: GDB at 3:12 PM &"Ack\n" Sent by: GDB at 3:12 PM &"Packet received: 00000000000000000000000000000000\n" Sent by: GDB at 3:12 PM &"Sending packet: $m24000978,8#6f..." 
Sent by: GDB at 3:12 PM &"Ack\n" Sent by: GDB at 3:12 PM &"Packet received: 0000000000000000\n" Sent by: GDB at 3:12 PM &"Sending packet: $m24000980,4#64..." Sent by: GDB at 3:12 PM &"Ack\n" Sent by: GDB at 3:12 PM &"Packet received: 00000000\n" Sent by: GDB at 3:12 PM &"Sending packet: $m24000984,2#66..." Sent by: GDB at 3:12 PM &"Ack\n" Sent by: GDB at 3:12 PM &"Packet received: 0000\n" Sent by: GDB at 3:12 PM &"Sending packet: $m24000986,1#67..." Sent by: GDB at 3:12 PM &"Ack\n" Sent by: GDB at 3:12 PM &"Packet received: 00\n" Sent by: GDB at 3:12 PM ^done,memory=[{begin="0x240007a8",offset="0x00000000",end="0x24000988",contents="000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000030"}] Sent by: GDB at 3:12 PM (gdb) Sent by: GDB at 3:12 PM

Raw response=> &"Sending packet: $m240007a8,200#f1..."&"Ack\n"&"Packet received: E01\n"&"Sending packet: $m240007a8,1#90..."&"Ack\n"&"Packet received: 00\n"&"Sending packet: $m240007a9,ff#2c..."&"Ack\n"&"Packet received: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\n"&"Sending packet: $m240008a8,80#c8..."&"Ack\n"&"Packet received: 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\n"&"Sending packet: $m24000928,40#96..."&"Ack\n"&"Packet received: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\n"&"Sending packet: $m24000968,20#98..."&"Ack\n"&"Packet received: E01\n"&"Sending packet: $m24000968,10#97..."&"Ack\n"&"Packet received: 00000000000000000000000000000000\n"&"Sending packet: $m24000978,8#6f..."&"Ack\n"&"Packet received: 0000000000000000\n"&"Sending packet: $m24000980,4#64..."&"Ack\n"&"Packet received: 00000000\n"&"Sending packet: $m24000984,2#66..."&"Ack\n"&"Packet received: 0000\n"&"Sending packet: $m24000986,1#67..."&"Ack\n"&"Packet received: 
00\n"^done,memory={begin="0x240007a8",offset="0x00000000",end="0x24000988",contents="000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000030"}

Sent by: DEBUG at 3:12 PM

Response was processed here. Last command = ReadMemory <<<<<<

Note that GDB received an error from the BMP but retried using different data lengths until the entire block was received.
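To see at a glance which reads the probe rejected, the `set debug remote 1` output can be scanned for `m` requests that were answered with an `Exx` packet. This is only a sketch: the regular expression assumes the "Sending packet: ... Packet received: ..." wording shown above, `failed_reads` is a hypothetical helper name, and `SAMPLE` is a shortened excerpt of the trace with the long data replies left out.

```python
import re

# One `m` request followed by its reply, as printed by `set debug remote 1`.
PAIR_RE = re.compile(
    r"Sending packet: \$m([0-9a-f]+),([0-9a-f]+)#[0-9a-f]{2}"
    r".*?Packet received: ([0-9A-Za-z]*)",
    re.DOTALL,
)


def failed_reads(log_text):
    """Return (address, length) for every `m` read answered with an Exx error."""
    failures = []
    for addr, length, reply in PAIR_RE.findall(log_text):
        if re.fullmatch(r"E[0-9a-fA-F]{2}", reply):
            failures.append((int(addr, 16), int(length, 16)))
    return failures


# Shortened excerpt of the trace (successful data replies trimmed to one byte).
SAMPLE = """\
Sending packet: $m240007a8,200#f1...
Ack
Packet received: E01
Sending packet: $m240007a8,1#90...
Ack
Packet received: 00
Sending packet: $m24000968,20#98...
Ack
Packet received: E01
Sending packet: $m24000968,10#97...
Ack
Packet received: 00000000000000000000000000000000
"""

for addr, length in failed_reads(SAMPLE):
    print(f"read of {length:#x} bytes at {addr:#x} failed")
```

On this excerpt the two failing requests are the 0x200-byte read at 0x240007a8 and the 0x20-byte read at 0x24000968; the smaller reads that follow each of them succeed.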

Just more info...

UweBonnes commented 4 years ago

I can somehow reproduce: Test setup: Nucleo-H743V, Firmware V2J33. I start gdb, attach to the CPU and run: (gdb) x /128x 0x240007a8

Reading fails at different places, e.g.:

gdb_getpacket : m24000980,4
m packet: addr = 24000980, len = 4
ap_mem_write_sized @ e000ef68 len 4, align 4: 80 09 00 24
Send ( 16): f20868ef00e00400.0000000000000000.
Send ( 4): 80090024
Send ( 16): f23e000000000000.0000000000000000.
Rec (12/12)1900000068ef00e0.0000 0000
Send ( 16): f23e000000000000.0000000000000000.
Rec (12/12)1900000068ef00e0.00000000
STLINK_SWD_AP_STICKY_ERROR
Send ( 16): f207800900240400.0000000000000000.
Rec (4/4)30edcda8
Send ( 16): f23e000000000000.0000000000000000.
Rec (12/12)8000000068ef00e0.00000000
stlink_readmem from 24000980 to 857e5e40, len 4
ap_memread @ 24000980 len 4: 30 ed cd a8
Send ( 16): f245ffff04000000.0000000000000000.
Rec (8/8)80000000400000f8
Write DP_ABORT : 0x00000000
Send ( 16): f246ffff00000000.0000000000000000.
Rec (2/2)8000
DP Error 0x00000001

stm32mcuprog can read the area repeatedly.

Similar with BMP:

(gdb) x /128x 0x240007a8
...
Packet received: E01
0x24000858: 0x51bfb512 0x9c910cb8
Cannot access memory at address 0x24000860

and BMP remote: remote_ap_mem_read returned REMOTE_RESP_ERR at apsel 0, addr: 0x24000980

I can not get openocd or pyocd to work on that board.

UweBonnes commented 4 years ago

Running BMP hosted with high level disabled shows that the AP returns FAULT (swdptap_seq_in 3 ticks: 00000004) rather than OK (swdptap_seq_in 3 ticks: 00000001):

Write AP_CSW : 0x0b800052
Write AP_TAR : 0x240007a8
Read AP_DRW : 0x00000000
swdptap_seq_out 8 ticks: 000000bd
!So08bd# K0
!Si03# K4
swdptap_seq_in 3 ticks: 00000004
Read DP_RDBUFF: 0x00000000
ap_memread @ 240007a8 len 4: 00 00 00 00
swdptap_seq_out 8 ticks: 0000008d
!So088d# K0
!Si03# K1
swdptap_seq_in 3 ticks: 00000001

It would be interesting to see what other debuggers do.

!SI20# Kf8000020
swdptap_seq_in_parity 32 ticks: f8000020 OK
Write DP_ABORT : 0x00000004
swdptap_seq_out 8 ticks: 00000081
!So0881# K0
!Si03# K1
swdptap_seq_in 3 ticks: 00000001
swdptap_seq_out_parity 32 ticks: 00000004
!SO204# K0
swdptap_seq_out 2 ticks: 00000000
!So020# K0
DP Error 0x00000020
gdb_putpacket : E01
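The request bytes in this trace (0xbd, 0x8d, 0x81) can be decoded by hand. The sketch below assumes the standard ADIv5 SWD request layout and reads the 3-bit ACK as an LSB-first integer, so that 1 is OK, 2 is WAIT and 4 is FAULT. The names `decode_swd_request`, `DP_REGISTERS` and `ACK_NAMES` are illustrative only and are not taken from the BMP sources.

```python
# ADIv5 SWD request byte, transmitted LSB first:
# bit0 Start(1), bit1 APnDP, bit2 RnW, bit3 A[2], bit4 A[3],
# bit5 Parity, bit6 Stop(0), bit7 Park(1).
ACK_NAMES = {0b001: "OK", 0b010: "WAIT", 0b100: "FAULT"}

# DP registers reachable with DPBANKSEL = 0, keyed by (access, address).
DP_REGISTERS = {
    ("read", 0x0): "DPIDR",
    ("write", 0x0): "ABORT",
    ("read", 0x4): "CTRL/STAT",
    ("write", 0x4): "CTRL/STAT",
    ("read", 0x8): "RESEND",
    ("write", 0x8): "SELECT",
    ("read", 0xC): "RDBUFF",
}

STICKYERR = 1 << 5  # CTRL/STAT sticky error flag
STKERRCLR = 1 << 2  # ABORT bit that clears STICKYERR


def decode_swd_request(byte):
    """Split an 8-bit SWD request into port, access type and register address."""
    port = "AP" if (byte >> 1) & 1 else "DP"
    access = "read" if (byte >> 2) & 1 else "write"
    addr = ((byte >> 3) & 0x3) << 2  # A[3:2] -> byte address 0x0..0xC
    # Parity covers APnDP, RnW, A[2], A[3] (bits 1..4): set when their sum is odd.
    ones = bin((byte >> 1) & 0xF).count("1")
    parity_ok = ((byte >> 5) & 1) == (ones & 1)
    name = DP_REGISTERS.get((access, addr), "?") if port == "DP" else "AP register"
    return port, access, addr, name, parity_ok


for request in (0xBD, 0x8D, 0x81):
    print(hex(request), decode_swd_request(request))

print("ACK 4 ->", ACK_NAMES[4], "| ACK 1 ->", ACK_NAMES[1])
print("STICKYERR set in f8000020:", bool(0xF8000020 & STICKYERR))
```

Read this way, 0xbd is a DP read of RDBUFF that was answered with ACK 4 (FAULT), 0x8d is a DP read of CTRL/STAT whose value f8000020 has the sticky error bit set (matching "DP Error 0x00000020"), and 0x81 followed by the value 4 is a write to ABORT that clears that sticky error.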

bobbatcomcastdotnet commented 4 years ago

@UweBonnes Does this mean that the problem is confirmed via independent sources?

UweBonnes commented 4 years ago

Can you try out:

diff --git a/src/target/cortexm.c b/src/target/cortexm.c
index 10ea6352..ca3cb98e 100644
--- a/src/target/cortexm.c
+++ b/src/target/cortexm.c
@@ -241,7 +241,7 @@ static void cortexm_cache_clean(target *t, target_addr addr, size_t len, bool in

 static void cortexm_mem_read(target *t, void *dest, target_addr src, size_t len)
 {
-       cortexm_cache_clean(t, src, len, false);
+//     cortexm_cache_clean(t, src, len, false);
        adiv5_mem_read(cortexm_ap(t), dest, src, len);
 }

The cache clean is done with each DWORD read and causes the sticky errors or FAULTs. Removing it may lead to data inconsistency, but at least reading works for me. We need people with a deep understanding of the cache to fix that.

bobbatcomcastdotnet commented 4 years ago

@UweBonnes I have no idea what your last comment means???

UweBonnes commented 4 years ago

Even if you do not understand, try commenting out the line above, recompile, reflash the firmware if needed, test and report.

UweBonnes commented 4 years ago

Better: Try https://github.com/UweBonnes/blackmagic/commits/cortexm_romtable This wakes up all debug units and I no longer see those errors.

bobbatcomcastdotnet commented 4 years ago

@UweBonnes Is there a pre-built .bin file somewhere that contains the BMP version you want me to try?

UweBonnes commented 4 years ago

There are prebuilt binaries only for master. As you have an ARM toolchain, what keeps you from compiling yourself? bmp.zip

bobbatcomcastdotnet commented 4 years ago

@UweBonnes I updated my BMP with the bin file you sent. It reports the version as:

Black Magic Probe (Firmware v1.6.1-560-g1649be00) (Hardware Version 3)
Packet received: O436f707972696768742028432920323031352020426c61636b2053706865726520546563686e6f6c6f67696573204c74642e0a
Copyright (C) 2015 Black Sphere Technologies Ltd.
Packet received: O4c6963656e73652047504c76332b3a20474e552047504c2076657273696f6e2033206f72206c61746572203c687474703a2f2f676e752e6f72672f6c6963656e7365732f67706c2e68746d6c3e0a0a
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
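The `O...` packets in this output are the probe's console text, hex-encoded. A minimal sketch of decoding one, assuming the usual GDB remote protocol convention that an `O` packet carries hex-encoded output bytes; `decode_console_packet` is a hypothetical helper name.

```python
def decode_console_packet(packet: str) -> str:
    """Decode an `O<hex>` console-output packet into text.

    Raises ValueError for anything that is not `O` followed by hex digits,
    which also rejects the plain `OK` reply.
    """
    if not packet.startswith("O"):
        raise ValueError("not an O (console output) packet")
    return bytes.fromhex(packet[1:]).decode("ascii")


# First console packet from the version banner above.
PACKET = (
    "O436f707972696768742028432920323031352020426c61636b205370686572"
    "6520546563686e6f6c6f67696573204c74642e0a"
)

print(repr(decode_console_packet(PACKET)))
```

The decoded text is the copyright line that GDB prints right after the packet, which confirms that the two are the same data shown twice (once raw, once decoded).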

I tried my app but got the same results as before. Is there something I need to send to GDB for this version to utilize the cortexm_romtable?

UweBonnes commented 4 years ago

Argh, you are right. I read the RAM from the command line with "./blackmagic -a 0x24000000 -r /tmp/bla -v1". After power-up several runs fail. Then at some point things work as expected until I power-cycle the H7 again. Probably I was in the good state when I tested the branch.

It would be helpful if you answered all questions: what keeps you from compiling yourself?

With the command that cleans the cache when reading removed, even a run after power-up succeeds. bmp.zip

bobbatcomcastdotnet commented 4 years ago

@UweBonnes You asked "It would be helpful if you answered all questions: what keeps you from compiling yourself?" There is really nothing keeping me from compiling new/test code myself if I know where to obtain the sources. The link you gave me was to the cortexm_romtable page. I am not familiar with the branching setup used on GitHub, so the link you gave me was not sufficient for me to determine where the source code I needed to build was. If, in the future, you would like me to build a version of the code, please supply a link to the sources so that I can download them and run make. Don't assume that I know what you know about how all this branching stuff works.

bobbatcomcastdotnet commented 4 years ago

@UweBonnes That version of the bin file made all the difference 👍 I am now able to display the structure data successfully. Please let me know when the fix is committed to the master source so I can download the latest code.

UweBonnes commented 4 years ago

It is a workaround and may lead to a difference between what the CPU sees and what you read out. The probable solution is to clean the cache on halt and invalidate on resume.

bobbatcomcastdotnet commented 4 years ago

@UweBonnes You said "may lead to a difference between what the CPU sees and what you read out". Does that mean the data I get back from GDB for the variable may be incorrect? If so, that may be worse than not being able to display the data at all 👎

UweBonnes commented 4 years ago

So I said it is a workaround. But at least we now know where it fails, but not yet why.

H7 has 3(4?) ways and 127(128?) sets. Cleaning on halt and invalidating on resume will require transferring 3*127 or perhaps even 4*128 writes to the SCB. Probably a big penalty. Some more thoughts are needed.
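The "3(4?)" and "127(128?)" question marks would be explained if those numbers are raw CCSIDR field values, since the Armv7-M cache size ID register stores ways minus one and sets minus one. That is an assumption; the thread does not say where the numbers come from. The sketch below decodes a CCSIDR value under that layout. The example value is built from the numbers in the comment plus an assumed 32-byte line, not read from hardware, and `decode_ccsidr` is a hypothetical helper.

```python
def decode_ccsidr(ccsidr: int) -> dict:
    """Decode the geometry fields of an Armv7-M CCSIDR value."""
    line_field = ccsidr & 0x7             # log2(words per line) - 2
    assoc_field = (ccsidr >> 3) & 0x3FF   # number of ways - 1
    sets_field = (ccsidr >> 13) & 0x7FFF  # number of sets - 1

    line_bytes = 4 << (line_field + 2)    # 4 bytes per word
    ways = assoc_field + 1
    sets = sets_field + 1
    return {
        "line_bytes": line_bytes,
        "ways": ways,
        "sets": sets,
        "cache_bytes": line_bytes * ways * sets,
        # One register write per (set, way) pair for a full set/way sweep.
        "set_way_operations": ways * sets,
    }


# Hypothetical value: sets field 127, ways field 3, 8-word (32-byte) lines.
EXAMPLE_CCSIDR = (127 << 13) | (3 << 3) | 1

geometry = decode_ccsidr(EXAMPLE_CCSIDR)
print(geometry)
```

With these assumed fields the sweep needs 4*128 = 512 register writes, each of which is a separate debug-port transaction, which is the penalty the comment is worried about.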

bobbatcomcastdotnet commented 4 years ago

@UweBonnes, Has there been any progress on this issue? The BMP is currently useless for me with this issue so I would like it to be addressed ASAP...

UweBonnes commented 4 years ago

BMP is a community effort. You may hope that somebody fixes a problem you encounter, but there is no guarantee. Best is, you propose a fix yourself in a patch request or issue.

Well, nobody else chimed in and presented ideas.

UweBonnes commented 4 years ago

I have had another look at the problem. I cannot reproduce it when printing simple memory values, like p /x *(unsigned int *) 0x24000200, but I can reproduce it when reading larger areas, like x /64x 0x24000200. This flushes multiple cache lines. Perhaps the flushing happens too fast. By the way, a second read will read more, so reading multiple times succeeds in the end.

So the question to you: Did the problem also happen when reading variables with sizeof() < 32, or were the variables larger?

If it only happens in the latter case, perhaps some code rearrangement can help.
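To make the size question concrete, here is a small sketch that lists the cache lines a read touches, and therefore how many per-line clean operations a read of that range would trigger. It assumes 32-byte data cache lines (hence the sizeof() < 32 question) and one clean per line; it is not the firmware's code, and `cache_lines` is a made-up helper.

```python
LINE_BYTES = 32  # assumed data cache line size


def cache_lines(addr: int, length: int, line_bytes: int = LINE_BYTES) -> list:
    """Return the start address of every cache line overlapping [addr, addr+length)."""
    if length <= 0:
        return []
    first = addr & ~(line_bytes - 1)  # round down to the line boundary
    return list(range(first, addr + length, line_bytes))


# A single 32-bit value: one line, one clean.
print(len(cache_lines(0x24000200, 4)))
# x /64x reads 64 words = 256 bytes.
print(len(cache_lines(0x24000200, 64 * 4)))
# The failing 0x200-byte read from the earlier trace.
print(len(cache_lines(0x240007A8, 0x200)))
```

Under these assumptions a single word costs one clean, the 256-byte dump costs 8, and the 0x200-byte structure read at 0x240007a8 costs 17 back-to-back cleans because it starts in the middle of a line.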

UweBonnes commented 4 years ago

Can you please try if #731 fixes your problem?

bobbatcomcastdotnet commented 4 years ago

@UweBonnes I do not have time to test this issue right now but I may get to it tomorrow. Please provide a link to the source code branch this fix has been applied to so that I can build it.

sidprice commented 4 years ago

@bobbatcomcastdotnet I think this is the URL you need.

https://github.com/UweBonnes/blackmagic/tree/swd_fault

bobbatcomcastdotnet commented 4 years ago

@sidprice Thanks Sid I'll give it a try sometime tomorrow.

UweBonnes commented 4 years ago

@bobbatcomcastdotnet

Can you answer this: "So the question to you: Did the problem also happen when reading variables with sizeof() < 32, or were the variables larger?"

bobbatcomcastdotnet commented 4 years ago

@UweBonnes The answer is that the variable was larger than 32. It is a USB HID host structure, which is fairly large. I built the BMP with the code Sid provided a link to, and it appears to fix the problem :) At least I don't get the error with this code. Only time will tell if the fix gives an accurate display of the data values, but at least things look promising.

bobbatcomcastdotnet commented 4 years ago

@UweBonnes Will this fix be available in the 1.6.2 release?

UweBonnes commented 4 years ago

Yes, applied to git head.

esden commented 2 years ago

Considering that there is a patch merged now, and some time passed without an objection that the issue is not fixed yet I am closing this issue. If the problem persists please feel free to reopen this issue.