Closed mbilal100 closed 1 year ago
There are many odd errors in your OpenOCD log.
Search "error" (35 hits in 1 file of 1 searched)
C:\Users\Tommy Murphy\Downloads\polarfiresoc\bug-report\bug-report\openocd (35 hits)
Line 58: Error: The 'halt' command must be used after 'init'.
Line 59: Error executing event gdb-attach on target mpfs.hart1_u54_1:
Line 13417: Error: 13441 273030 riscv.c:841 riscv_remove_breakpoint(): Failed to restore instruction for 2-byte breakpoint at 0xffffffff8000c00e
Line 22794: Debug: 22819 276410 riscv-013.c:3169 read_memory_progbuf(): error reading single word of 1 bytes from 0xffffffff8000c00e
Line 22815: Debug: 22840 276414 riscv-013.c:3169 read_memory_progbuf(): error reading single word of 1 bytes from 0xffffffff8000c00f
Line 70805: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70806: Error: dmi_scan failed jtag scan
Line 70807: Error: failed read at 0x10, status=2
Line 70809: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70810: Error: failed jtag scan: -600
Line 70811: Error: Unsupported DTM version: 8
Line 70813: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70814: Error: dmi_scan failed jtag scan
Line 70815: Error: failed read at 0x10, status=2
Line 70817: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70818: Error: failed jtag scan: -600
Line 70819: Error: Unsupported DTM version: 8
Line 70821: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70822: Error: dmi_scan failed jtag scan
Line 70823: Error: failed read at 0x10, status=2
Line 70825: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70826: Error: failed jtag scan: -600
Line 70827: Error: Unsupported DTM version: 8
Line 70829: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70830: Error: dmi_scan failed jtag scan
Line 70831: Error: failed read at 0x10, status=2
Line 70833: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70834: Error: failed jtag scan: -600
Line 70835: Error: Unsupported DTM version: 8
Line 70837: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70838: Error: dmi_scan failed jtag scan
Line 70839: Error: failed read at 0x10, status=2
Line 70841: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70842: Error: failed jtag scan: -600
Line 70843: Error: Unsupported DTM version: 8
Some or all of these could potentially be caused by (mis)configuration of the PolarFire SoC MSS and/or problems with your Embedded FlashPro6 setup.
You probably need to contact Microchip for support on this.
Especially since the version of OpenOCD used is quite old compared to the latest riscv-openocd version and is also a custom version.
C:\Microchip\SoftConsole-v2022.2-RISC-V-747\openocd\bin>openocd --version
xPack OpenOCD (Microchip SoftConsole build), x86_64 Open On-Chip Debugger 0.10.0+dev-00859-g95a8cd9b5-dirty (2022-03-15-14:08)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Hi @TommyMurphyTM1234, sorry, but you can ignore all of the following errors, as the board had been powered off (GDB had already disconnected):
Line 70805: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70806: Error: dmi_scan failed jtag scan
Line 70807: Error: failed read at 0x10, status=2
Line 70809: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70810: Error: failed jtag scan: -600
Line 70811: Error: Unsupported DTM version: 8
Line 70813: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70814: Error: dmi_scan failed jtag scan
Line 70815: Error: failed read at 0x10, status=2
Line 70817: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70818: Error: failed jtag scan: -600
Line 70819: Error: Unsupported DTM version: 8
Line 70821: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70822: Error: dmi_scan failed jtag scan
Line 70823: Error: failed read at 0x10, status=2
Line 70825: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70826: Error: failed jtag scan: -600
Line 70827: Error: Unsupported DTM version: 8
Line 70829: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70830: Error: dmi_scan failed jtag scan
Line 70831: Error: failed read at 0x10, status=2
Line 70833: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70834: Error: failed jtag scan: -600
Line 70835: Error: Unsupported DTM version: 8
Line 70837: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70838: Error: dmi_scan failed jtag scan
Line 70839: Error: failed read at 0x10, status=2
Line 70841: Error: Embedded FlashPro6 (revision B) failed to send the data. Programmer device reset is required. Err = -1
Line 70842: Error: failed jtag scan: -600
Line 70843: Error: Unsupported DTM version: 8
Yes, but I still think that you probably need to contact Microchip for support on this. They use a customised version of OpenOCD.
Also - this early error may be relevant:
Error: The 'halt' command must be used after 'init'.
Error executing event gdb-attach on target mpfs.hart1_u54_1:
Looks, to me, like something fails with the GDB attachment to the u54_1 hart, so all debugging bets are probably off from there on in?
Hi @TommyMurphyTM1234
Looks, to me, like something fails with the GDB attachment to the u54_1 hart, so all debugging bets are probably off from there on in?
Okay, I've managed to remove this early error (previously I was launching openocd with the noinit command-line argument and then running init through telnet, but now I let openocd do the init itself), but I'm still getting the breakpoint removal error.
/var/mbilal/SoftConsole-v2022.2-RISC-V-747/openocd/bin/openocd -c 'gdb_port 3333' -c 'telnet_port 4444' -c 'tcl_port disabled' -c 'set DEVICE MPFS' -f board/microsemi-riscv.cfg
xPack OpenOCD (Microchip SoftConsole build), x86_64 Open On-Chip Debugger 0.10.0+dev-00859-g95a8cd9b5-dirty (2022-03-15-14:04)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
MPFS
Info : only one transport option; autoselect 'jtag'
Info : Hardware thread awareness created
do_board_reset_init
Info : tcl server disabled
Info : Listening on port 4444 for telnet connections
Info : Embedded FlashPro6 (revision B) found (USB_ID=1514:200b path=/dev/hidraw3)
Info : Embedded FlashPro6 (revision B) CM3 firmware version: F4.0
Info : clock speed 6000 kHz
Info : JTAG tap: mpfs.cpu tap/device found: 0x0f81a1cf (mfg: 0x0e7 (GateField), part: 0xf81a, ver: 0x0)
Info : datacount=2 progbufsize=16
Info : Disabling abstract command reads from CSRs.
Info : Core 1 could not be made part of halt group 1.
Info : Examined RISC-V core; found 5 harts
Info : hart 0: currently disabled
Info : hart 1: XLEN=64, misa=0x800000000014112d
Info : hart 2: currently disabled
Info : hart 3: currently disabled
Info : hart 4: currently disabled
Info : datacount=2 progbufsize=16
Info : Disabling abstract command reads from CSRs.
Info : Core 2 could not be made part of halt group 1.
Info : Examined RISC-V core; found 5 harts
Info : hart 0: currently disabled
Info : hart 1: currently disabled
Info : hart 2: XLEN=64, misa=0x800000000014112d
Info : hart 3: currently disabled
Info : hart 4: currently disabled
Info : datacount=2 progbufsize=16
Info : Disabling abstract command reads from CSRs.
Info : Core 3 could not be made part of halt group 1.
Info : Examined RISC-V core; found 5 harts
Info : hart 0: currently disabled
Info : hart 1: currently disabled
Info : hart 2: currently disabled
Info : hart 3: XLEN=64, misa=0x800000000014112d
Info : hart 4: currently disabled
Info : datacount=2 progbufsize=16
Info : Disabling abstract command reads from CSRs.
Info : Core 4 could not be made part of halt group 1.
Info : Examined RISC-V core; found 5 harts
Info : hart 0: currently disabled
Info : hart 1: currently disabled
Info : hart 2: currently disabled
Info : hart 3: currently disabled
Info : hart 4: XLEN=64, misa=0x800000000014112d
Info : datacount=2 progbufsize=16
Info : Disabling abstract command reads from CSRs.
Info : Examined RISC-V core; found 5 harts
Info : hart 0: XLEN=64, misa=0x8000000000101105
Info : hart 1: currently disabled
Info : hart 2: currently disabled
Info : hart 3: currently disabled
Info : hart 4: currently disabled
Info : Listening on port 3333 for gdb connections
Info : Listening on port 3334 for gdb connections
Info : accepting 'telnet' connection on tcp/4444
Info : accepting 'gdb' connection on tcp/3333
Info : New GDB Connection: 1, Target mpfs.hart1_u54_1, state: halted
Info : Disabling abstract command writes to CSRs.
Info : Disabling abstract command writes to CSRs.
Info : Disabling abstract command writes to CSRs.
Info : Disabling abstract command writes to CSRs.
Error: Failed to restore instruction for 2-byte breakpoint at 0xffffffff8000c00e
Please note that, as described in the issue description, I have to remain idle for 3-4 minutes to reproduce this error.
As I said, I think you need to take this up with Microchip support.
As I said, I think you need to take this up with Microchip support.
* https://github.com/orgs/polarfire-soc/discussions
Right, I have opened a discussion https://github.com/orgs/polarfire-soc/discussions/284
Thank you for your help!
Hmm, I inspected the dcsr register of each hart and found that one of the harts is running in M-mode (IIUC, riscv doesn't allow MMU in M-mode). That's why memory is not accessible on that hart, right?
Here are the values of mstatus, sstatus, priv, and dcsr for each hart when the issue occurs.
> targets mpfs.hart1_u54_1
> reg mstatus
mstatus (/64): 0x0000000A00000920
> reg priv
priv (/8): 0x03
> reg sstatus
sstatus (/64): 0x0000000200000120
> reg dcsr
dcsr (/64): 0x000000004000B0C3
> targets mpfs.hart2_u54_2
> reg mstatus
mstatus (/64): 0x0000000A000000A2
> reg priv
priv (/8): 0x01
> reg sstatus
sstatus (/64): 0x0000000200000022
> reg dcsr
dcsr (/64): 0x000000004000B041
>
> targets mpfs.hart3_u54_3
> reg mstatus
mstatus (/64): 0x0000000A00000820
> reg priv
priv (/8): 0x03
> reg sstatus
sstatus (/64): 0x0000000200000020
> reg dcsr
dcsr (/64): 0x000000004000B0C3
> targets mpfs.hart4_u54_4
> reg mstatus
mstatus (/64): 0x0000000A000000A2
> reg priv
priv (/8): 0x01
> reg sstatus
sstatus (/64): 0x0000000200000022
> reg dcsr
dcsr (/64): 0x000000004000B041
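For reference, the privilege fields in the register dumps above can be pulled apart with a few lines of Python. This is only a sketch, assuming the field positions from the RISC-V debug spec 0.13 (dcsr.prv in bits 1:0, dcsr.cause in bits 8:6) and the privileged spec (mstatus.MPP in bits 12:11, mstatus.MPRV in bit 17). The helper names are made up for illustration.

```python
# Privilege encodings shared by dcsr.prv and mstatus.MPP.
PRIV_NAMES = {0: "U", 1: "S", 3: "M"}

def decode_dcsr(dcsr):
    """Return the (prv, cause) fields of a dcsr value."""
    prv = dcsr & 0x3           # bits 1:0: privilege mode the hart was in when it halted
    cause = (dcsr >> 6) & 0x7  # bits 8:6: reason the hart entered Debug Mode
    return prv, cause

def decode_mstatus(mstatus):
    """Return the (MPP, MPRV) fields of an mstatus value."""
    mpp = (mstatus >> 11) & 0x3   # bits 12:11: previous privilege mode
    mprv = (mstatus >> 17) & 0x1  # bit 17: modify-privilege flag
    return mpp, mprv

# (dcsr, mstatus) pairs copied from the telnet session above.
dumps = {
    "mpfs.hart1_u54_1": (0x000000004000B0C3, 0x0000000A00000920),
    "mpfs.hart2_u54_2": (0x000000004000B041, 0x0000000A000000A2),
    "mpfs.hart3_u54_3": (0x000000004000B0C3, 0x0000000A00000820),
    "mpfs.hart4_u54_4": (0x000000004000B041, 0x0000000A000000A2),
}

for name, (dcsr, mstatus) in dumps.items():
    prv, cause = decode_dcsr(dcsr)
    mpp, mprv = decode_mstatus(mstatus)
    print(f"{name}: prv={PRIV_NAMES[prv]} cause={cause} "
          f"MPP={PRIV_NAMES[mpp]} MPRV={mprv}")
```

For these dumps, dcsr.prv decodes to M for hart1 and hart3 and to S for hart2 and hart4, which matches the priv register values.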
But why is openocd not working well in M-mode? What other flags do I need to set to overcome this issue in openocd?
Could something like PMP configuration be locking the affected hart out of access to the relevant memory?
Also, as I alluded to before, the Microchip PolarFire SoC MSS has all sorts of configuration options that could potentially affect memory access from a hart.
What MSS/FPGA configuration and Linux source base are you using?
IIUC, riscv doesn't allow MMU in M-mode
It can in some cases as far as I recall... E.g. see here:
Could something like PMP configuration be locking the affected hart out of access to the relevant memory?
IIUC, there isn't one: I didn't find any PMP-enable config flag in the MSS or HSS of the PolarFire board.
Also, as I alluded to before, the Microchip PolarFire SoC MSS has all sorts of configuration options that could potentially affect memory access from a hart.
Sorry, but I've already opened a discussion at Microchip forum https://www.microchip.com/forums/m1224824.aspx#1224824 https://github.com/orgs/polarfire-soc/discussions/284
What MSS/FPGA configuration and Linux source base are you using?
I'm using the pre-built base design from https://github.com/polarfire-soc/icicle-kit-reference-design/releases/tag/v2022.09 and the default branch of Linux from here: https://github.com/linux4microchip/linux
Can anyone please confirm whether memory read/write works in M-mode in the latest openocd?
For me it is clearly not working in the Microchip-packaged openocd, even after setting riscv set_enable_virt2phys off.
Can anyone please confirm whether memory read/write works in M-mode in the latest openocd?
Yes, it is. If it wasn't then very little would work in terms of RISC-V debugging.
For me it is clearly not working in the Microchip-packaged openocd, even after setting riscv set_enable_virt2phys off.
On one hart. Which suggests that it may be something specific to the PolarFire SoC MSS/FPGA configuration rather than a general RISC-V OpenOCD issue. Even with the older and custom version of OpenOCD that Microchip ship with SoftConsole.
Just a thought... does the same issue manifest itself if you try to debug, say, the PolarFire SoC HSS (Hart Software Services: https://github.com/polarfire-soc/hart-software-services) in the absence of the actual kernel/Linux distro?
I think I understand the issue. GDB always sends the virtual address for memory read/write, but openocd doesn't translate this virtual address in M-mode (MMU disabled); it treats the address as physical and fails to read/write memory.
On the other hand, if I give the translated physical address, memory read/write succeeds in M-mode, e.g. 0xffffffff8000bf3e -> 0x100020bf3e
> targets mpfs.hart1_u54_1
> reg priv
priv (/8): 0x03
> mpfs.hart1_u54_1 mdb 0xffffffff8000bf3e 2
0xffffffff8000bf3e: 00 00
> mpfs.hart1_u54_1 mdb 0x100020bf3e 2
0x100020bf3e: 02 90
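The arithmetic behind that pair of addresses can be sketched in Python. It assumes the kernel image is mapped with one constant offset, with a virtual base of 0xffffffff80000000 and a physical base of 0x1000200000. Both bases are inferred from the single address pair above and are not confirmed anywhere in this thread, so they may well differ for other kernels or memory layouts.

```python
# Assumed bases, inferred from 0xffffffff8000bf3e -> 0x100020bf3e only.
KERNEL_VIRT_BASE = 0xFFFFFFFF80000000
KERNEL_PHYS_BASE = 0x1000200000

def kernel_virt_to_phys(va):
    """Map a kernel-image virtual address to physical, assuming a constant offset."""
    if va < KERNEL_VIRT_BASE:
        raise ValueError(f"{va:#x} is below the assumed kernel virtual base")
    return va - KERNEL_VIRT_BASE + KERNEL_PHYS_BASE

# The address read above, and the breakpoint address from the error message.
for va in (0xFFFFFFFF8000BF3E, 0xFFFFFFFF8000C00E):
    print(f"{va:#x} -> {kernel_virt_to_phys(va):#x}")
```

Under those assumptions, the failing breakpoint address 0xffffffff8000c00e would correspond to physical address 0x100020c00e, which could be checked with mdb in the same way.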
I don't think the latest upstream handles this use case either. Maybe it is always the client's responsibility to provide a physical address in M-mode?
I thought that this scenario was what riscv set_enable_virtual off was designed for?
For clarification of the differences between set_enable_virtual and set_enable_virt2phys maybe this helps?
If neither of those helps then could you maybe compile the kernel for physical addresses and use the resulting ELF file/debug info for debugging?
I thought that this scenario was what riscv set_enable_virt2phys on|off was designed for?
I don't think this option would help here. It only controls whether address translation is attempted, but address translation is not available in M-mode in any case, so openocd just tries to read the virtual address given by gdb as a normal (physical) address and fails.
I already tried both options.
If neither of those helps then could you maybe compile the kernel for physical addresses and use the resulting ELF file/debug info for debugging?
I think it's not easy to build the Linux kernel without an MMU.
Here is a summary:
* An ELF with physical addresses will always work, in any mode.
* An ELF with virtual addresses will work in S-mode.
* An ELF with virtual addresses will fail in M-mode (because openocd will always treat the virtual address as a normal address).
I thought that this scenario was what riscv set_enable_virt2phys on|off was designed for?
I don't think this option would help here. It only controls whether address translation is attempted, but address translation is not available in M-mode in any case, so openocd just tries to read the virtual address given by gdb as a normal (physical) address and fails.
I already tried both options.
Yes, but the Microchip OpenOCD is based on quite an old version of this repo's sources, so maybe something changed in the implementation of these commands in the meantime to make them work in this scenario?
Glancing at the code in riscv-013.c, it seems to me that OpenOCD changes the effective privilege mode temporarily if/when it needs to do virtual to physical address translation.
But, again, the Microchip OpenOCD may predate this functionality.
Could this be your issue and the Microchip OpenOCD is lacking the required fix?
Could this be your issue and the Microchip OpenOCD is lacking the required fix?
No. From the logs, openocd attempts the MMU translation but fails with the following message. The same message is also present in the latest openocd.
Debug: 13350 273008 riscv.c:1400 riscv_mmu(): SATP/MMU ignored in Machine mode (mstatus=0xa00000920).
So, as I understand it, here is the code flow:
riscv_read_memory()
-> riscv_virt2phys()
-> riscv_mmu() --- fails (translation not available in M-mode)
-> read_memory() (so read_memory() is called with the virtual address)
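That flow can be modelled in a few lines of Python. This is a simplified sketch of the behaviour described above, not OpenOCD's actual code: the function names are hypothetical, MPRV is ignored, and the translation step is stubbed with the constant offset implied by the 0xffffffff8000bf3e -> 0x100020bf3e example earlier in the thread.

```python
PRV_S, PRV_M = 1, 3  # RISC-V privilege mode encodings

def stub_translate(va):
    """Stand-in for a page-table walk (constant offset is an assumption)."""
    return va - 0xFFFFFFFF80000000 + 0x1000200000

def address_used_for_access(va, prv, translate=stub_translate):
    """Return the address that ends up being used for the memory access."""
    if prv == PRV_M:
        # Translation does not apply in M-mode, so the address from gdb
        # is passed through unchanged and treated as physical.
        return va
    return translate(va)

gdb_address = 0xFFFFFFFF8000BF3E
print(f"S-mode: {address_used_for_access(gdb_address, PRV_S):#x}")
print(f"M-mode: {address_used_for_access(gdb_address, PRV_M):#x}")
```

In this model the same gdb address resolves to 0x100020bf3e when the hart halted in S-mode and stays at 0xffffffff8000bf3e when it halted in M-mode, which is the failing case.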
In that case, maybe the appropriate way to debug the kernel is on the target itself rather than "remotely" via OpenOCD/JTAG?
Or using one of the other kernel debugging approaches/tools?
In that case, maybe the appropriate way to debug the kernel is on the target itself rather than "remotely" via OpenOCD/JTAG?
Yeah, we can use that. Or maybe a wrapper between gdb and openocd could automatically translate virtual addresses to physical ones in M-mode.
I'm not sure whether it is also feasible for openocd to always honor translation of virtual addresses, even in M-mode?
Maybe there's also something useful here?
This is really a gdb issue. gdb has no clue about address translation and even less that it could change depending on the current CPU mode.
I don't think we have any options to tell OpenOCD to perform address translation even when it's not configured in the current mode. Since half of what OpenOCD does is to work around limitations in gdb, it wouldn't be crazy to extend OpenOCD to optionally always force address translation. To do that right, it's probably best to combine riscv set_enable_virtual and riscv set_enable_virt2phys into a single command (riscv address_translation mprven|auto|always).
Thanks a lot @timsifive.
I don't think we have any options to tell OpenOCD to perform address translation even when it's not configured in the current mode.
Can you explain what you mean by this, please?
Could you also explain in what circumstances OpenOCD can do virtual to physical address translation?
To do that right, it's probably best to combine riscv set_enable_virtual and riscv set_enable_virt2phys into a single command (riscv address_translation mprven|auto|always).
I guess that this is echoing what you previously posted here?
OpenOCD contains logic to perform virtual to physical address translation. However, that logic incorporates all the rules of RISC-V address translation. So in M-Mode it doesn't happen, because that's how address translation works.
Otherwise, by default, OpenOCD will assume that gdb is giving it virtual addresses and perform translation so that we can use physical addresses to access memory. (Usually address translation does not happen in Debug Mode.)
Hi @timsifive
OpenOCD contains logic to perform virtual to physical address translation. However, that logic incorporates all the rules of RISC-V address translation. So in M-Mode it doesn't happen, because that's how address translation works.
I'm sorry, but I'm still really confused. How does it ever work given that debug mode is effectively M mode, or, to be more precise, an even higher privileged mode than M mode? Do you mean that the code is there but will never actually work?
Otherwise, by default, OpenOCD will assume that gdb is giving it virtual addresses and perform translation so that we can use physical addresses to access memory. (Usually address translation does not happen in Debug Mode.)
Ditto.
Is it the case that OpenOCD would only be able to do virtual to physical address translation if it dynamically and temporarily switched to the relevant lower privileged mode? And that it would need more information about which mode this should be and about the virtual memory configuration perhaps? Information that is not dynamically available?
I'm sorry, but I'm still really confused. How does it ever work given that debug mode is effectively M mode, or, to be more precise, an even higher privileged mode than M mode? Do you mean that the code is there but will never actually work?
When a hart enters Debug Mode, the current privilege mode is saved into dcsr.prv. When OpenOCD gets an address, it looks first at dcsr.prv to see if the hart was in a mode where address translation is enabled. If so, then it looks at the page table address, and performs the address translation just as the spec says the hardware would.
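As a rough illustration of those two steps, here is a Python sketch that gates on dcsr.prv and then walks the page tables. It assumes Sv39 (the thread does not say which translation mode the kernel uses), skips permission, A/D and MPRV handling, and is not OpenOCD's code. The function names and the tiny page table in the demo are made up; the table is built only so that the walk reproduces the 0xffffffff8000bf3e -> 0x100020bf3e pair seen earlier.

```python
PAGE_SHIFT = 12   # 4 KiB base pages
PTE_SIZE = 8      # bytes per Sv39 page-table entry
LEVELS = 3        # Sv39 walks at most three levels
PTE_V, PTE_R, PTE_W, PTE_X = 0x1, 0x2, 0x4, 0x8
PRV_M = 3
SATP_MODE_BARE, SATP_MODE_SV39 = 0, 8

def sv39_walk(read_pte, root_ppn, va):
    """Walk Sv39 tables; read_pte(pa) returns the 64-bit PTE stored at pa."""
    vpn = [(va >> (PAGE_SHIFT + 9 * i)) & 0x1FF for i in range(LEVELS)]
    table = root_ppn << PAGE_SHIFT
    for level in reversed(range(LEVELS)):
        pte = read_pte(table + vpn[level] * PTE_SIZE)
        if not (pte & PTE_V) or ((pte & PTE_W) and not (pte & PTE_R)):
            raise LookupError(f"invalid PTE at level {level}")
        ppn = (pte >> 10) & ((1 << 44) - 1)
        if pte & (PTE_R | PTE_X):
            # Leaf entry: a (super)page covering the low address bits.
            mask = (1 << (PAGE_SHIFT + 9 * level)) - 1
            base = ppn << PAGE_SHIFT
            if base & mask:
                raise LookupError("misaligned superpage")
            return base | (va & mask)
        table = ppn << PAGE_SHIFT  # pointer to the next-level table
    raise LookupError("no leaf PTE found")

def translate_for_debugger(va, dcsr, satp, read_pte):
    """Translate only if the hart halted in a mode where translation applies."""
    prv = dcsr & 0x3
    mode = satp >> 60
    if prv == PRV_M or mode == SATP_MODE_BARE:
        return va  # no translation: the address is used as-is
    if mode != SATP_MODE_SV39:
        raise NotImplementedError("this sketch only handles Sv39")
    return sv39_walk(read_pte, satp & ((1 << 44) - 1), va)

# Hypothetical two-level table: root entry 510 -> table with a 2 MiB leaf.
ROOT_PPN, L1_PPN = 0x80000, 0x80001
fake_memory = {
    (ROOT_PPN << PAGE_SHIFT) + 510 * PTE_SIZE: (L1_PPN << 10) | PTE_V,
    (L1_PPN << PAGE_SHIFT): (0x1000200 << 10) | PTE_V | PTE_R | PTE_X,
}
satp = (SATP_MODE_SV39 << 60) | ROOT_PPN
read = lambda pa: fake_memory.get(pa, 0)

for dcsr in (0x4000B041, 0x4000B0C3):  # dcsr values from the dumps above
    pa = translate_for_debugger(0xFFFFFFFF8000BF3E, dcsr, satp, read)
    print(f"dcsr={dcsr:#x}: {pa:#x}")
```

With dcsr.prv = S the walk yields 0x100020bf3e, while with dcsr.prv = M the virtual address is returned untranslated.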
Ah, I see now. Thanks a lot @timsifive. I'll have a look at the code again for my own further edification. :-)
Candidate for closure?
If there is any issue extant here then it seems to me that it's a matter for Microchip and not any proven issue with riscv-openocd?
While debugging the Linux kernel via JTAG on the PolarFire Icicle board, I get the following error on the GDB breakpoint removal packet:
Error: Failed to restore instruction for 2-byte breakpoint at 0xffffffff8000c00e
Most of the time this error occurs when, i.e.
I examined all the harts to check memory access and found that the first hart is not able to read memory at the breakpoint address.
telnet logs:
So hart 1 is not able to access the memory at all.
I'm running the following LTS Linux:
Linux icicle-kit-es-flex 5.15.79-linux4microchip+fpga-2022.09 #1 SMP Thu Nov 24 14:15:55 UTC 2022 riscv64 GNU/Linux
Linux cmdline
I enabled openocd debug_level 3 and got the following error on accessing the memory.
I'm also attaching the complete openocd, telnet, gdb and openocd config logs.
Can anyone please suggest why I'm getting this error ?
bug-report.zip