Open patrislav1 opened 4 years ago
Here's a Wireshark capture of the pyocd
<---> gdb
traffic. Can't see anything that isn't already shown in the pyocd
log.
If the 120 value gdb complains about, is not hex, but decimal, then the x
in $Text=0;Data=0;Bss=0#04
could be the offending character. However in gdb remote protocol this looks like a valid response to qOffsets
.
Fun Fact: OpenOCD uses the very same reply string to answer gdb's qOffsets
command, but its session doesn't get shut down. I see zero difference. Maybe it uses a different GDB session setup?
The watchdog timer of the SAM4L MCU was still active (you can configure it to stop while debugging, but I forgot that), so when pyocd connected and halted the chip, it ran into timeout. Before I could connect gdb
to pyocd
, it already went through watchdog reset, losing the connection to the debugger.
When I disable the watchdog, pyocd gdbserver
works as expected.
Losing connection on watchdog reset could still be an interesting edge case for pyocd
to catch and display a more helpful error message.
Hi @patrislav1 , thanks for all the logs! The "Reply contains invalid hex digit 120" reported by gdb is definitely a bug somewhere. Sometimes gdb reports errors for packets prior to the most recently received one.
Why would the debug connection be lost on a watchdog reset? That should be a warm reset, which shouldn't reset debug logic according the the v7-M specification. But it's not uncommon for silicon vendors to mess up reset implementation on MCU. I don't know enough about the SAM4L to know whether this is the case here.
Reopening this issue due to the invalid hex digit error.
Why would the debug connection be lost on a watchdog reset? That should be a warm reset, which shouldn't reset debug logic according the the v7-M specification. But it's not uncommon for silicon vendors to mess up reset implementation on MCU.
I can only speculate that the watchdog reset works differently than other resets on that chip. For example, if I reset the chip in software, using CMSIS NVIC_SystemReset()
, the debugger keeps working. Maybe it's not a breaking connection - that was just my ad-hoc interpretation. Maybe something different is happening that makes pyocd choke, leading to the invalid hex thing. I'm ready to collect more data or do other tests if that helps (I guess you can't reproduce this without that particular chip).
When I get a chance, I'll check the SAM4L reference manual to see what it says about the reset hierarchy. I'll also dig into the logs you supplied, and try to think of additional data to gather. Thanks providing such detailed info!
I just encountered this same issue on an nRF5340, and figured out what the root cause is. The short answer is that the core is not halted when gdb connected.
When gdb first connects to a target, it requests the target status (?
command with T
stop reply packet response). For Cortex-M, this means that the core must be halted since you can only read core registers in that state. When pyocd can't read a core register when gdb asks for it, it returns a "not available" value of all "x"s instead of hex digits. Normally this is a good thing, since gdb reports to the user that the register's value is not currently available. However, the code in gdb that parses the very first stop reply packet (of course it's not the same as for other stop reply packets…) doesn't handle the "not available" value. So when it tries to parse the register's value in hex, it bails when it sees an x
character.
In the trace you posted above, it's packet 51 that is the culprit:
$T1107:xxxxxxxx;0d:xxxxxxxx;0e:xxxxxxxx;0f:xxxxxxxx;thread:1;#ce
The reason the error isn't reported until the qOffsets
command is (I believe) that gdb caches the command replies on connect and then parses multiple replies at once.
So, there are two things to do here:
(In my case with the nRF5340, the second M33 core fails to halt for some reason that I haven't had time to investigate yet.)
- Figure out why your SAM4L is not halted when gdb connects. Note that pyocd will always attempt to halt the target when the gdbserver starts up.
Remember this issue is caused by a watchdog reset (at least in my case).
So the sequence probably is:
1) I start pyocd gdbserver
, it halts the target
2) watchdog bites and resets the target
3) I start gdb
, it connects and requests status, after target got kicked out of halt state by watchdog, et voilà ;)
FWIW, I can reproduce with OpenOCD and pyOCD, so GDB very probably is the culprit.
In my case, it runs for a while and randomly stops at some point, with the following error and log:
(This is happening inside a custom python function).
As you can see, the traceback and the error have nothing to do with the error, so this very probably happens async.
Traceback (most recent call last):
File "../cortex-profiler/remote-profiler.py", line 335, in invoke
function_writer.process(backtrace)
^^^^^^^^^^^^^^^^^^^^^^^^
File "../cortex-profiler/remote-profiler.py", line 184, in sample
if key not in self.countmap:
^^^^^^^^^^
File "../cortex-profiler/remote-profiler.py", line 133, in get_frame
print("\nStart traceback")
^
gdb.error: Invalid hex digit 116
I can't get
pyocd gdbserver
to work. When I try to connect witharm-none-eabi-gdb
, gdb saysReply contains invalid hex digit 120
and drops the connection. Since debugging the same application with openocd works, I assume the problem is somewhere in pyocd. The strange thing is that gdb complains about the invalid hex digit when pyocd sendsb'$Text=0;Data=0;Bss=0#04'
which is a constant so it's not clear to see where this invalid value comes from.pyocd version is the current dev version (
5a52ccc
). Full pyocd log is attached. pyocd.log