pyocd / pyOCD

Open source Python library for programming and debugging Arm Cortex-M microcontrollers
https://pyocd.io
Apache License 2.0
1.14k stars 485 forks source link

pyocd does not issue ID_DAP_Disconnect on end of debug session in Eclipse #1531

Open rgrr opened 1 year ago

rgrr commented 1 year ago

if pyocd is invoked via command line (e.g. "pyocd reset" or "pyocd gdb"), then ID_DAP_Disconnect is sent to the probe on end of pyocd.

If pyocd is started by "GDB PyOCD Debugging" in Eclipse, there is no ID_DAP_Disconnect sent to the probe.

Don't know, if this is actually an issue of the Eclipse plugin or of pyocd, but to my opinion pyocd should disconnect from the probe no matter how it is ended.

Problem for e.g. my probe is, that if there is no disconnect, the SWD channel is not freed.

flit commented 1 year ago

Agreed, it should disconnect at the appropriate times. The problem sounds is a side effect of the fact that the pyocd process maintains a connection as long as it is running. So if pyocd's gdbserver subcommand is started with --persist, or gdb disconnects in a way that leaves the gdbserver running, the pyocd process will not exit and therefore won't send the disconnect command.

Changing this seems like it would be quite complicated and possibly require some incompatible/breaking behaviour changes.

flit commented 1 year ago

Does the disconnect command actually get sent when the pyocd process terminates? I was assuming it does, but if not then that's a totally different issue (and presumably easier to fix).

rgrr commented 1 year ago

Sorry for the late response, I was off for a week.

To answer the question: if the plugin is started, "ps x" shows

/home/hardy/.local/bin/python3 /home/hardy/.local/bin/pyocd gdbserver --no-wait -j /home/hardy/src-rg/playground --port 3333 --telnet-port 4444 --uid E6616407E3646B29 --target nrf52840 --frequency 6000000 --connect=halt -Oreset_type=default -Ch --erase=chip -Osmart_flash=1 --semihosting --elf /home/hardy/src-rg/playground/llvm-games/RTT/_build/RTT.elf -O semihost_use_syscalls=false -L*.semihost=debug -Lpyocd.debug.semihost.trace=debug -v -c echo "Started by Eclipse Embedded CDT"

After termination of the debug session (Ctrl-F2 in Eclipse), the process no longer appears in the process list.

flit commented 1 year ago

Thanks. In this case it appears to be a bug. If you get a chance, would you mind attaching/pasting the end of a debug log from pyocd with -vv -Lpyocd.gdbserver.packet_io.trace.packet=debug. Starting from the point after it's successfully connected and halted until the end would be fine.

rgrr commented 1 year ago

no problem:

0004501 D --<<<< GDB send 12 bytes: b'$00000000#80' [packet_io]
0013497 D -->>>> GDB read 14 bytes: b'$vKill;a410#33' [packet_io]
0013497 D --<<<< GDB send 4 bytes: b'$#00' [packet_io]
0013497 D -->>>> GDB read 5 bytes: b'$k#6b' [packet_io]
0013497 D GDB kill [gdbserver]
0013497 D gdbserver exiting connection loop [gdbserver]
0013497 D GDB packet thread: other side closed connection [packet_io]
0013498 D GDB packet thread stopping [packet_io]
0013498 I Client disconnected from port 3333! [gdbserver]
0013498 D gdbserver thread exiting [gdbserver]
0013498 D GDB server cleaning up [gdbserver]
rgrr commented 1 year ago

If starting "pyocd gdbserver -t nrf52840 -vv -Lpyocd.gdbserver.packet_io.trace.packet=debug" on the console and changing the Eclispe debug configuration so that the "external" gdbserver is used, disconnect is transmitted:

0018042 D --<<<< GDB send 52 bytes: b'$353339313031323128bd4afc5dbab649de5f2714685afdbf#e5' [packet_io]
0023701 D -->>>> GDB read 14 bytes: b'$vKill;a410#33' [packet_io]
0023701 D --<<<< GDB send 4 bytes: b'$#00' [packet_io]
0023701 D -->>>> GDB read 5 bytes: b'$k#6b' [packet_io]
0023701 D GDB kill [gdbserver]
0023701 D gdbserver exiting connection loop [gdbserver]
0023701 D GDB packet thread: other side closed connection [packet_io]
0023701 D GDB packet thread stopping [packet_io]
0023702 I Client disconnected from port 3333! [gdbserver]
0023702 D gdbserver thread exiting [gdbserver]
0023702 D GDB server cleaning up [gdbserver]
0024068 I Semihost server stopped [server]
0024124 D uninit session <pyocd.core.session.Session object at 0x7f005e1bb7d0> [session]
0024125 D uninit board <pyocd.board.board.Board object at 0x7f005d392990> [board]
0024126 D resuming core 0 [cortex_m]
0024126 D added=[] removed=[] [manager]
0024126 D bps after flush={} [manager]
hardy@ntbox:
flit commented 1 year ago

Thanks. Definitely not the expected or intended behaviour. I've updated Eclipse Embedded and have it running again. Will reproduce and look into this.

rgrr commented 1 year ago

Can I be of any help with this issue? If you give me some pointer, I could look into it.

flit commented 1 year ago

Thanks for the offer! Let me get back to you shortly. I'm going to release a new version 0.35 today.

flit commented 1 year ago

From the logs, in both cases a vKill command is received and the gdbserver terminates as expected. They both make it to the GDBServer._cleanup() method that logs "GDB server cleaning up". Something happens after that point which causes the usual disconnect to not happen.

One guess is that the gdbserver subcommand has an exception handler for Exception: https://github.com/pyocd/pyOCD/blob/fa0bd0094e57a05bb2d38100e7cc2dd860912dbf/pyocd/subcommands/gdbserver_cmd.py#L236

This will just stop the gdbservers if running, but doesn't log an error or disconnect the board. Could be that an exception is happening of some sort that causes it to skip over the disconnect.

But given that Session will (should) properly disconnect when an exception is raised up through its context handler, it might be that something happens during the start of disconnect. Like in Board.uninit(). See Session.close() for where this starts: https://github.com/pyocd/pyOCD/blob/fa0bd0094e57a05bb2d38100e7cc2dd860912dbf/pyocd/core/session.py#L543

Thanks again if you can help out with this! 🙏🏽

rgrr commented 1 year ago

Ok, first and fast tries: in gdbserver_cmd.py it never reaches if probe_server...

                while any(g.is_alive() for g in gdbs):
                    sleep(0.1)
                if probe_server:
                    probe_server.stop()

in gdbserver.py run() leaves correctly the loop on exit, but _cleanup() has problems:

    def _cleanup(self):
        LOG.debug("GDB server cleaning up")
        if self.packet_io:
            self.packet_io.stop()
            self.packet_io = None
        if self.semihost:
            self.semihost.cleanup()
            self.semihost = None
        if self.telnet_server:
            self.telnet_server.stop()
            self.telnet_server = None
        if self._swv_reader:
            self._swv_reader.stop()
            self._swv_reader = None
        if self.rtt_server:
            self.rtt_server.stop()
            self.rtt_server = None
        self.abstract_socket.cleanup()

self.telnet_server.stop() seems to stop/crash pyocd, an exception handler around self.telnet_server.stop() did not help.

Further investigation showed, that commenting out everything with telnet_server and semihosting makes the bug sometimes(!) disappear. But most of the time, it does not reach if probe_serverfrom above.

Not sure how to interpret this behavior. Do you have any hints? Also how to actually debug pyocd, because I'm currently debugging with print-statements.

flit commented 1 year ago

Sorry I missed your update!

Debugging pyocd

First, I usually debug pyocd using VS Code these days. The Pydev plugins for Eclipse work equally well (I used this env for a long time). You'll just need to create a project/workspace for pyocd.

Since you are trying to debug a process being started by Eclipse, you'll have to start the Python debug server from within pyocd itself. Install debugpy with pip into your environment.

Then add the following code to pyocd. Inside pyocd.__main__.PyOCDTool.run() is a good location, though it can be done at module level in __main__.py.

import debugpy
debugpy.listen(5678)

# optional
debugpy.wait_for_client()

(Note that I haven't actually tried using debugpy like this with the Eclipse Python plugins, but the backends are the same I think.)

The issue

Maybe it's somehow related to Eclipse still being connected to the telnet server, and there's a race condition causing it to be intermittent? Just a semi-random guess. And that shouldn't cause an unexpected exit, anyway.

You don't see a coredump created, do you?

You could try using the faulthandler module. Set the PYTHONFAULTHANDLER environment variable to a non-empty string or call faulthandler.enable() in __main__.py.

rgrr commented 1 year ago

Hi Chris,

sorry for neglecting this issue. Today I took some time to debug further. The actual problem is that the "GDB PyOCD Debugging" of Eclipse feels responsible for terminating pyocd if it has started it. Meaning: the plugin kills pyocd.

Simplest possible solution out of this would be to add a few lines to main.py:

...
import signal
...
def main():
    signal.signal(signal.SIGTERM, signal.SIG_IGN)
    sys.exit(PyOCDTool().run())