microsoft / debugpy

An implementation of the Debug Adapter Protocol for Python
https://pypi.org/project/debugpy/

When using debugger in remote environment it hangs and never successfully breaks when reaching a breakpoint #1272

Open jmusiel opened 1 year ago

jmusiel commented 1 year ago


Environment data

Actual behavior

After launching the debugger, the code runs until it reaches a breakpoint. At that point the debugger never enters the breakpoint state and instead hangs indefinitely.

This only affects the debugger in the remote instance; it works just fine on my local Ubuntu install in WSL.

This behavior also seems intermittent: sometimes, if I run "pkill node" to kill the remote VS Code server, the debugger works fine the next time I run it, but never again after I close it.

Expected behavior

Debugger should enter breakpoint state without me having to kill the remote vscode server and reconnect.

Steps to reproduce:

  1. Attach to remote container
  2. Set this debug configuration:

         "configurations": [
             {
                 "name": "Python: Current File",
                 "type": "python",
                 "request": "launch",
                 "program": "${file}",
                 "cwd": "${fileDirname}",
                 "justMyCode": false,
                 "console": "integratedTerminal",
                 "args": []
             }
         ]

  3. Run the debugger with a breakpoint

jmusiel commented 1 year ago

Also, when I add the line "logToFile": true to my debug configuration and reproduce the issue, I get the following log file (I don't see anything unusual in it, but I am pretty unfamiliar with these logs): debugpy.zip

jmusiel commented 1 year ago

One additional thing I've noticed, which may not be relevant: when I try to stop the debugger using the red square button in the VS Code GUI (while it is hanging as I've described), it often does not quit the first few times, and VS Code shows a timeout notification instead. If I keep pressing the button, it eventually quits.

While the debugger is hanging like this, it also seems to disable some of my other extensions. For instance, the IntelliSense function suggestion menus stop appearing until I successfully kill the debugger.

int19h commented 1 year ago

What is the local OS running the VSCode client in this scenario?

jmusiel commented 1 year ago

Windows 10 is the local OS, but like I said: it works just fine when connected to my local WSL, so it doesn't really seem to be a local issue.

int19h commented 1 year ago

"clientOS":"unix" in the log threw me off, but I realize now that it is correct in this scenario since the debugger itself is running entirely on the server side (and ditto for WSL).

It would be very interesting to see the pydevd logs. It's not clear to me why "logToFile" didn't enable those. Normally it should, but this is implemented on the VS Code side and might be hitting some issue there. Try setting the DEBUGPY_LOG_DIR environment variable via "env" in your debug config instead, and see if that produces debugpy.server*.log and debugpy.pydevd*.log. The latter is the one that would have the most information to help diagnose this.
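To check whether the expected logs were actually written, something like the following could be run on the remote side after a debug session. It is only a sketch: the two glob patterns come from the file names mentioned above, while the directory /tmp/debugpy-logs and the helper name find_debugpy_logs are hypothetical. The matching entry in the debug config would then be "env": {"DEBUGPY_LOG_DIR": "/tmp/debugpy-logs"}, with whatever path you choose.

```python
import glob
import os

# File name patterns as mentioned in the comment above.
LOG_PATTERNS = {
    "server": "debugpy.server*.log",
    "pydevd": "debugpy.pydevd*.log",
}


def find_debugpy_logs(log_dir):
    """Return {kind: [sorted matching paths]} for each pattern in LOG_PATTERNS."""
    found = {}
    for kind, pattern in LOG_PATTERNS.items():
        found[kind] = sorted(glob.glob(os.path.join(log_dir, pattern)))
    return found


if __name__ == "__main__":
    # Hypothetical default; use the same directory as in the debug config.
    log_dir = os.environ.get("DEBUGPY_LOG_DIR", "/tmp/debugpy-logs")
    for kind, paths in find_debugpy_logs(log_dir).items():
        print(f"{kind}: {len(paths)} file(s)")
        for path in paths:
            print(f"  {path} ({os.path.getsize(path)} bytes)")
```

If the pydevd list comes back empty, the variable most likely did not reach the debuggee process.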

jmusiel commented 1 year ago

Oh, my mistake. I think I just missed those log files for some reason. Here are all 4 log files: debugpy.zip

int19h commented 1 year ago

Very interesting. The debug server did detect the breakpoint correctly, and sent the corresponding message to the adapter; however, the adapter never received it (or any further message from the client). The connection between the two is TCP localhost-to-localhost; is there anything specific to your remote environment that might affect it, e.g. by silently terminating the connection?

jmusiel commented 1 year ago

I did eventually manually terminate this session, but that was after waiting 10-20 minutes past the point that I know it had reached the breakpoint (I had various print statements in my script letting me know where it had reached).

I'm not sure what you mean by anything specific to my remote environment that might affect it, but nothing comes to mind. It's basically just Ubuntu in a docker container running in kubernetes.

The odd thing is that this behavior has only begun happening in the last few months, and pretty much the same setup functioned normally all of last year.

int19h commented 1 year ago

I mean things like local network configuration; basically any OS policy that might prematurely terminate TCP connections.

jmusiel commented 1 year ago

I can't think of anything, but I'm not super familiar with TCP so I can't say for sure. But these are basically fresh docker containers running Ubuntu with a few python packages and cuda installed. Not a lot else going on.

jmusiel commented 1 year ago

Is there anything that might cause the debugger to run extremely slowly when sending or receiving this signal? Like anything obvious I could be looking out for in the process monitor?

jmusiel commented 1 year ago

Is there anything else I could do to test/log anything related to this issue?

jmusiel commented 1 year ago

I have noticed that in some cases, the debugger does seem to finally attach after a very long wait. I made a simple script that just reads a file and prints a message. It normally takes a couple of seconds to run, but under the debugger it attached only after about a 10-minute wait.

In the past I have waited much longer than 10 minutes with more complex code and it never attached, but perhaps in those cases it is also just slow and would take hours instead?

If it isn't actually stuck, but is just extremely slow to attach, is there any way to examine that behavior and figure out what is causing the issue? Once it has attached it seems like I can skip to different breakpoints no problem, it's just an issue when it tries to attach for the first time each run.

int19h commented 1 year ago

It would seem to imply that either the debugger helper threads and/or processes don't get enough execution time to keep up with the debuggee, or network connections are throttled somehow and this is just how long it takes for one to go through. I'm afraid this is still too vague to think of any specific ways to debug it; but if it is indeed a connectivity issue, it might be possible to at least confirm it by running some simple code that just tries to connect from one process to another using TCP - if that turns out to have a similar slowdown, it might be easier to narrow the root cause down from there.
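A minimal sketch of such a connectivity check, to be run inside the container, might look like this. It is a simplification of the suggestion above: the echo server runs in a thread of the same process rather than in a separate process, and the function names and the 30-second timeout are arbitrary choices, not anything debugpy itself uses.

```python
import socket
import threading
import time


def echo_once(server_sock):
    """Accept one connection and echo data back until the peer closes."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def measure_loopback(rounds=20, host="127.0.0.1"):
    """Return (connect_seconds, [round_trip_seconds, ...]) for a local TCP echo."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind((host, 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    server = threading.Thread(target=echo_once, args=(server_sock,), daemon=True)
    server.start()

    start = time.perf_counter()
    client = socket.create_connection((host, port), timeout=30)
    connect_time = time.perf_counter() - start

    round_trips = []
    with client:
        for i in range(rounds):
            payload = f"ping {i}\n".encode()
            start = time.perf_counter()
            client.sendall(payload)
            received = b""
            # Read until the full payload has been echoed back.
            while len(received) < len(payload):
                chunk = client.recv(4096)
                if not chunk:
                    raise ConnectionError("connection closed before echo completed")
                received += chunk
            round_trips.append(time.perf_counter() - start)

    server.join(timeout=5)
    server_sock.close()
    return connect_time, round_trips


if __name__ == "__main__":
    connect_time, round_trips = measure_loopback()
    print(f"connect: {connect_time * 1000:.2f} ms")
    print(f"slowest round trip: {max(round_trips) * 1000:.2f} ms")
    print(f"fastest round trip: {min(round_trips) * 1000:.2f} ms")
```

On a healthy system, localhost connects and round trips would be expected to finish in a small fraction of a second. Delays of seconds or a timeout here would point toward a networking or scheduling problem in the container rather than toward debugpy.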

One other thing that I didn't notice earlier - you mentioned that the target environment is Docker. Are you setting that up for debugging manually, or are you using the Docker extension (https://github.com/microsoft/vscode-docker)? The latter does some complicated tricks of its own to manage debugpy connection, so that would be one other possible failure point.