omrikiei / ktunnel

A cli that exposes your local resources to kubernetes
GNU General Public License v3.0
925 stars 71 forks source link

PyCharm debugger waiting for process connection #60

Closed ahlae closed 9 months ago

ahlae commented 2 years ago

After starting the PyCharm debugger with IDE host name 0.0.0.0, port 5678 and correct path mappings, then setting up the tunnel and starting the remote application with

import pydevd_pycharm pydevd_pycharm.settrace('localhost', port=5678, stdoutToServer=True, stderrToServer=True)

in the source code (using pip install pydevd-pycharm~=211.7628.24 as the PyCharm run/debug configuration tells me) the PyCharm debugger doesn't start the debugging process properly. It keeps showing

Waiting for process connection...

When stopping the remote application with CTRL-C the stdout/stderr is nonetheless transmitted to PyCharm and appears in the debug console:

Traceback (most recent call last): File "/opt/app-root/./main.py", line 9, in pydevd_pycharm.settrace('127.0.0.1', port=5678, stdoutToServer=True, stderrToServer=True, suspend=False) File "/home/precisionlims/.local/lib/python3.9/site-packages/pydevd.py", line 1689, in settrace _locked_settrace() Waiting for process connection... Server stopped.

The tunnel is logging:

INFO[2021-07-16 11:08:45.939] new connection port=5678 session=fddfa455-d934-4fd2-8588-aed7d5e48631 INFO[2021-07-16 11:08:45.970] closed session session=fddfa455-d934-4fd2-8588-aed7d5e48631

Like if the connection was opened and closed at the same time.

So there is a successful connection attempt made. What might be the issue here?

omrikiei commented 2 years ago

Hi @ahlae - could you reproduce this issue with verbose mode in ktunnel so that debug logs are also seen? I'm assuming that the pydevd version you installed is the one suggested by your pycharm version, and that 'local' remote debugging works

ahlae commented 2 years ago

Hi @omrikiei, thank you for your fast response. <3

Yes, 'local' remote debugging works. I launch a docker container running the app, install the pydevd_pycharm package and the connection back to PyCharm works without issues. The PyCharm Debug console shows me:

Starting debug server at port 5,678 Use the following code to connect to the debugger: import pydevd_pycharm pydevd_pycharm.settrace('0.0.0.0', port=5678, stdoutToServer=True, stderrToServer=True) Waiting for process connection... Connected to pydev debugger (build 211.7628.24) Hi, PyCharm Waiting for process connection...

The command I use to install the package in the local container as well as in the cluster is: pip install pydevd-pycharm~=211.7628.24

This pydevd version is what I'm told to install by PyCharm in the Run/Debug configuration. I'm using the following PyCharm version:

PyCharm 2021.1.3 (Professional Edition) Build #PY-211.7628.24, built on June 30, 2021 Runtime version: 11.0.11+9-b1341.60 amd64 VM: OpenJDK 64-Bit Server VM by JetBrains s.r.o. Linux 5.12.14-arch1-1 Registry: debugger.watches.in.variables=false Non-Bundled Plugins: com.google.gct.core (21.6.1) Current Desktop: GNOME

The app image I'm trying to debug on kubernetes is based on alpine:latest.

When I use ktunnel with verbose mode on it outputs:

ktunnel inject deployment -n ... ... 5678 --verbose INFO[0000] Injecting tunnel sidecar to ... INFO[0000] ProgressDeadlineInSeconds is currently 600s. It may take this long to detect a deployment failure. INFO[0000] Waiting for deployment to be ready
INFO[0030] deployment "..." successfully rolled out INFO[0030] port forwarding to ... INFO[0030] Waiting for port forward to finish
INFO[0030] Forwarding from 127.0.0.1:28688 -> 28688 Forwarding from [::1]:28688 -> 28688 INFO[2021-07-17 11:07:53.950] starting tcp tunnel from source 5678 to target 5678 DEBU[2021-07-17 11:07:54.024] attempting to receive from stream
INFO[2021-07-17 11:10:26.378] new connection port=5678 session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.379] received 208 bytes from server session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.379] wrote 208 bytes to conn session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.379] started reading conn session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.379] attempting to receive from stream
DEBU[2021-07-17 11:10:26.380] read 15 bytes from conn error="" session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.380] received 0 bytes from server session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.380] attempting to receive from stream
DEBU[2021-07-17 11:10:26.380] wrote 15 bytes to session buf error="" session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.380] received 40 bytes from server session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.380] read 15 from buffer out of 15 available session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.380] sending 15 bytes to server close=false session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.380] sent 15 bytes to server close=false session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.380] wrote 40 bytes to conn session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.380] attempting to receive from stream
DEBU[2021-07-17 11:10:26.382] received 214 bytes from server session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.382] wrote 214 bytes to conn session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.382] attempting to receive from stream
DEBU[2021-07-17 11:10:26.382] received 0 bytes from server session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.382] attempting to receive from stream
DEBU[2021-07-17 11:10:26.382] received 155 bytes from server session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.382] wrote 155 bytes to conn session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.382] attempting to receive from stream
INFO[2021-07-17 11:10:26.411] closed session session=872c6352-b9d9-4168-b949-53e08e273fc4 DEBU[2021-07-17 11:10:26.411] attempting to receive from stream

When starting ktunnel it sits waiting at

DEBU[2021-07-17 11:07:54.024] attempting to receive from stream

The stuff beginning with "INFO[2021-07-17 11:10:26.378] new connection" only shows up when interrupting the app with CTRL-C.

I thought it might be a problem with Python3 in the container compiled by a different GCC compiler

Python 3.9.5 (default, May 12 2021, 20:44:22) [GCC 10.3.1 20210424] on linux

and on my laptop

Python 3.9.5 (default, May 24 2021, 12:50:35) [GCC 11.1.0] on linux

But the local container works.

The connection to the cluster is routed through Wireguard VPN since it is a private cluster. Sending TLS traffic through a VPN connection sometimes create issues with packet size and fragmentation. To remediate this the MTU is set to 1380. kubectl proxy, ksync, skaffold and telepresence are working perfectly fine.

Thanks again for your time! :)

omrikiei commented 2 years ago

Hi @ahlae - one more question: How is remote debugging triggered on the application? is it during application init?

ahlae commented 2 years ago

I tried different options. Since I'm setting this debugging workflow up for the first time I put the settrace after importing the pydevd module during application init. Then I tried different positions in functions or 'main'. All with the same result as described above.

omrikiei commented 2 years ago

Well, it's very likely that the app is being initialized before the tunnel is established. There are two things I suggest trying:

  1. Trigger set_trace from a web endpoint instead of from init
  2. If you must do it from init, set a 30 seconds delay
ahlae commented 2 years ago

Hi @omrikiei - I can explain my workflow for setting up the debugging session because I think the tunnel is readily established when starting the app:

Because I wanted to test things out before building it into the real app I'm using the PyCharm python project example which has only one file called main.py:

# This is a sample Python script.

# Press Shift+F10 to execute it or replace it with your code.
# Press Double Shift to search everywhere for classes, files, tool windows, actions, and settings.

import pydevd_pycharm
pydevd_pycharm.settrace('localhost', port=5678, stdoutToServer=True, stderrToServer=True)

def print_hi(name):
    # Use a breakpoint in the code line below to debug your script.
    #import pydevd_pycharm
    #pydevd_pycharm.settrace('localhost', port=5678, stdoutToServer=True, stderrToServer=True)
    print(f'Hi, {name}')  # Press Ctrl+F8 to toggle the breakpoint.

# Press the green button in the gutter to run the script.
if __name__ == '__main__':
    #import pydevd_pycharm
    #pydevd_pycharm.settrace('localhost', port=5678, stdoutToServer=True, stderrToServer=True)
    print_hi('PyCharm')

# See PyCharm help at https://www.jetbrains.com/help/pycharm/

I follow the subsequent steps:

  1. I start the debugger in PyCharm by starting a Python Debug Server with the already mentioned settings until it shows

Waiting for process connection...

  1. After that I run in the command console ktunnel inject deployment -n ... ... 5678 --verbose
  2. I watch the deployment on Kubernetes which is getting injected until everything is running again and the ktunnel shows

DEBU[2021-07-17 11:07:54.024] attempting to receive from stream

  1. Then I copy over main.py into the respective pod in the deployment
  2. I exec into the pod and run pip install pydevd-pycharm~=211.7628.24
  3. I run netstat and check if it's showing me port 5678 as available
  4. Then I run python3 ./main.py
  5. PyCharm doesn't respond until I CTRL-C. Then PyCharm shows the stdout of the python3 ./main.py process

I think your recommendation to insert a 30 seconds delay is already respected while following these steps. As a last resort I'm going to try your first recommendation and setup the debugger in the real app with a web endpoint.

ahlae commented 2 years ago

Sadly there seems to be no chance for me to get this running. I tried a web endpoint in the application which produces the same results as described above. I even installed an Ubuntu Desktop into the cluster network (to be able to use PyCharm) to rule out an issue with the VPN connection but without success.

Btw I'm running this on EKS with Kubernetes 1.19.

omrikiei commented 2 years ago

Hi @ahlae, have you tried using the expose subcommand instead of inject? then in set_trace you should input the exposed IP or service name that was created. additionally could you try sending the suspend=False argument as seen in the example?

ahlae commented 2 years ago

Hi @omrikiei - I tried what you proposed, expose and suspend=False. Always the same result:

Starting debug server at port 5,678 Use the following code to connect to the debugger: import pydevd_pycharm pydevd_pycharm.settrace('0.0.0.0', port=5678, stdoutToServer=True, stderrToServer=True) Waiting for process connection... Traceback (most recent call last): File "/opt/app-root/./main.py", line 9, in pydevd_pycharm.settrace('pycharm', port=5678, stdoutToServer=True, stderrToServer=True, suspend=False) File "/home/precisionlims/.local/lib/python3.9/site-packages/pydevd.py", line 1689, in settrace _locked_settrace( File "/home/precisionlims/.local/lib/python3.9/site-packages/pydevd.py", line 1764, in _locked_settrace Waiting for process connection... Server stopped.

I will look into finding some time and setting up a go debugging session. I really want this to get working. Your tool is super useful. :)

omrikiei commented 2 years ago

Hello @ahlae, wanted to know if you've made any progress with a debugging session, and if this issue is still relevant

ahlae commented 2 years ago

Hi @omrikiei,

yes, this is still an issue. Atm I'm working on several issues that popped up and I didn't had time to resolve it. For now we kept developing and debugging locally. If you want to close the issue we can do this and I'll write you a PM when I've finished.

omrikiei commented 2 years ago

@ahlae hi, is this issue still relevant?

ahlae commented 2 years ago

@omrikiei Hi, yes, but I'm too busy to solve it right now. I'm going to close the issue and reopen it when continuing on it.

mbugert commented 2 years ago

For the record, I'm experiencing the same issue with PyCharm 2021.2.3 (i.e., new connection only appears when CTRL+C'ing the to-be-debugged script on the remote machine, but not beforehand).

Apart from trying to tunnel the debugger, I also tried tunneling a locally running flask hello world site into the deployment, and that worked flawlessly. So apparently, the problem could be with pycharm or the EKS cluster.

chmielas commented 2 years ago

I also tested this with pycharm Build #PY-213.5744.248 and the connection to the debugger starting only when pressing CTRL+C in the python console. The problem occurs not only on EKS but also on the local minikube cluster (test was performed with both minikube versions 1.22.2 and 1.15.0)

cedricspaceiq commented 2 years ago

FYI, I'm facing the same issue.

Ahlaee commented 2 years ago

@cedricspaceiq I'm facing the issue on PC. I don't think this is connected to an Apple M1.

BUCKFAE commented 1 year ago

Same issue for me, can't even get the example to run. PyCharm: 2022.2.1

LISTERINE commented 1 year ago

I'm seeing this as well. PyCharm 2022.2.3 (Professional Edition) Build #PY-222.4345.23, built on October 10, 2022

sillyfrog commented 1 year ago

I'm having a similar issue, looking at the verbose output, and testing with nc, I think the issue is the pydevd_pycharm module waits to get data back from the server. However it appears ktunnel does not actually connect until the first packet comes from the "client".

How I have replicated:

TLDR; ktunnel needs to connect to the local machine and start reading when the TCP session is started, not on the first packet. But this could also be an issue with k8s, I'm unsure.

My testing is on AWS EKS with k8s v1.22.

I've also tried the PyCharm example using inject, and the same result (it doesn't actually connect through, and using nc, I needed to initiate traffic from the k8s side).

sillyfrog commented 1 year ago

Further to the above, the logs that come through to the ktunnel client are not as "up to date" as the actual server logs (using kubectl logs), what I have found is the server does acknowledge the new connection when the initial connection is made, but does not actually make the TCP connection to the PC until a pod has sent at least some data.

For example, on the PC I ran: nc -v -l -p 5678, and then entered some data to be sent to the client that connects.

And then on a pod I ran: nc -v test.default.svc.cluster.local 5678, and the ktunnel server logs had:

DEBU[2023-02-05 22:07:07.127] Accepted new connection &{{0xc00048e600}}     error="<nil>"
INFO[2023-02-05 22:07:07.127] new connection                                session=297b6035-d82a-4811-8eb7-1ef0b6d6c6ba

I then sent a single character (and newline) from the pod:

DEBU[2023-02-05 22:07:12.231] read 2 bytes from conn                        error="<nil>"
DEBU[2023-02-05 22:07:12.231] sending 2 bytes to client                     close=false session=297b6035-d82a-4811-8eb7-1ef0b6d6c6ba
DEBU[2023-02-05 22:07:12.231] sent 2 bytes to client                        close=false session=297b6035-d82a-4811-8eb7-1ef0b6d6c6ba
DEBU[2023-02-05 22:07:12.252] received 19 bytes from client                 close=false session=297b6035-d82a-4811-8eb7-1ef0b6d6c6ba
DEBU[2023-02-05 22:07:12.252] writing 19 bytes to conn                      session=297b6035-d82a-4811-8eb7-1ef0b6d6c6ba
DEBU[2023-02-05 22:07:12.252] wrote 19 bytes to conn                        session=297b6035-d82a-4811-8eb7-1ef0b6d6c6ba

So the issue is the same, but it does look like there's an issue in ktunnel where it only attempts to connect through when data first comes in from the pod, not on the initial setup of the TCP connection.

sillyfrog commented 1 year ago

I think this fixes the issue. I'll do more testing and put in a PR once confirmed:

diff --git a/pkg/server/server.go b/pkg/server/server.go
index 266bc49..27495e9 100644
--- a/pkg/server/server.go
+++ b/pkg/server/server.go
@@ -125,6 +125,8 @@ func ReceiveData(conf *Config, stream pb.Tunnel_InitTunnelServer) {

 func readConn(ctx context.Context, conf *Config, session *common.Session, sessions chan<- *common.Session) {
        conf.log.WithField("session", session.Id.String()).Info("new connection")
+       conf.log.Debug("starting session with sessions <- session")
+       sessions <- session

        for {
LISTERINE commented 1 year ago

@sillyfrog thanks for opening a PR. @omrikiei Any chance we can get this merged?