kubernetes-client / java

Official Java client library for kubernetes
http://kubernetes.io/
Apache License 2.0
3.59k stars 1.92k forks source link

Port forward of java debug port results in immediate server close of web socket connection #234

Closed spotty-dog closed 5 years ago

spotty-dog commented 6 years ago

Hi All,

we hava a java app running in k8s which we manage using a spring boot based webapp running on a local machine. The management webapp talks to k8s using this java api.

I'm trying to connect to the remote app using the API port forward functionaility for debugging purposes. The idea is that the remote app has the java debug port open and our users can attach their Eclipse debugger to the local socket.

I've been testing this locally using minikube and what I'm seeing (with the help of some additional logging) is the server is sending a close (1000) before we even attach to the local socket. It sends the initial two bytes (the little-endian port number according to the javadoc) but then immediately closes.

I've tweaked the port forward example to use our ports and that gives the same result.

However, If I attempt the same thing using kubectl port-forward command then it works.

I've got limited knowledge of websockets but I did see some discussion of the java api being different to the kubectl version due to use of GET instead of POST. Could my issue be related to that is some way?

brendandburns commented 6 years ago

Maybe? It seems really unlikely, as I think that the call handlers in the API server are the same...

However, we do use the WebSockets based API for port-forwarding instead of the SPDY which might be causing the problem.

Does port-forwarding work for any other ports for you? I know that it works in general for my servers.

What version of Kubernetes is this?

spotty-dog commented 6 years ago

I'm testing against k8s v1.9.4 (minikube) and v1.10.0 (remote cluster provisioned using kubeadm).

I've tested basic connectivity on both versions. I ran ncat in the target container and locally to connect to the server via the port forward. That works fine on both ends, everything I type shows up on the other side.

The java remote debug test fails for both versions.

Is there some logging I can enable in k8s to help with debugging?

brendandburns commented 6 years ago

Can you send repro instructions? I will try to repro.

tcpdump inside the minikube machine would also be interesting..

--brendan


From: spotty-dog notifications@github.com Sent: Tuesday, April 24, 2018 6:00:38 AM To: kubernetes-client/java Cc: Brendan Burns; Comment Subject: Re: [kubernetes-client/java] Port forward of java debug port results in immediate server close of web socket connection (#234)

I'm testing against k8s v1.9.4 (minikube) and v1.10.0 (remote cluster provisioned using kubeadm).

I've tested basic connectivity on both versions. I ran ncat in the target container and locally to connect to the server via the port forward. That works fine on both ends, everything I type shows up on the other side.

The java remote debug test fails for both versions.

Is there some logging I can enable in k8s to help with debugging?

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fkubernetes-client%2Fjava%2Fissues%2F234%23issuecomment-383921327&data=02%7C01%7Cbburns%40microsoft.com%7C4589a69968a84ee196d508d5a9e360a3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636601716405429088&sdata=5GKKrsr2BSh8WyjACSQmvcr8m4tro1TpP9%2BxdFtQscQ%3D&reserved=0, or mute the threadhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAFfDgqL3JbupcK056zCqpFtwylZS8sJTks5tryH2gaJpZM4TMv3w&data=02%7C01%7Cbburns%40microsoft.com%7C4589a69968a84ee196d508d5a9e360a3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636601716405439092&sdata=U7l0eZdmPPI4ho%2BJwhpwg8wtyiWCqbDYpjfrvajvf48%3D&reserved=0.

spotty-dog commented 6 years ago

Sorry for the delay on this, I've been working on other things and this was pushed to the back of the queue.

First of all, reproducing the problem should be possible with any Java app running on a similar JRE. We're using openjdk:8-jdk-alpine as the base for our container image. Here's the Java version:.

bash-4.4$ java -version
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (IcedTea 3.6.0) (Alpine 8.151.12-r0)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)
bash-4.4$

We use these JRE parameters to enable remote debugging:

-Xnoagent -Xrunjdwp:transport=dt_socket,server=y,address=<port-of-your-choice>,suspend=n

Unfortunately I can't share my own test setup as our code isn't publicly available.

On to the tcpdump suggestion. I've added a sidecar container in the pod (tcpdump on alpine) as our main container doesn't have sufficient privileges for network capture.

For the test, I made two attempts at remote debugging through the API (now using 1.0.0 release) and saw the same thing both times. The TCP connection is initialised but after two seconds the JRE gets fed up waiting for the JDWP handshake [1] and closes the connection.

Here's the summary of the first attempt after loading the pcap file into wireshark:

11828   290.039214  127.0.0.1   127.0.0.1   TCP 76  50964 → 19770 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=24507639 TSecr=0 WS=128
11829   290.039228  127.0.0.1   127.0.0.1   TCP 76  19770 → 50964 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=24507639 TSecr=24507639 WS=128
11830   290.039238  127.0.0.1   127.0.0.1   TCP 68  50964 → 19770 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=24507639 TSecr=24507639
11831   292.043242  127.0.0.1   127.0.0.1   TCP 68  19770 → 50964 [FIN, ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=24509643 TSecr=24507639
11832   292.044001  127.0.0.1   127.0.0.1   TCP 68  50964 → 19770 [ACK] Seq=1 Ack=2 Win=43776 Len=0 TSval=24509644 TSecr=24509643
11833   292.543556  127.0.0.1   127.0.0.1   TCP 68  50964 → 19770 [FIN, ACK] Seq=1 Ack=2 Win=43776 Len=0 TSval=24510144 TSecr=24509643
11834   292.543568  127.0.0.1   127.0.0.1   TCP 68  19770 → 50964 [ACK] Seq=2 Ack=2 Win=43776 Len=0 TSval=24510144 TSecr=24510144

I then ran another test using kubectl port-forward and this time the JDWP handshake takes place immediately and the debug connection is stable. Here's the wireshark summary:

17295   964.972810  127.0.0.1   127.0.0.1   TCP 76  52406 → 19770 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=25182572 TSecr=0 WS=128
17296   964.972819  127.0.0.1   127.0.0.1   TCP 76  19770 → 52406 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=25182572 TSecr=25182572 WS=128
17297   964.972827  127.0.0.1   127.0.0.1   TCP 68  52406 → 19770 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=25182572 TSecr=25182572
17298   964.972889  127.0.0.1   127.0.0.1   TCP 82  52406 → 19770 [PSH, ACK] Seq=1 Ack=1 Win=43776 Len=14 TSval=25182572 TSecr=25182572
17299   964.972895  127.0.0.1   127.0.0.1   TCP 68  19770 → 52406 [ACK] Seq=1 Ack=15 Win=43776 Len=0 TSval=25182572 TSecr=25182572
17300   964.972936  127.0.0.1   127.0.0.1   TCP 82  19770 → 52406 [PSH, ACK] Seq=1 Ack=15 Win=43776 Len=14 TSval=25182573 TSecr=25182572
17301   964.972943  127.0.0.1   127.0.0.1   TCP 68  52406 → 19770 [ACK] Seq=15 Ack=15 Win=43776 Len=0 TSval=25182573 TSecr=25182573
...etc (lots of [PSH, ACK] from both sides)

Both of those initial [PSH, ACK] packets have JDWP-Handshake in their payload as required by the Java debug protocol.

Based on that, I'm guessing that the JDWP protocol is time sensitive in the handshake phase and a delay in sending the data is causing the JRE to kill the connection.

I'll run the test again using the PortForwardExample class in the kubernetes-client/java/examples folder to get some confidence that it's not just my code getting in the way. I thought I'd share some initial results though.

[1] https://docs.oracle.com/javase/8/docs/technotes/guides/jpda/jdwp-spec.html

spotty-dog commented 6 years ago

Going to test this again with TCP_NODELAY enabled.

spotty-dog commented 6 years ago

That didn't help, it seems the remote connection is closed before the debugger can connect.

brendandburns commented 5 years ago

Closing this as obsolete, since I think it's something about the Java debugger. Feel free to re-open with more details if you need more help here.

spotty-dog commented 5 years ago

Possibly related to this issue, we added a terminal to our management app (basically a copy of what's in k8s dashboard) but found that it was quite laggy. The fix for that was to add a flush() call immediately after this line:

https://github.com/kubernetes-client/java/blob/master/util/src/main/java/io/kubernetes/client/util/WebSocketStreamHandler.java#L88

From memory, it's something to do with Java Piped streams waiting for up to 1 second for their internal buffer to fill. We're carrying a small patch locally for that, I can submit a PR if it would be of interest.

That's a separate issue of course but I'm wondering if that might also fix this. Seems to fit with the Port Forward example not working either as they would both use the underlying websocket handler code.

Will re-open as suggested....if I ever get around to testing it.