omrikiei / ktunnel

A CLI that exposes your local resources to Kubernetes
GNU General Public License v3.0

Tunnel established but does not respond or shows high CPU usage #14

Closed · mrkswrnr closed 4 years ago

mrkswrnr commented 4 years ago

Hi @omrikiei, it's me again 🙈

I am currently trying to expose a simple local HTTP server to the cluster. The service, the deployment, and the ktunnel port forward (28688 -> 28688) are always established correctly and show no errors. Nevertheless, the ktunnel server very often does not forward any requests to the local ktunnel client: requests end in a timeout, although no errors are visible.

Steps to reproduce

  1. Download the latest (brew) binary 1.1.10 for macOS
  2. Execute ktunnel -v expose myapp 8080:8080
  3. Observe:
    INFO[0000] Exposed service's cluster ip is: 10.99.192.110 
    INFO[0000] waiting for deployment to be ready           
    INFO[0003] Waiting for port forward to finish           
    INFO[0004] Forwarding from 127.0.0.1:28688 -> 28688
    Forwarding from [::1]:28688 -> 28688 
    INFO[0004] starting tcp tunnel from source 8080 to target 8080 
  4. Perform an HTTP request to the service from another pod inside Kubernetes via curl -v http://myapp.default.svc.cluster.local:8080/ (the hostname resolves to the exposed service's cluster IP): no response, no incoming request to the local server, no session created
  5. CTRL+C on client side
    INFO[0027] Got exit signal, closing client tunnels and removing k8s objects 
    INFO[0027] Deleting service travellerapp                
    WARN[0027] error reading from stream: %vrpc error: code = Canceled desc = grpc: the client connection is closing 
    INFO[0027] Deleting deployment myapp  

kubectl version

Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T14:25:20Z", GoVersion:"go1.12.7", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.4", GitCommit:"67d2fcf276fcd9cf743ad4be9a9ef5828adc082f", GitTreeState:"clean", BuildDate:"2019-09-18T14:41:55Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}

In a few cases the server does forward the request to the client and the HTTP request is answered successfully. After that, the CPU load of the (local) ktunnel client increases to 500-600% (6 cores). The client still responds, but no further tunnel requests are forwarded and the CPU usage does not decrease.

The CPU usage increases after the client logs:

INFO[0005] b638fa36-d3ad-4cfd-8881-38b5cf9a8cca; new session; connecting to port 8080 

Please let me know how I can help you to reproduce the bug. Thank you! Markus

omrikiei commented 4 years ago

Hi, weird question: are you listening locally on port 8080? I'll try to look into this further over the weekend. Again, thanks for catching this!

mrkswrnr commented 4 years ago

Yes, the local HTTP server listens on 0.0.0.0:8080. curl -v http://localhost:8080 or curl -v http://0.0.0.0:8080 (locally) works as expected.

Thanks for taking a look! I know that the fact that it sometimes works makes it difficult to reproduce the error. I'm digging through the source code a little bit right now. Maybe I can provide some more information.

mrkswrnr commented 4 years ago

Some more information: I can reproduce both cases locally (without involving Docker or Kubernetes, which eliminates some error sources).

I modified the logger to use FullTimestamp to make it easier to follow the outputs.

Steps to reproduce

  1. Start an HTTP server, e.g. via go run github.com/m3ng9i/ran -p 8080
  2. Start the server via: ktunnel server -v
  3. Start the client via: ktunnel client -v --host 127.0.0.1 -s tcp 8081:8080
  4. Make HTTP requests via: curl -v http://127.0.0.1:8081 (one at a time, until no response is received)
  5. Stop curl
  6. Stop client
  7. Stop server

Log output: No response from tunnel

Server

INFO[2020-02-13 21:38:33.885Z] Starting to listen on port 28688             
INFO[2020-02-13 21:38:43.488Z] Opening TCP connection on port 8081          
DEBU[2020-02-13 21:38:53.031Z] Accepted new connection &{{0xc000396200}}; <nil> 
DEBU[2020-02-13 21:38:53.031Z] got new session 4ad76f65-8e98-4be5-90a6-d21deb31380e 
DEBU[2020-02-13 21:38:53.031Z] read GET / HTTP/1.1
Host: 127.0.0.1:8081
User-Agent: curl/7.64.1
Accept: */*

 from socket 
ERRO[2020-02-13 21:39:34.150Z] failed receiving message from stream, exiting: rpc error: code = Canceled desc = context canceled 
INFO[2020-02-13 21:39:34.150Z] Closing connection on port 8081              
DEBU[2020-02-13 21:39:34.150Z] Accepted new connection <nil>; accept tcp [::]:8081: use of closed network connection 

Client

INFO[2020-02-13 21:38:43.486Z] Verbose logging                              
INFO[2020-02-13 21:38:43.487Z] starting tcp tunnel from source 8081 to target 8080 

Log output: High cpu usage

Server

DEBU[2020-02-13 21:45:47.848Z] Accepted new connection &{{0xc000336480}}; <nil> 
DEBU[2020-02-13 21:45:47.848Z] got new session 09838cee-bde3-4621-828d-320787b8a987 
DEBU[2020-02-13 21:45:47.849Z] read GET / HTTP/1.1
Host: 127.0.0.1:8081
User-Agent: curl/7.64.1
Accept: */*

 from socket 
DEBU[2020-02-13 21:45:49.598Z] Accepted new connection &{{0xc000336600}}; <nil> 
DEBU[2020-02-13 21:45:49.598Z] got new session fd06f4f1-7f2a-4820-8abc-5b1634aa71e3 
DEBU[2020-02-13 21:45:49.598Z] read GET / HTTP/1.1
Host: 127.0.0.1:8081
User-Agent: curl/7.64.1
Accept: */*

 from socket 
DEBU[2020-02-13 21:45:55.191Z] Accepted new connection &{{0xc000336700}}; <nil> 
DEBU[2020-02-13 21:45:55.191Z] got new session 390a4840-b1a0-4e88-9f0e-29af72fd93de 
DEBU[2020-02-13 21:46:00.331Z] Accepted new connection &{{0xc000336780}}; <nil> 
DEBU[2020-02-13 21:46:00.331Z] got new session f69a1866-ab24-4cc6-8f73-bf076c097411 
DEBU[2020-02-13 21:46:04.778Z] Accepted new connection &{{0xc000336800}}; <nil> 
DEBU[2020-02-13 21:46:04.778Z] got new session ff449321-2186-4a97-a967-91e45f47571e 
ERRO[2020-02-13 21:46:44.817Z] failed receiving message from stream, exiting: rpc error: code = Canceled desc = context canceled 
INFO[2020-02-13 21:46:44.817Z] Closing connection on port 8081              
DEBU[2020-02-13 21:46:44.818Z] Accepted new connection <nil>; accept tcp [::]:8081: use of closed network connection 

Client

DEBU[2020-02-13 21:45:42.276Z] b1fc863f-a650-45b3-94ce-1e8e232df8af; got session from server: GET / HTTP/1.1
Host: 127.0.0.1:8081
User-Agent: curl/7.64.1
Accept: */*

DEBU[2020-02-13 21:45:44.491Z] 9637e6dc-fba2-4381-81af-53aa579c95f6; got session from server:  
INFO[2020-02-13 21:45:44.491Z] 9637e6dc-fba2-4381-81af-53aa579c95f6; new session; connecting to port 8080 
DEBU[2020-02-13 21:45:44.492Z] started reading from session 9637e6dc-fba2-4381-81af-53aa579c95f6 
DEBU[2020-02-13 21:45:44.492Z] 9637e6dc-fba2-4381-81af-53aa579c95f6; got session from server: GET / HTTP/1.1
Host: 127.0.0.1:8081
User-Agent: curl/7.64.1
Accept: */*

DEBU[2020-02-13 21:45:46.464Z] 961ebdb5-269d-40e9-b847-87d33f701a85; got session from server:  
INFO[2020-02-13 21:45:46.464Z] 961ebdb5-269d-40e9-b847-87d33f701a85; new session; connecting to port 8080 
DEBU[2020-02-13 21:45:46.466Z] started reading from session 961ebdb5-269d-40e9-b847-87d33f701a85 
DEBU[2020-02-13 21:45:46.466Z] 961ebdb5-269d-40e9-b847-87d33f701a85; got session from server: GET / HTTP/1.1
Host: 127.0.0.1:8081
User-Agent: curl/7.64.1
Accept: */*

DEBU[2020-02-13 21:45:47.849Z] 09838cee-bde3-4621-828d-320787b8a987; got session from server:  
INFO[2020-02-13 21:45:47.849Z] 09838cee-bde3-4621-828d-320787b8a987; new session; connecting to port 8080 
DEBU[2020-02-13 21:45:47.850Z] started reading from session 09838cee-bde3-4621-828d-320787b8a987 
DEBU[2020-02-13 21:45:47.850Z] 09838cee-bde3-4621-828d-320787b8a987; got session from server: GET / HTTP/1.1
Host: 127.0.0.1:8081
User-Agent: curl/7.64.1
Accept: */*

Guessing

The CPU usage of the server process increases after the first successful request. The problem seems to be inside the loop in Server.InitTunnel: connection.Read keeps failing with err == io.EOF, which the loop does not treat as a reason to break, so it never exits.
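To illustrate what I mean, here is a minimal sketch of such a read loop (the function and variable names are my assumptions, not the actual ktunnel code): if io.EOF is excluded from the break condition, the loop keeps spinning once the peer has closed its side of the connection.

```go
// Hypothetical simplification of the kind of loop I mean; names and
// structure are assumptions, not the actual ktunnel source.
package tunnel

import (
	"io"
	"net"
)

func pump(connection net.Conn, forward func([]byte)) {
	buff := make([]byte, 4096)
	for {
		br, err := connection.Read(buff)
		if br > 0 {
			forward(buff[:br]) // hand the chunk to the gRPC stream
		}
		if err != nil && err != io.EOF {
			break // only non-EOF errors terminate the loop
		}
		// Once the peer closes the socket, Read keeps returning (0, io.EOF),
		// so the loop never exits and spins at full CPU. Also breaking on
		// io.EOF (after forwarding the final bytes) would avoid the busy loop.
	}
}
```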

Currently I have no clue why the server sometimes does not forward requests (made to port 8081) to the client.

omrikiei commented 4 years ago

Hey, can you check v1.2.0? I made major improvements and found several mutex deadlocks that should be fixed now...

mrkswrnr commented 4 years ago

The tunnel is now always established reliably and there is no more heavy CPU usage. However, there now seem to be problems with closing connections and reusing sockets (HTTP keep-alive connections). I'll create another issue for this as soon as I can describe a reproducible procedure 🙈 This issue seems to be fixed, though. Thanks a lot!

omrikiei commented 4 years ago

I'll have a look at how to make those connections terminate sooner. Thanks for the valuable feedback. I'm also keeping this issue open for the time being...

mrkswrnr commented 4 years ago

Another problem that concerns me: within a session, the buffer is not cleared between two writes. As a result, parts of a previous (longer) data packet are still contained in the following packet.

You could truncate buff to br bytes here, I think. I can prepare a pull request for it, if you like.
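Roughly what I have in mind (a sketch under my assumptions about the surrounding code, not the actual ktunnel source): slice the buffer to the number of bytes actually read before handing it to the stream, so stale bytes from an earlier, longer read are never re-sent.

```go
// Sketch of the change I have in mind; names are assumptions, not the
// actual ktunnel code.
package tunnel

import "net"

func readChunk(conn net.Conn, buff []byte) ([]byte, error) {
	br, err := conn.Read(buff)
	// Only forward the br bytes read in this call; reusing the full buff
	// would leak trailing bytes left over from a previous, longer read
	// into the next message.
	return buff[:br], err
}
```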

omrikiei commented 4 years ago

Yes! nice catch!

omrikiei commented 4 years ago

Hi @mrkswrnr, a fix is pushed in v1.2.1. For future reference: I would love for people to contribute to this project.

mrkswrnr commented 4 years ago

Sorry for not opening the pull request, but despite the fix it still seems to send a few packets incorrectly, and I wanted to check that first. The error seems to occur only with longer server responses. I am still trying to narrow it down.

omrikiei commented 4 years ago

The buffer size was reset to 4096 in this version, so it could be related... I'll try to check it tonight.

omrikiei commented 4 years ago

Hey @mrkswrnr, I just discovered a possible issue with stream persistence and made a pre-release of v1.2.3-alpha. Could you see if that version resolves the issue you've described?

omrikiei commented 4 years ago

closing this for now, kindly re-open if anything reappears...