gravitational / teleport

The easiest, and most secure way to access and protect all of your infrastructure.
https://goteleport.com
GNU Affero General Public License v3.0

Teleport v16.1.0 is not working with MiTM proxy server #44321

Closed by clue1ess 1 week ago

clue1ess commented 1 month ago

We have deployed Teleport v16.1.0 on a Kubernetes cluster in AWS (EKS) behind an ALB and are using a MiTM proxy server in front of the Teleport agent, so all traffic from the Teleport agent to the Teleport server goes through the proxy.

Expected behavior: Teleport agent should be able to connect to the teleport server via MiTM proxy server.

Current behavior: Teleport agent is able to connect to the teleport server via MiTM proxy server but the connection is dropped abruptly.

Logs : Teleport Agent side -

Original Error: *errors.errorString Failed to connect to Proxy Server through tunnel: connection error: desc = "transport: Error while dialing: failed to dial: context deadline exceeded"
Stack Trace:
    github.com/gravitational/teleport/lib/service/connect.go:1226 github.com/gravitational/teleport/lib/service.(*TeleportProcess).newClient
    github.com/gravitational/teleport/lib/service/connect.go:1126 github.com/gravitational/teleport/lib/service.(*TeleportProcess).getConnector
    github.com/gravitational/teleport/lib/service/connect.go:532 github.com/gravitational/teleport/lib/service.(*TeleportProcess).firstTimeConnect
    github.com/gravitational/teleport/lib/service/connect.go:215 github.com/gravitational/teleport/lib/service.(*TeleportProcess).connect
    github.com/gravitational/teleport/lib/service/connect.go:188 github.com/gravitational/teleport/lib/service.(*TeleportProcess).connectToAuthService
    github.com/gravitational/teleport/lib/service/connect.go:81 github.com/gravitational/teleport/lib/service.(*TeleportProcess).reconnectToAuthService
    github.com/gravitational/teleport/lib/service/service.go:2924 github.com/gravitational/teleport/lib/service.(*TeleportProcess).RegisterWithAuthServer.func1
    github.com/gravitational/teleport/lib/service/supervisor.go:588 github.com/gravitational/teleport/lib/service.(*LocalService).Serve
    github.com/gravitational/teleport/lib/service/supervisor.go:313 github.com/gravitational/teleport/lib/service.(*LocalSupervisor).serve.func1
    runtime/asm_amd64.s:1695 runtime.goexit

Teleport Server side -

2024-07-16T11:24:04Z WARN [WEB]       Failed to write ping message. error:[
ERROR REPORT:
Original Error: *tls.permanentError write tcp *******->*****: write: broken pipe
Stack Trace:
    github.com/gravitational/teleport/lib/web/conn_upgrade.go:329 github.com/gravitational/teleport/lib/web.(*websocketALPNServerConn).writeFrame
    github.com/gravitational/teleport/lib/web/conn_upgrade.go:342 github.com/gravitational/teleport/lib/web.(*websocketALPNServerConn).WritePing
    github.com/gravitational/teleport/lib/web/conn_upgrade.go:190 github.com/gravitational/teleport/lib/web.(*Handler).startPing
    runtime/asm_amd64.s:1695 runtime.goexit
User Message: write tcp *******->******: write: broken pipe] web/conn_upgrade.go:193

Additionally, our MiTM proxy server is able to intercept TLS traffic and supports ALPN. We tried the following scenarios:

  - [Working] 15 Agent -> MiTM Proxy -> 15 server
  - [Not working] 15 Agent -> MiTM Proxy -> 16 server
  - [Not working] 16 Agent -> MiTM Proxy -> 16 server

After investigation, we found that it broke at Teleport Server v15.4.0

Steps to reproduce :

  1. Deploy MiTM proxy using the command - docker run -d --rm -v ~/.mitmproxy:/home/mitmproxy/.mitmproxy -p <private_ip_addr>:8081:8080 mitmproxy/mitmproxy mitmdump --set stream_large_bodies=-1
  2. Deploy Teleport Server Version v15.4.0 on EKS cluster behind an ALB
  3. Deploy Teleport Agent v15.4.0 on another cluster and establish a connection with the Teleport server. Make sure the cluster on which the agent is running communicates with the Teleport server via the MiTM proxy.

webvictim commented 1 month ago

What MITM proxy are you using?

Can you confirm exactly which v15/v16 versions of Teleport are affected here?

You mentioned 15.0.1 on Slack - can you try server version 15.4.9 (latest v15) and see whether the issue persists?

clue1ess commented 1 month ago

We are using mitmproxy v10.3.1 (latest).

v15 and v16 both are affected. It works with v15.0.1 but does not work with v15.4.9 and v16.

webvictim commented 4 weeks ago

Update from Slack

It works for v15.2.0. Looks like it broke somewhere between 15.2.0 and 15.4.9. Tested the following scenarios:

  - Works with v15.3.0
  - Works with v15.3.6
  - Works with v15.3.7
  - Does not work with v15.4.0

clue1ess commented 3 weeks ago

@webvictim Updated the description with required details.

zmb3 commented 3 weeks ago

Hmm, I'm not seeing any changes in the diff that look relevant: https://github.com/gravitational/teleport/compare/v15.3.7...v15.4.0

Maybeee https://github.com/gravitational/teleport/pull/42192? @greedy52 WDYT?

greedy52 commented 1 week ago

I can repro this but still debugging the root cause. I have two workarounds ATM before we can patch a proper solution.

Option 1: disable websocket on mitmproxy: mitmdump --set websocket=false

Option 2: on each teleport agent, use the following env var when starting the agent to avoid websocket TELEPORT_TLS_ROUTING_CONN_UPGRADE_MODE=legacy

greedy52 commented 1 week ago

I found the root cause. We did a lousy job masking/unmasking websocket frames. When mitmproxy is interpreting websocket, it does it properly. I will put up the fix soon, and hopefully it will make the patch release next week.

clue1ess commented 3 days ago

Thanks @greedy52 for fixing this. When can we expect the fix in v16 version ?

greedy52 commented 1 day ago

> Thanks @greedy52 for fixing this. When can we expect the fix in v16 version ?

https://github.com/gravitational/teleport/releases/tag/v16.2.1 released yesterday