kubernetes / ingress-nginx

Ingress-NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0
17.01k stars 8.15k forks source link

TLS passthrough fails if Client Hello is fragmented in multiple TCP packets. #11424

Open Dirbaio opened 1 month ago

Dirbaio commented 1 month ago

Since a few days ago, GitHub sends the Client Hello fragmented in a few TCP packets of 1 byte, then one TCP packet with the rest of the Client Hello. I have no idea why, it seems to be random, affecting about 5% of the webhook deliveries.

See this wireshark capture: there's a few 1-byte packets, then the packet with id 5793 sends the rest of the Client Hello:

screenshot-2024-06-03_22-23-27

The TCP proxy is doing a single TCP Read() call to read the Client Hello, here.

https://github.com/kubernetes/ingress-nginx/blob/c8722b225d06bdeb8385114886549e021c453edd/pkg/tcpproxy/tcp.go#L65

This read is receiving only the 1st byte. This causes parser.GetHostname(data) to fail, which causes the TCP connection to be incorrectly routed to nginx, instead of to the TLS passthrough destination. nginx-ingress doesn't have the certificate for this host, causing the TLS handshake to fail.

I'm not sure how to fix this. Seems the proxy should read in a loop until it's received the whole Client Hello?

k8s-ci-robot commented 1 month ago

This issue is currently awaiting triage.

If Ingress contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
longwuyuan commented 1 month ago

@Dirbaio the tcpdump and its analysis looks like a super significant and highly effective debug piece of information.

But this being the githib issues section for the project, its better if the issue report is descriptive and detailed.

The template of a new bug report will show you questions that are asked by the project. Please look at that template. Then edit your issue description and answer those questions. It will help.

Github's response component in a client HELO and the ingress nginx controller added with a TCP proxy sort of require a deep diving expert to even comprehend what problem you are reporting. Reproducing the problem is also a guess work based on the info provided.

/remove-kind bug /kind support /triage needs-information

alshain commented 1 month ago

We are impacted by this through a variation of tldr.fail/

Due to Chrome now enabling hybridized Kyber support by default, ClientHello is always fragmented and our users run into this bug every now and then with requests failing randomly with cert errors.

(Unlikely) If the initial page load fails, the user is greeted by a cert error.

(Much more likely) However, if a background request fails, the webpage will just randomly misbehave: CSS might be missing, some JavaScript might not load, a JS-triggered request might fail etc. These cert errors are only visible in the browser console.

github-actions[bot] commented 5 days ago

This is stale, but we won't close it automatically, just bare in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any question or request to prioritize this, please reach #ingress-nginx-dev on Kubernetes Slack.