kubernetes / ingress-nginx

Ingress NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

TCPProxy for SSL passthrough does not respect wildcard domains #11982

Closed rkevin-arch closed 3 weeks ago

rkevin-arch commented 1 month ago

What happened: I am trying to do SSL passthrough to a backend service running outside of my k8s cluster that handles TLS by itself. I have rules for both somedomain.com and *.somedomain.com. Everything works fine for somedomain.com, but not for subdomains matching *.somedomain.com (for those, ingress-nginx attempted to do TLS termination itself rather than forward to the backend, then returned a 400 The plain HTTP request was sent to HTTPS port error). Everything works if I explicitly list out the subdomains, but not with the wildcard.

What you expected to happen: SSL passthrough works even with wildcard domains

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v1.10.0
  Build:         71f78d49f0a496c31d4c19f095469f3f23900f8a
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.25.3

-------------------------------------------------------------------------------

Kubernetes version (use kubectl version): v1.30.3

Environment: Self-hosted baremetal cluster built with kubeadm, LoadBalancer type services are served with metallb. I am fairly certain the issue has nothing to do with my environment, though, so I'll skip some of the environment stuff.

How to reproduce this issue: Create the following ingress object:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
  name: service-forwarding
  namespace: static-sites
spec:
  ingressClassName: nginx-external
  rules:
  - host: somedomain.com
    http:
      paths:
      - backend:
          service:
            name: service-forwarding
            port:
              name: http
        path: /
        pathType: Prefix
  - host: '*.somedomain.com'
    http:
      paths:
      - backend:
          service:
            name: service-forwarding
            port:
              name: http
        path: /
        pathType: Prefix

Hit the somedomain.com domain. There should be no ingress-nginx logs about it (because the TLS connection is passed through), and TLS traffic should hit the backend service-forwarding service (you can set it up as whatever, maybe just a vanilla webserver that handles HTTPS). This is all expected.

Hit the whatever.somedomain.com domain. TLS is not passed through, and you will see something like this appear in ingress-nginx logs, indicating that ingress-nginx indeed tried to do TLS termination:

192.168.11.132 - - [17/Sep/2024:11:12:48 +0000] "GET / HTTP/2.0" 400 248 "-" "curl/8.9.1" 33 0.003 [static-sites-service-forwarding-http] [] 192.168.10.2:443 248 0.002 400 b6e7121345da0cbd1aecee5b3803774c

On the curl side, you'll see:

rkevin@hadron:~$ curl -kv https://whatever.somedomain.com/
* Host whatever.somedomain.com:443 was resolved.
* IPv6: (none)
* IPv4: SOMEPUBLICIP
*   Trying SOMEPUBLICIP:443...
* Connected to whatever.somedomain.com (SOMEPUBLICIP) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384 / x25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=wrongdomainfrom.ingressnginx.defaultcert.com
*  start date: Sep 11 11:56:55 2024 GMT
*  expire date: Dec 10 11:56:54 2024 GMT
*  issuer: C=US; O=Let's Encrypt; CN=R11
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
*   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://whatever.somedomain.com/
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: whatever.somedomain.com]
* [HTTP/2] [1] [:path: /]
* [HTTP/2] [1] [user-agent: curl/8.9.1]
* [HTTP/2] [1] [accept: */*]
> GET / HTTP/2
> Host: whatever.somedomain.com
> User-Agent: curl/8.9.1
> Accept: */*
> 
* Request completely sent off
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 400 
< date: Tue, 17 Sep 2024 11:12:48 GMT
< content-type: text/html
< content-length: 248
< strict-transport-security: max-age=31536000; includeSubDomains
< 
<html>
<head><title>400 The plain HTTP request was sent to HTTPS port</title></head>
<body>
<center><h1>400 Bad Request</h1></center>
<center>The plain HTTP request was sent to HTTPS port</center>
<hr><center>nginx</center>
</body>
</html>
* Connection #0 to host whatever.somedomain.com left intact

Anything else we need to know:

I believe the problem is that TCPProxy is just doing a simple string match here, which means it won't match wildcard domain names in p.ServerList. I can probably make a PR if you're OK with doing a "split the string on . and match one subdomain label when a wildcard is in play" kind of thing, but I'm not sure if I should be doing that (feels a bit jank) or if there is a library in use by ingress-nginx that already does this, so I'll leave it to the experts. LMK if a PR would help, though.
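For illustration, the "split on . and match one extra label" idea could look something like the sketch below. This is a hypothetical standalone function, not the actual TCPProxy code; the function name and semantics (a `*.` wildcard matches exactly one additional leading label, per the Kubernetes Ingress wildcard-host rules) are assumptions for the sake of the example.

```go
package main

import (
	"fmt"
	"strings"
)

// matchHost reports whether hostname matches pattern, where pattern may
// begin with a "*." wildcard label (e.g. "*.somedomain.com").
// Hypothetical sketch; not the actual ingress-nginx implementation.
func matchHost(pattern, hostname string) bool {
	if pattern == hostname {
		return true
	}
	if !strings.HasPrefix(pattern, "*.") {
		return false
	}
	// "*.somedomain.com" matches exactly one extra leading label,
	// mirroring how Ingress wildcard hosts are specified.
	suffix := pattern[1:] // ".somedomain.com"
	if !strings.HasSuffix(hostname, suffix) {
		return false
	}
	label := strings.TrimSuffix(hostname, suffix)
	return label != "" && !strings.Contains(label, ".")
}

func main() {
	fmt.Println(matchHost("*.somedomain.com", "ghi.somedomain.com")) // true
	fmt.Println(matchHost("*.somedomain.com", "somedomain.com"))     // false: no extra label
	fmt.Println(matchHost("*.somedomain.com", "a.b.somedomain.com")) // false: two extra labels
	fmt.Println(matchHost("somedomain.com", "somedomain.com"))       // true: exact match
}
```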

k8s-ci-robot commented 1 month ago

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
longwuyuan commented 1 month ago

/remove-kind bug

Removing the bug label makes it easier for readers to comment based on data. I have never seen anyone report an ssl-passthrough destination outside the cluster; the feature is not designed for that use case.

It makes no sense to use ssl-passthrough without the force-ssl-redirect annotation. Maybe it works; I don't know.

How can the port for that backend service be named "http" when it is the destination of a passed-through TLS connection?

longwuyuan commented 1 month ago

/kind support
/triage needs-information

rkevin-arch commented 1 month ago

It shouldn't particularly matter whether ssl-passthrough's destination is in or outside the cluster (in my case it is outside, but either should hit this issue). Also, I have force-ssl-redirect globally enabled, but I don't think that matters if you're hitting the HTTPS port directly (that annotation only controls the port 80 -> 443 redirect).
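For reference, "globally enabled" here can be done via the controller's ConfigMap rather than per-Ingress annotations. The fragment below is a hypothetical excerpt (the ConfigMap name and namespace match the standard deploy manifests but may differ in your install); the key only redirects requests arriving on the plain HTTP port and has no effect on connections made directly to the HTTPS port, which is why it is irrelevant to this bug.

```yaml
# Hypothetical excerpt: enable force-ssl-redirect globally.
# The per-Ingress equivalent is the annotation
# nginx.ingress.kubernetes.io/force-ssl-redirect: "true".
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  force-ssl-redirect: "true"
```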

As for the backend service port being HTTP, that is just unfortunate naming, the port is indeed 443 and accepts TLS connections.

I'll write up a repro setup on minikube that does not rely on any of my setup.

longwuyuan commented 1 month ago

Thanks for the updated comments.

Also, you may want to join the Kubernetes slack and discuss this in the ingress-nginx-users channel because there are more engineers & developers there, while there are very few eyes here.

rkevin-arch commented 1 month ago

Here's a repro script that should work with minikube. There is no HTTPS server outside of k8s and nothing weird, and should show that the issue is purely in how TCPProxy handles wildcard hostnames here.

Here's the script:

#!/bin/bash

minikube delete
minikube start

# install and configure ingress-nginx
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/baremetal/deploy.yaml

# we need to enable ssl passthrough on the controller
kubectl patch -n ingress-nginx deploy ingress-nginx-controller --type='json' -p '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--enable-ssl-passthrough=true"}]'

# wait for ingress-nginx to be ready
kubectl rollout status -n ingress-nginx deploy ingress-nginx-controller

# add dummy app that listens on port 443
# we're gonna ignore TLS cert validity for this
# using apache to avoid confusion with nginx
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: https-svc
spec:
  replicas: 1
  selector:
    matchLabels:
      app: https-svc
  template:
    metadata:
      labels:
        app: https-svc
    spec:
      containers:
      - name: https-svc
        image: httpd:2.4
        ports:
        - containerPort: 443
        args:
        - bash
        - -c
        - |
          openssl req -x509 -nodes -days 365 -newkey rsa:2048 -subj /CN=selfsigned/ -keyout /usr/local/apache2/conf/server.key -out /usr/local/apache2/conf/server.crt
          sed -i \
              -e 's/^#\(Include .*httpd-ssl.conf\)/\1/' \
              -e 's/^#\(LoadModule .*mod_ssl.so\)/\1/' \
              -e 's/^#\(LoadModule .*mod_socache_shmcb.so\)/\1/' \
              conf/httpd.conf
          echo 'Hi! If you see this, you have hit the backend webserver properly' > /usr/local/apache2/htdocs/index.html
          httpd-foreground
---
apiVersion: v1
kind: Service
metadata:
  name: https-svc
  labels:
    app: https-svc
spec:
  ports:
  - port: 443
    targetPort: 443
    protocol: TCP
    name: https
  selector:
    app: https-svc
EOF

# wait for service to be ready
kubectl rollout status deploy https-svc

# create ingress object
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
  name: svc-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: 'somedomain.com'
    http: &http
      paths:
      - backend:
          service:
            name: https-svc
            port:
              name: https
        path: /
        pathType: Prefix
  - host: 'abc.somedomain.com'
    http: *http
  - host: 'def.somedomain.com'
    http: *http
  - host: '*.somedomain.com'
    http: *http
EOF

# sleep a bit for good measure

sleep 5s

# grab ingress host/port
HOST=$(minikube ip)
PORT=$(kubectl get svc -o json -n ingress-nginx ingress-nginx-controller | jq '.spec.ports[]|select(.name=="https")|.nodePort')

echo
echo "Trying somedomain.com. This should be successful"
curl -k https://somedomain.com:$PORT --resolve somedomain.com:$PORT:$HOST
echo

echo "Trying abc.somedomain.com. This should be successful"
curl -k https://abc.somedomain.com:$PORT --resolve abc.somedomain.com:$PORT:$HOST
echo

echo "Trying def.somedomain.com. This should be successful"
curl -k https://def.somedomain.com:$PORT --resolve def.somedomain.com:$PORT:$HOST
echo

echo "Trying ghi.somedomain.com. This isn't successful, even though it matches the *.somedomain.com rule, indicating a bug"
curl -k https://ghi.somedomain.com:$PORT --resolve ghi.somedomain.com:$PORT:$HOST
echo

Here is the output from the last couple of lines:

Trying somedomain.com. This should be successful
Hi! If you see this, you have hit the backend webserver properly

Trying abc.somedomain.com. This should be successful
Hi! If you see this, you have hit the backend webserver properly

Trying def.somedomain.com. This should be successful
Hi! If you see this, you have hit the backend webserver properly

Trying ghi.somedomain.com. This isn't successful, even though it matches the *.somedomain.com rule, indicating a bug
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not understand.<br />
Reason: You're speaking plain HTTP to an SSL-enabled server port.<br />
 Instead use the HTTPS scheme to access this URL, please.<br />
</p>
</body></html>

Also, thanks for the info about the Kubernetes slack. If I don't get a response here, I'll probably join there and ask around.

longwuyuan commented 1 month ago

Hey, I tested this and I think the root cause is that SNI is missing from the request, because the requested hostname is not explicitly listed in the Ingress rules.

longwuyuan commented 1 month ago

A developer has to comment on whether the wildcard * is supported for ssl-passthrough.

I think using a wildcard with ssl-passthrough is not supported. I think there is another issue where a developer has clarified this; I will have to check.

rkevin-arch commented 1 month ago

The issue isn't SNI (curl does send SNI properly; that's why I'm using --resolve). The issue is that wildcards are not supported for ssl-passthrough, but I think the fix should be fairly doable. I can take a crack at it if you want.

rkevin-arch commented 1 month ago

Hi, any further input on this issue or the linked PR here? Or do I need to register for an account on the Kubernetes slack to bring this up to relevant folks?

rkevin-arch commented 3 weeks ago

From the Kubernetes slack channel:

currently wildcard is not supported, and ssl-passthrough is targeted for deprecation. On top of that, a wildcard is less secure compared to an explicit FQDN in the list of hosts. So it's not likely we pursue this as-is. The plan is to provide ssl-passthrough in a more native-to-nginx way, or make it happen in the Gateway API implementation.

So this issue will be closed as "won't fix".