nginxinc / kubernetes-ingress

NGINX and NGINX Plus Ingress Controllers for Kubernetes
https://docs.nginx.com/nginx-ingress-controller
Apache License 2.0

TLS offload is not working for inner-cluster requests #6426

Open evheniyt opened 2 months ago

evheniyt commented 2 months ago

Version

3.6.2

What Kubernetes platforms are you running on?

Kind

What happened?

After updating from 3.2.1 to 3.3.0 (also tried with 3.6.2) we found that TLS offload stopped working for requests that are coming from inside the cluster.

Our CoreDNS is configured to rewrite some DNS names, such as api.example.com, to svc.cluster.local addresses, like this:

rewrite name api.example.com api.test-services.svc.cluster.local

That setup worked fine with version 3.2 of the controller, and we could successfully request https://api.example.com from inside the cluster.

After updating to a newer version of the controller, we found that this stopped working (for both Ingress and VirtualServer). HTTPS requests from outside the cluster still work fine, and HTTP requests work fine inside the cluster, but HTTPS requests inside the cluster do not.

api-pod # curl https://api.example.com -v
* Host api.example.com:443 was resolved.
* IPv6: (none)
* IPv4: 10.110.34.14
*   Trying 10.110.34.14:443...
* connect to 10.110.34.14 port 443 from 10.244.0.30 port 51694 failed: Connection refused
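
A "Connection refused" like the one above means nothing accepted the TCP connection on port 443, so the request never reached a TLS handshake. A small probe along the following lines can confirm that from inside a pod by comparing ports 80 and 443 on the resolved address. This is only a minimal sketch: the function name `probe` and the default port list are assumptions made for illustration, not part of the controller or its tooling.

```python
import socket


def probe(host, ports=(80, 443), timeout=3.0):
    """Resolve host once, then report whether each TCP port accepts a connection.

    Returns (resolved_ipv4_address, {port: "open" | "refused" | "error: <Name>"}).
    """
    addr = socket.gethostbyname(host)  # IPv4 only, like the curl output above
    results = {}
    for port in ports:
        try:
            # A successful connect only proves a listener exists; no TLS is attempted.
            with socket.create_connection((addr, port), timeout=timeout):
                results[port] = "open"
        except ConnectionRefusedError:
            results[port] = "refused"
        except OSError as exc:  # timeouts, unreachable networks, and similar
            results[port] = f"error: {exc.__class__.__name__}"
    return addr, results


# Example call from a pod in the cluster (hostname taken from the report above):
# addr, results = probe("api.example.com")
```

If port 80 reports "open" while port 443 reports "refused", the failure is at the TCP level on the Service address rather than in certificate or TLS configuration.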

We install the controller with the Helm chart and these values:

nginx-ingress:
  controller:
    kind: daemonset
    defaultTLS:
      secret: test-infrastructure/ingress-ssl
    wildcardTLS:
      secret: test-infrastructure/ingress-ssl
    tolerations:
      - key: ci
        effect: "NoSchedule"
    resources:
      requests:
        cpu: 50m
        memory: 250Mi
    ingressClass:
      setAsDefaultIngress: true
    enableCustomResources: true
    enableSnippets: true
    enablePreviewPolicies: true
    enableTLSPassthrough: true
    hostNetwork: true
    config:
      entries:
        redirect-to-https: "True"
        log-format-escaping: "json"
        log-format: '{"remote_addr": "$remote_addr", "remote_user": "$remote_user", "time_local": "$time_local", "request_uri": "$request_uri", "request_method": "$request_method", "status": "$status", "body_bytes_sent": "$body_bytes_sent", "http_referer": "$http_referer", "http_user_agent": "$http_user_agent", "request_length": "$request_length", "request_time": "$request_time", "upstream_addr": "$upstream_addr", "upstream_response_length": "$upstream_response_length", "upstream_response_time": "$upstream_response_time", "upstream_status": "$upstream_status", "request_body": "$request_body"}'
        client-max-body-size: 256m
        ssl-protocols: "TLSv1 TLSv1.1 TLSv1.2 TLSv1.3"
        ssl-ciphers: "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES256-SHA256:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA"
        proxy-read-timeout: "600s"
        server-snippets:
          gzip on;
          gzip_types text/plain text/css text/xml text/javascript application/json application/x-javascript application/xml;
          large_client_header_buffers 4 32k;
        http-snippets: |
          proxy_cache_path /tmp/nginx-cache levels=1:2 keys_zone=api-cache:2m max_size=200m inactive=7d use_temp_path=off;
          proxy_cache_key $scheme$proxy_host$request_uri$request_body;
          proxy_cache_lock on;
        http2: "True"
    service:
      type: ClusterIP

The only thing we added while updating the chart from 0.18.1 to 1.0.0 is hostNetwork: true, without which the ingress wasn't working at all.

Steps to reproduce

No response

Expected behaviour

No response

Kubectl Describe output

No response

Log output

No response


github-actions[bot] commented 2 months ago

Hi @evheniyt thanks for reporting!

Be sure to check out the docs and the Contributing Guidelines while you wait for a human to take a look at this :slightly_smiling_face:

Cheers!

jjngx commented 2 months ago

@evheniyt we will try to reproduce the issue and will get back to you

jjngx commented 2 months ago

@evheniyt what version of kind are you using? Could you also share your kind config?

➜  kubernetes-ingress git:(main) ✗ kind --version
kind version 0.24.0

evheniyt commented 2 months ago

I'm not using kind. Kubernetes version is 1.29

jjngx commented 2 months ago

> I'm not using kind. Kubernetes version is 1.29

> What Kubernetes platforms are you running on?
>
> Kind

ok, thanks

evheniyt commented 2 months ago

Self-hosted on Hetzner

jjngx commented 2 months ago

@evheniyt when you tested NIC v3.6.2, what version of the NIC Helm chart were you using?

evheniyt commented 2 months ago

1.3.2

pdabelf5 commented 1 month ago

Hi @evheniyt

Based on your example

api-pod # curl https://api.example.com -v
* Host api.example.com:443 was resolved.
* IPv6: (none)
* IPv4: 10.110.34.14
*   Trying 10.110.34.14:443...
* connect to 10.110.34.14 port 443 from 10.244.0.30 port 51694 failed: Connection refused

You are making an HTTPS request from inside the cluster directly to the api.example.com/api.test-services.svc.cluster.local Service. Given that you are not routing the request via an Ingress or VirtualServer, do the pods handling these requests perform TLS?

evheniyt commented 1 month ago

hi @pdabelf5,

You are right, api.example.com resolves to the Service of the application. The interesting part is that this application listens only on port 80, yet I could make HTTPS requests and would see a response from the NGINX Ingress Controller. So it looks like all my requests to the internal IP of the Service were going through the controller. I'm not sure why exactly it works like this, maybe because of hostNetwork: true...

That behavior stopped working in 3.3.0.
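
One way to check which component actually answers on the Service address is to look at the `Server` response header, which NGINX normally sets unless it has been hidden or overridden. The following is a minimal sketch under that assumption; the function name `identify_responder` is hypothetical, and certificate verification is deliberately disabled because the goal is only to see who responds, not to validate the certificate.

```python
import http.client
import ssl


def identify_responder(host, port=443, use_tls=True, timeout=5.0):
    """Send HEAD / and return (status_code, value of the Server header or None)."""
    if use_tls:
        # Diagnostic only: skip hostname and certificate checks.
        ctx = ssl.create_default_context()
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        conn = http.client.HTTPSConnection(host, port, timeout=timeout, context=ctx)
    else:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Server")
    finally:
        conn.close()


# Example calls from a pod (hostname taken from this thread):
# identify_responder("api.example.com", 443)                 # HTTPS
# identify_responder("api.example.com", 80, use_tls=False)   # plain HTTP
```

Comparing the header returned on port 80 with the one returned on port 443 would show whether the application or the controller is handling each request.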

pdabelf5 commented 1 month ago

Hi @evheniyt, unfortunately I was not able to reproduce your case using v3.2.1 of NGINX Ingress Controller. However, this may be due to differences between how Kubernetes networking is configured in my environment (I used EKS) and in your bare-metal setup on Hetzner.

Do you need to access https://api.example.com from within the cluster? If so, I will try to find a working example for you.