kubernetes / ingress-nginx

Ingress NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

ingress-nginx not capturing origin IP with ExternalName service types #11753

Open boarder981 opened 2 months ago

boarder981 commented 2 months ago

What happened:

I need to get the real IP of incoming requests, so I have added the following to my ingress-nginx configMap:

use-forwarded-headers: "true"
enable-real-ip: "true"
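
For context, these keys live in the controller's ConfigMap. A minimal sketch of the full object, assuming a default install where the ConfigMap is named ingress-nginx-controller in the ingress-nginx namespace (both names depend on how the controller was deployed):

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller    # assumed name; must match the controller's --configmap flag
  namespace: ingress-nginx          # assumed namespace
data:
  use-forwarded-headers: "true"     # trust X-Forwarded-* headers arriving from the LB
  enable-real-ip: "true"            # globally enable the real_ip module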

Requests going to ingresses bound to services with type ClusterIP are showing the real public IP address of the origin, as expected. These are requests that go directly to app pod(s) running on that same cluster. Example log:

20.X.X.X - - [07/Aug/2024:16:01:53 +0000] "GET /status HTTP/2.0" 200 20 "http://X.X.X.X/status" "health-agent" 217 0.005 [my-app-svc] [] 192.168.2.22:8443 20 0.005 200 034df34486c5fa63fd82a4d003043b26

However, requests going to ingresses bound to services with type ExternalName are showing either the K8s node IP or default gateway of the pod (192.168.1.1). I am using ExternalName services to proxy API requests from the internet to apps running on a non-public-facing K8s cluster. Here are some nginx log examples:

192.168.1.1 - - [09/Aug/2024:17:51:27 +0000] "GET /my/app/path HTTP/2.0" 202 172 "-" "fakeagent" 47 1.573 [my-backend-app-proxy-svc] [] 10.219.X.X:443 172 1.573 202 4bb02a257cc3b62cae21c5fec93e95b3

10.219.X.X - - [09/Aug/2024:17:51:32 +0000] "GET /my/app/path HTTP/2.0" 202 172 "-" "fakeagent" 47 4.199 [my-backend-app-proxy-svc] [] 10.219.X.X:443 172 4.200 202 09eec90186a60bcc376cb7a894bff048

The service definition is very basic:

---
kind: Service
apiVersion: v1
metadata:
  name: my-backend-app-proxy-svc
  namespace: my-ns
spec:
  type: ExternalName
  externalName: my-backend-app.int.company.com

The ingress definitions used for both the ClusterIP and ExternalName services are very similar. Here is an example of one that uses the ExternalName service:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx-external
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
    nginx.ingress.kubernetes.io/upstream-vhost: my-backend-app.int.company.com
    nginx.ingress.kubernetes.io/use-regex: "true"
  name: my-backend-app-ingress
  namespace: my-ns
spec:
  rules:
  - host: api.company.com
    http:
      paths:
      - backend:
          service:
            name: my-backend-app-proxy-svc
            port:
              number: 443
        path: /my/app/path$
        pathType: ImplementationSpecific
  tls:
  - hosts:
    - api.company.com
    secretName: my-tls-secret

The ClusterIP ingresses have basically the same format, except that they don't specify a path or use the nginx.ingress.kubernetes.io/upstream-vhost annotation. I can't identify any other major differences.
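
For comparison, here is a sketch of the ClusterIP variant, reconstructed from the description above (the name and port are illustrative):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx-external
  name: my-app-ingress              # illustrative name
  namespace: my-ns
spec:
  rules:
  - host: api.company.com
    http:
      paths:
      - backend:
          service:
            name: my-app-svc        # ClusterIP service fronting pods on this cluster
            port:
              number: 8443
        # no path set; pathType ImplementationSpecific allows omitting it
        pathType: ImplementationSpecific
  tls:
  - hosts:
    - api.company.com
    secretName: my-tls-secret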

What you expected to happen:

Incoming requests to ingresses using the ExternalName service log the real origin IP

NGINX Ingress controller version:

NGINX Ingress controller
  Release:       v1.8.1
  Build:         dc88dce9ea5e700f3301d16f971fa17c6cfe757d
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6

Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.2", GitCommit:"5835544ca568b757a8ecae5c153f317e5736700e", GitTreeState:"clean", BuildDate:"2022-09-21T14:33:49Z", GoVersion:"go1.19.1", Compiler:"gc", Platform:"darwin/amd64"}

Environment:

Azure AKS (Kubernetes 1.27.9)

Other:

Note that all these requests are working properly and making it to the appropriate backends. However, I need the real IP in order to set up IP whitelisting with the nginx.ingress.kubernetes.io/whitelist-source-range annotation.
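
For reference, the end state I'm after is a minimal annotation like this, with a hypothetical allowed range (the real value would hold the origin CIDRs to permit):

metadata:
  annotations:
    # hypothetical CIDR; only effective if nginx sees the real client IP
    nginx.ingress.kubernetes.io/whitelist-source-range: "20.0.0.0/16"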

Any help would be greatly appreciated. Thank you.

k8s-ci-robot commented 2 months ago

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

longwuyuan commented 2 months ago

/remove-kind bug
/kind support

longwuyuan commented 2 months ago

Just to state the obvious, nothing stops anyone from using a service of --type ExternalName in the way you describe. But AFAIK that is not the typical use case that service --type was intended for: it works for the bouncing you need, but it is not built with components similar to a router or a firewall. There are hardly any docs that explain the intricate Layer 4 and Layer 7 happenings in the use case you described when it comes to retaining real client information across the hops involved.

Gacko commented 2 months ago

Can you please check if:

a) the header is being passed to your external service, and
b) your external service is correctly configured to accept this header from this specific source?

Normally you tell your service from which IP ranges to accept such information as otherwise it could easily be spoofed by anyone. In NGINX e.g. this is handled via the Real IP module (https://nginx.org/en/docs/http/ngx_http_realip_module.html). There you can define from which IP addresses to accept the real IP.
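
On the ingress-nginx side, that trust boundary corresponds to the proxy-real-ip-cidr ConfigMap key, which defines the LB/proxy addresses whose forwarded client information is trusted when forwarded headers or PROXY protocol are in use. A minimal sketch, with an illustrative CIDR:

data:
  enable-real-ip: "true"
  # illustrative value: the address range of the LB / proxies to trust;
  # defaults to 0.0.0.0/0 (trust everything) if unset
  proxy-real-ip-cidr: "10.0.0.0/8"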

In your case these IP addresses might not be the same as your pod or the node it is running on, depending on your network setup.

boarder981 commented 2 months ago

> /remove-kind bug
> /kind support
>
> * If ssl-passthrough is configured then the use of the backend-protocol annotation makes no sense. The connection is passing through the controller, instead of a connection terminating at the controller. Am I misunderstanding or missing something here?
>
> * And since the backend-service is of --type ExternalName, I am not entirely clear as to what client information gets retained across the network-hops and what client information does not get retained. At the very least, my assumption is that you would need to have proxy-protocol enabled on the hops involved. Like proxy-protocol on the LB in front of the controller and also proxy-protocol on the controller. Glad to learn more if I missed something.

For what it's worth, with PROXY protocol enabled on the controller I only get broken header errors (nginx receives raw TLS handshake bytes where it expects a PROXY protocol header):

2024/08/06 17:44:03 [error] 1929#1929: *194902368 broken header: "<binary TLS handshake data>" while reading PROXY protocol, client: 192.168.1.1, server: 0.0.0.0:443

As far as I know, standard Azure load balancers don't support proxy protocol, so I assume this is why.
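
For completeness, enabling PROXY protocol on the controller side is a single ConfigMap key, but it only helps if the LB in front actually speaks PROXY protocol; otherwise TLS handshakes arrive as raw bytes and produce exactly the broken header errors above. A sketch:

data:
  # requires an LB that actually prepends a PROXY protocol header;
  # a plain TCP/TLS LB will trigger "broken header" errors like the one above
  use-proxy-protocol: "true"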


> Normally you tell your service from which IP ranges to accept such information as otherwise it could easily be spoofed by anyone. In NGINX e.g. this is handled via the Real IP module (https://nginx.org/en/docs/http/ngx_http_realip_module.html). There you can define from which IP addresses to accept the real IP.

This is exactly what I'm trying to achieve and I already enabled real-ip in the configuration. The problem is that nginx does not show the real IP for requests going to my services with type ExternalName.

Gacko commented 2 months ago

As far as I know, Ingress NGINX does not support PROXY protocol for upstreams; maybe not even NGINX itself does. So this approach cannot be implemented anyway.

> I already enabled real-ip in the configuration.

This is only client-facing. For the upstream-facing side, you need to make sure NGINX is handing the forwarded headers to the upstream. Have you enabled use-forwarded-headers?

https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#use-forwarded-headers

EDIT: Sorry, the latter does not apply to your use case.

boarder981 commented 2 months ago

> As far as I know, Ingress NGINX does not support PROXY protocol for upstreams; maybe not even NGINX itself does. So this approach cannot be implemented anyway.

This is good to know. What's strange to me is that ingress-nginx on my public-facing cluster does get the real IP for incoming requests to "regular" services (the ones fronting pods on that same cluster). However, ingress-nginx does not log the real IP for requests going to the ExternalName services. I'm talking about the same log file, from ingress-nginx on that external cluster. For example:

Request from internet --> external cluster --> pod on external cluster (this works, nginx logs the real IP)

Request from internet --> external cluster (ExternalName service) --> pod on internal cluster (does not work, nginx logs either the K8s node IP or pod default gateway 192.168.1.1)

Please note that I'm not referring to any nginx logs on the internal cluster - this is only from the perspective of the external cluster.

> This is only client-facing. For the upstream-facing side, you need to make sure NGINX is handing the forwarded headers to the upstream. Have you enabled use-forwarded-headers?

Yes, from my original post I added this to the ingress-nginx config on the external and internal clusters:

use-forwarded-headers: "true"
enable-real-ip: "true"

Gacko commented 2 months ago

Oh, wait, NOW I got you. I misunderstood you and thought you were talking about Ingress NGINX not handing the source IP information to your internal cluster.

So you mean logging inside the Ingress NGINX on your external cluster differs depending on the target upstream, right? That's strange.

I'm working on something different atm that needs to be done asap, but I'd like to have a deeper look here later.

boarder981 commented 2 months ago

Exactly! I don't even care if the real IP makes it to the backend. I only want to prevent non-whitelisted IPs from making requests to certain ingresses on the external cluster.

As a comparison, here is an example log entry for a request going to one of the pods running on that same external cluster. You can see that it logs the real 20.X.X.X origin IP address.

20.X.X.X - - [07/Aug/2024:16:01:53 +0000] "GET /status HTTP/2.0" 200 20 "http://X.X.X.X/status" "health-agent" 217 0.005 [my-app-svc] [] 192.168.2.22:8443 20 0.005 200 034df34486c5fa63fd82a4d003043b26

but here is a log entry for a request going to one of the ExternalName services (which then gets forwarded to the internal cluster - the 10.219.X.X address)

192.168.1.1 - - [09/Aug/2024:17:51:27 +0000] "GET /my/app/path HTTP/2.0" 202 172 "-" "fakeagent" 47 1.573 [my-backend-app-proxy-svc] [] 10.219.X.X:443 172 1.573 202 4bb02a257cc3b62cae21c5fec93e95b3

Thanks for your help. This problem is really baffling me.

github-actions[bot] commented 1 month ago

This is stale, but we won't close it automatically, just bear in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any question or request to prioritize this, please reach out on #ingress-nginx-dev on Kubernetes Slack.