kubernetes / ingress-nginx

Ingress NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

Upstream Prematurely Closed Connection While Reading Response Header From Upstream #11244

Closed · ChristianBieri1995 closed this issue 5 months ago

ChristianBieri1995 commented 6 months ago

What happened:

To route traffic into my Kubernetes cluster, I use an F5 load balancer, nginx ingress controllers (scaled by a horizontal pod autoscaler) and k8s ingress objects. With this setup, I receive 502 HTTP status codes on a regular basis. Although the percentage is low (0.003%), these errors cannot be neglected, as millions of requests are handled every day.

After spending hours trying to track down the issue, I stumbled upon the following error: 2024/04/10 09:13:00 [error] 3980#3980: *29815576 upstream prematurely closed connection while reading response header from upstream

With that information at hand, I skimmed countless webpages to identify the root cause. After checking the most obvious culprits, I could not make any progress. As a next step, I activated error-log-level: debug to capture the corresponding logs (see the relevant parts further below):
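For completeness, error-log-level is a key in the controller ConfigMap; a minimal sketch of how it was presumably enabled (ConfigMap name and namespace taken from the deployment described further below):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-tst-controller   # referenced via --configmap in the controller args
  namespace: infra-ingress-tst
data:
  error-log-level: "debug"   # raises nginx error_log verbosity; revert once done debugging
```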

2024-03-26T16:21:04.076718973+01:00 stderr F 2024/03/26 15:21:04 [debug] 136#136: *943568 recv: fd:46 0 of 1048576
2024-03-26T16:21:04.076726784+01:00 stderr F 2024/03/26 15:21:04 [error] 136#136: *943568 upstream prematurely closed connection while reading response header from upstream (here, I left out some irrelevant parts)
2024-03-26T16:21:04.076750704+01:00 stderr F 2024/03/26 15:21:04 [debug] 136#136: *943568 http next upstream, 2
2024-03-26T16:21:04.076756887+01:00 stderr F 2024/03/26 15:21:04 [debug] 136#136: *943568 free keepalive peer
2024-03-26T16:21:04.076762948+01:00 stderr F 2024/03/26 15:21:04 [debug] 136#136: *943568 lua balancer free peer, tries: 2
2024-03-26T16:21:04.076768572+01:00 stderr F 2024/03/26 15:21:04 [debug] 136#136: *943568 finalize http upstream request: 502
2024-03-26T16:21:04.076774231+01:00 stderr F 2024/03/26 15:21:04 [debug] 136#136: *943568 finalize http proxy request
2024-03-26T16:21:04.07677987+01:00 stderr F 2024/03/26 15:21:04 [debug] 136#136: *943568 close http upstream connection: 46

Unfortunately, this did not help either, but it indicated that there may be an issue with reused network connections (HTTP/1.1). Therefore, I added nginx.ingress.kubernetes.io/proxy-http-version: "1.0" to the relevant k8s ingress object, and behold: no more 502 HTTP status codes. I was able to reproduce this behaviour not only in my test environment but also in more relevant stages.
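For illustration, the workaround on the ingress object looks roughly like this (a minimal sketch: the name, host and backend service are placeholders; the annotation itself is the documented ingress-nginx proxy-http-version annotation):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-backend              # placeholder
  annotations:
    # Talk HTTP/1.0 to the upstream: no keep-alive, so nginx never reuses a
    # connection that the backend may already have closed.
    nginx.ingress.kubernetes.io/proxy-http-version: "1.0"
spec:
  ingressClassName: nginx
  rules:
    - host: example.internal         # placeholder
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-backend   # placeholder
                port:
                  number: 8080
```

The obvious trade-off is that every request then pays for a fresh upstream connection, which is why this is a workaround rather than a fix.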

In my view, there seems to be an issue with the reuse of established connections under HTTP/1.1, probably related to my nginx.conf.

NGINX Ingress version: nginx version: nginx/1.21.6

Kubernetes version:
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.5+rke2r1

Environment:

NAME                                                    READY   STATUS    RESTARTS      AGE
pod/ingress-nginx-tst-controller-5bdbb97d9-dv8kb        1/1     Running   1 (16d ago)   21d
pod/ingress-nginx-tst-defaultbackend-76d48c597b-265xt   1/1     Running   0             21d

NAME                                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/ingress-nginx-tst-controller           NodePort    x.x.x.x          <none>        80:32080/TCP,443:32443/TCP   176d
service/ingress-nginx-tst-controller-metrics   ClusterIP   x.x.x.x          <none>        10254/TCP                    176d
service/ingress-nginx-tst-defaultbackend       ClusterIP   x.x.x.x          <none>        80/TCP                       176d

NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ingress-nginx-tst-controller       1/1     1            1           176d
deployment.apps/ingress-nginx-tst-defaultbackend   1/1     1            1           176d

NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/ingress-nginx-tst-controller-5bdbb97d9        1         1         1       58d
replicaset.apps/ingress-nginx-tst-defaultbackend-76d48c597b   1         1         1       58d

NAME                                                               REFERENCE                                 TARGETS            MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/ingress-nginx-tst-controller   Deployment/ingress-nginx-tst-controller   62%/90%, 10%/50%   1         1         1          176d

--> Please note that the replicas/HPA values were set to 1 for easier identification of the issues described. The issue also arises with multiple replicas, however.
Name:             ingress-nginx-tst-controller-5bdbb97d9-dv8kb
Namespace:        infra-ingress-tst
Priority:         0
Service Account:  default
Node:             x.x.x.x
Start Time:       Wed, 03 Apr 2024 14:23:15 +0200
Labels:           app.kubernetes.io/component=controller
                  app.kubernetes.io/instance=ingress
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=ingress-nginx
                  app.kubernetes.io/part-of=ingress-nginx
                  app.kubernetes.io/version=1.9.6
                  helm.sh/chart=ingress-nginx-4.9.1
                  pod-template-hash=5bdbb97d9
Annotations:      container.apparmor.security.beta.kubernetes.io/controller: localhost/k8s-ingress-controller
                  kubernetes.io/limit-ranger:
                    LimitRanger plugin set: cpu, memory request for init container opentelemetry; cpu, memory limit for init container opentelemetry
Status:           Running
IP:               x.x.x.x
IPs:
  IP:           x.x.x.x
Controlled By:  ReplicaSet/ingress-nginx-tst-controller-5bdbb97d9
Init Containers:
  opentelemetry:
    Container ID:    containerd://ab2e25842abdd00da17f3404fcea0842961be8301430fe70a7f7e9ff4ee9e2e4
    Image:           x/ingress-nginx/opentelemetry:v20230721-3e2062ee5@sha256:13bee3f5223883d3ca62fee7309ad02d22ec00ff0d7033e3e9aca7a9f60fd472
    Image ID:        x/ingress-nginx/opentelemetry@sha256:13bee3f5223883d3ca62fee7309ad02d22ec00ff0d7033e3e9aca7a9f60fd472
    Port:            <none>
    Host Port:       <none>
    SeccompProfile:  RuntimeDefault
    Command:
      /init_module
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 03 Apr 2024 14:23:15 +0200
      Finished:     Wed, 03 Apr 2024 14:23:15 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     250m
      memory:  100Mi
    Requests:
      cpu:        100m
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /modules_mount from modules (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-54wkm (ro)
Containers:
  controller:
    Container ID:    containerd://1461ce776d455eab59334df36faf03b0c19080e6efe727a4c940271429706bba
    Image:           x/ingress-nginx/controller:v1.9.6@sha256:1405cc613bd95b2c6edd8b2a152510ae91c7e62aea4698500d23b2145960ab9c
    Image ID:        x/ingress-nginx/controller@sha256:1405cc613bd95b2c6edd8b2a152510ae91c7e62aea4698500d23b2145960ab9c
    Ports:           80/TCP, 443/TCP, 10254/TCP
    Host Ports:      0/TCP, 0/TCP, 0/TCP
    SeccompProfile:  RuntimeDefault
    Args:
      /nginx-ingress-controller
      --default-backend-service=$(POD_NAMESPACE)/ingress-nginx-tst-defaultbackend
      --publish-service=$(POD_NAMESPACE)/ingress-nginx-tst-controller
      --election-id=ingress-nginx-tst-leader
      --controller-class=k8s.io/ingress-nginx
      --ingress-class=nginx
      --configmap=$(POD_NAMESPACE)/ingress-nginx-tst-controller
      --watch-ingress-without-class=true
      --default-ssl-certificate=infra-ingress-tst/tls-certificates-tst-wildcard
      --enable-ssl-passthrough
    State:          Running
      Started:      Tue, 09 Apr 2024 07:18:25 +0200
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 03 Apr 2024 14:23:16 +0200
      Finished:     Tue, 09 Apr 2024 07:18:25 +0200
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     800m
      memory:  800Mi
    Requests:
      cpu:      100m
      memory:   300Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       ingress-nginx-tst-controller-5bdbb97d9-dv8kb (v1:metadata.name)
      POD_NAMESPACE:  infra-ingress-tst (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /modules_mount from modules (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-54wkm (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  modules:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-54wkm:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>
- **Current state of ingress object, if applicable**:
  - `kubectl -n <appnamespace> describe ing <ingressname>`:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
  creationTimestamp: "2024-04-03T09:42:21Z"
  generation: 1
  labels:
    app: bs-simplesample-az-srv
    app.kubernetes.io/instance: bs-simplesample-az-srv
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: bs-simplesample-az-srv
    chart: generic-openliberty-2.3.4
    helm.sh/chart: generic-openliberty-2.3.4
    heritage: Helm
    release: bs-simplesample-az-srv
  name: bs-simplesample-az-srv-tst-business-services
  namespace: kern-sample-bsfw-tst
  resourceVersion: "181093092"
  uid: 0c38ad0b-d4a4-4ecc-9db5-183b63143b4e
spec:
  rules:

reset_timedout_connection on;
keepalive_timeout  75s;
keepalive_requests 1000;

Might there be an issue with timeouts related to established connections?
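As far as I understand it, the keepalive_timeout shown above is most likely the client-facing setting in the rendered nginx.conf; the connections the controller keeps open towards the backends are governed by separate ConfigMap keys. A sketch of the relevant options (the key names are the documented ingress-nginx ConfigMap options; the values here are only illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-tst-controller     # the ConfigMap referenced via --configmap above
  namespace: infra-ingress-tst
data:
  keep-alive: "75"                       # client-side keep-alive timeout (seconds)
  keep-alive-requests: "1000"            # client-side requests per connection
  upstream-keepalive-connections: "320"  # size of the keep-alive pool towards the backends
  upstream-keepalive-timeout: "60"       # idle timeout of pooled upstream connections
  upstream-keepalive-requests: "10000"   # requests served per pooled upstream connection
```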

How to reproduce this issue: simply by accessing the backend via the F5 and the nginx ingress controller. To do so, I built a load tester.
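The load tester itself is not part of this report; a rough in-cluster equivalent (purely a sketch: image tag, target URL and request count are placeholders) could be a Job like this:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: ingress-load-test              # placeholder
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: curl-loop
          image: curlimages/curl:8.8.0   # any image with curl and a shell works
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Generate sustained traffic through the ingress and count non-200 responses.
              errors=0
              for i in $(seq 1 10000); do
                code=$(curl -sk -o /dev/null -w '%{http_code}' https://example.internal/)
                [ "$code" = "200" ] || errors=$((errors + 1))
              done
              echo "non-200 responses: $errors"
```

Even with short-lived client connections, the controller still pools and reuses its upstream connections, which is the path where the sporadic 502s show up.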

k8s-ci-robot commented 6 months ago

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and providing further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
longwuyuan commented 6 months ago

/remove-kind bug

/triage needs-information
/kind support

ChristianBieri1995 commented 6 months ago

Thank you for the prompt reply.

  • Using an F5 LB and sending traffic manually to the ingress-controller over a NodePort is not tested in CI
  • The most common use-case where debugging L7 nuances becomes viable is when a Service of type LoadBalancer is created by the installation of the ingress-nginx controller, which in turn triggers a LB to be provisioned by the infra-provider, such as a cloud-provider or an alternative like MetalLB

Sorry for the confusion, I only added this information for people to better understand my cluster architecture. The issue arises in the nginx ingress controller.

  • Yes, the timeouts configured on the controller have an impact if the use-case matches, but troubleshooting such a use-case is not the same as referring to a CI test
  • Edit the issue description and provide the information asked for in a new issue template. That way a reader at least gets the related info to analyse

You mention "provide information as asked", but I can't find which information was asked for. Also, where should I open a new issue, and with which template?

Thank you in advance for your reply.

longwuyuan commented 6 months ago
ChristianBieri1995 commented 6 months ago

Thank you for the clarification, I will update it ASAP!

ChristianBieri1995 commented 6 months ago
  • Do not create a new bug or issue. Just read the template

    • Click open new bug report
    • Look at the default template of the new bug report
    • Then edit your issue description and answer the questions asked in the template in this issue's description

Done, I have provided as much information as possible related to the issue described above.

  • The information about the F5 & controller-service-type-NodePort is important and was required, so thank you. But the implication is that if you install this project's ingress-nginx controller as per the documentation and thereby create a Service of type LoadBalancer, then for a bare-metal environment it is better to have MetalLB deployed, as that is widely used and you can expect other MetalLB users to comment. If you create a Service of type NodePort, then the connection is terminated on the F5 and not on the controller. That is a very, very different use-case and is not tested, so you are in control of how the L3/4 and L7 traffic behaves.

If I understand you correctly, you recommend switching to a Service of type LoadBalancer, leveraging MetalLB? However, using a Service of type NodePort seems to be supported as well, according to https://docs.nginx.com/nginx-ingress-controller/installation/installing-nic/installation-with-manifests/#option-1-create-a-nodeport-service. Is the problem that we use the F5 in combination with the service type NodePort?
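For context, with the kubernetes/ingress-nginx Helm chart the exposure model boils down to a single value; a sketch (the NodePort numbers are the ones from the service listing above):

```yaml
# values.yaml excerpt for the ingress-nginx Helm chart (sketch)
controller:
  service:
    type: NodePort        # current setup: the F5 forwards to these node ports
    nodePorts:
      http: 32080
      https: 32443
    # type: LoadBalancer  # alternative: MetalLB or a cloud provider provisions the
    #                     # external IP for the controller Service instead of the F5
```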

  • You can expect more involvement if you terminate the connection from external clients on the ingress-controller. So now you have to wait for experts who know about terminating connections on an F5 and then establishing new connections from the F5 to the NodePort, etc.

I am not sure whether termination on the ingress controllers is possible; I have to check this internally with my team. Nonetheless, does it even matter what happens "before" the ingress controller, as the issue seems to arise when connections between the ingress controllers and the backends are established/reused?

  • If the controller dropped/denied/interrupted connections and was the root cause of the 502 responses, then you have to provide detailed step-by-step instructions on how someone can reproduce the problem on a minikube cluster or a kind cluster. Otherwise it becomes difficult to know which part of the ingress-nginx controller code is causing the problem

Unfortunately, I am not able to provide any further information beyond the one above. However, the problem arises on a regular basis (0.003% of all requests), independent of the backend service. Therefore, with the setup described above, it should be reproducible anyway.

longwuyuan commented 6 months ago

My point was that you terminate on F5 and then establish a new connection from F5 to ingress-nginx. This is not tested in the project and the relevance here is that there is no official stand on how your F5 works for Layer3/4 and Layer7. Nobody in the project can or will test this for you or try to reproduce it for you.

Other users of F5 in front of NodePort service of ingress-nginx-controller may know better and comment here.

ChristianBieri1995 commented 6 months ago

Got it, let's see if there is someone who can relate to the issue stated.

ChristianBieri1995 commented 6 months ago

My point was that you terminate on F5 and then establish a new connection from F5 to ingress-nginx. This is not tested in the project and the relevance here is that there is no official stand on how your F5 works for Layer3/4 and Layer7. Nobody in the project can or will test this for you or try to reproduce it for you.

Other users of F5 in front of NodePort service of ingress-nginx-controller may know better and comment here.

Just to add on that:

longwuyuan commented 5 months ago

Sorry if my comments are not clear. The way I understand this is:

ChristianBieri1995 commented 5 months ago

Sorry if my comments are not clear. The way I understand this is:

  • You use an F5 at the edge and a Service of type NodePort for the controller. But the project CI does not test this kind of use-case, so how HTTP/HTTPS requests behave in this use-case is unknown. It's possible to make guesses, but comments based on guesses are not appropriate since you have a problem that needs to be solved.

Sounds legit and is totally understandable

  • It's great to see the multiple ways the controller is being used, but there is no data to look up in the CI logs related to this use-case, and it's not possible to test or reproduce it in the project. Maybe other users of F5+NodePort can comment and thus add value

Let's see if there is anyone in the community using an F5, a NodePort Service and the Nginx ingress controller.

  • It's not possible to discuss the connection to the controller at a port like 30000 coming from an F5, because how the F5 handles Layer 3/4/7 traffic and terminates HTTP/HTTPS on a host's NodePort like 30000 is not tested in CI and may or may not relate to errors/problems occurring beyond that NodePort 30000 of the host's TCP/IP stack.
  • A LoadBalancer implementation like MetalLB provisions an external IP for the Service of type LoadBalancer created by the ingress-nginx controller installation. Without diving deeper, this is markedly different networking than your use-case, albeit with similarities in L3/4 routing. The point being, it's possible to comment on the MetalLB use-case, as it is known how HTTP/HTTPS requests work there, but it's hard to comment on this F5 use-case as I don't know how it works in detail for HTTP/HTTPS

Got it, thank you for clarification!

  • The basis of the investigation here is the error code being 502, so all the above factors influence who sent the 502. The controller logs may contain some 502 indications, and based on those a live check has to be done on the upstream that sent the 502. Once again, it's possible to guess that there was momentary resource starvation behind a 502, but only you can check that live, when the 502 occurs. The resources involved could be CPU, memory, inodes, conntrack, etc. Or the backend pod could simply have been too busy processing and delayed its response. The point being: if there is a fault in the controller, it should be reproducible on a minikube or kind cluster. I can't reproduce a 502 in a normal test and the CI does not show 502s in the e2e tests, hence I am not certain where we go from here

I already checked resource starvation and backend pod processing. I fully understand your notes about CI and reproduction, though. I guess the only thing we can do right now is wait for someone who knows more about F5 + NodePort + Nginx ingress controller.

longwuyuan commented 5 months ago

@ChristianBieri1995 there is no data here that points at a problem to be solved in the ingress-nginx controller. Additionally, I know that 502s are reported by others in other issues, and they come from timeouts, backend-pod design, load, etc. The controller is helpless if the upstream responds with a 502.

Also, I think you will get more eyes from F5 users if you post in the Kubernetes Slack, in channels like ingress-nginx-users or plain old kubernetes-users.

We have around 490 issues open, and not all of them track a problem to be solved in the ingress-nginx controller per se. So I will close this issue for now, but feel free to re-open it when data is posted here that indicates work to be done on the controller itself. That will help us manage issues better.

/close

k8s-ci-robot commented 5 months ago

@longwuyuan: Closing this issue.

In response to [this](https://github.com/kubernetes/ingress-nginx/issues/11244#issuecomment-2084717446):

> @ChristianBieri1995 there is no data here that points at a problem to be solved in the ingress-nginx controller. Additionally I know that the 502s are reported by others in other issues and its come from timeouts or backend-pod design or load etc. The controller is helpless if the upstream responds with a 502.
>
> Also, I think you will get more eyes of F5 users if you message in the Kubernetes Slack in channels like the ingress-nginx-users or plain old kubernetes-users.
>
> We have 490ish issues open and not all are tracking a problem to be solved in the ingress-nginx controller per se. So I will close the issue for now but feel-free to re-open the issue when there is data posted here that indicates work to be done on the controller itself. It will help manage issue better for us.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
ChristianBieri1995 commented 2 months ago

Dear @longwuyuan,

Finally, we were able to solve the issue (or so we think). It arose because of a timeout mismatch (when reusing connections with HTTP/1.1) between the Nginx ingress controller and our OpenLiberty backend.

Within the Nginx ingress controller, the following (relevant) parameters related to connection reuse are set: keepalive_timeout 75s; keepalive_requests 1000;

In the OpenLiberty backend, the "persistTimeout" configuration was set to 60s.

As a result, whenever the Nginx ingress controller reuses a connection that has been idle for longer than "persistTimeout", the backend is no longer aware of that connection (it was closed prematurely on the backend side). Even though you indicated multiple times that this is not an issue of the Nginx ingress controller, I am not totally sure about it. Should the Nginx ingress controller not be informed when connections are closed by the OpenLiberty backend, so that it stops reusing them? As far as I know, we are not the only victims of this (in my view faulty) behaviour.
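In case it helps others hitting the same mismatch: one way to avoid it is to keep the controller's idle timeout towards the upstreams below the backend's persistTimeout. With ingress-nginx this can be done in the controller ConfigMap; a sketch (ConfigMap name as referenced by --configmap above, the value is only an example):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-tst-controller
  namespace: infra-ingress-tst
data:
  # Close idle upstream connections before OpenLiberty's persistTimeout (60s) does,
  # so nginx never reuses a connection the backend has already dropped.
  upstream-keepalive-timeout: "50"
```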

I am really looking forward to your feedback.

zeppelinux commented 1 month ago

I'm getting a very similar error: [error] 45#45: *12827043 upstream prematurely closed connection while reading response header from upstream

My Nginx ingress controller communicates (sending POST requests) with a Wordpress/Printful WooCommerce Integration/3.0.0 backend.