kubernetes / ingress-nginx

Ingress-NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

keepalive value in custom upstream blocks #9479

Closed: owaisrehman closed this issue 1 week ago

owaisrehman commented 1 year ago

Hi,

I am working on a use case where I need to add custom upstream blocks. The keepalive value used in the upstream_balancer block is set to 320 by default, but I know it is configurable through the ConfigMap.
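
For reference, the rendered upstream_balancer block looks roughly like this (a sketch from my reading of the controller's template; the 320 corresponds to the upstream-keepalive-connections ConfigMap key, with upstream-keepalive-timeout and upstream-keepalive-requests alongside it):

upstream upstream_balancer {
    server 0.0.0.1; # placeholder; real endpoints are chosen in Lua
    balancer_by_lua_block {
        balancer.balance()
    }
    keepalive 320;
    keepalive_timeout  60s;
    keepalive_requests 10000;
}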

My question is: what are the implications of setting this number to 40 for more than one upstream block? I need to add 7-10 upstream blocks and wanted to ask about best practice when setting this value.

I've read this article where it is recommended to set keepalive to twice the number of servers mentioned in the upstream block (which in my case is 1, as I am adding a single server per upstream block).

But setting the keepalive value to 2 does not resolve the issue for me, because under heavy load NGINX fails to connect to the server mentioned in these blocks and returns HTTP 502 to the client.

I've also configured retries through max_fails, but some requests still fail. I've noticed that increasing the keepalive value reduces the number of failed requests. Just to mention: I want this NGINX ingress controller to connect to other servers (in other clusters) over HTTPS. Attached below is an example of what my upstream blocks look like.


upstream example.a.com {
    server example.a.com:443 max_fails=3 fail_timeout=2s;
    keepalive 10;
    keepalive_timeout 300s;
}
upstream example.b.com {
    server example.b.com:443 max_fails=3 fail_timeout=2s;
    keepalive 10;
    keepalive_timeout 300s;
}
upstream example.c.com {
    server example.c.com:443 max_fails=3 fail_timeout=2s;
    keepalive 10;
    keepalive_timeout 300s;
}

NGINX Ingress controller version: v1.1.3

Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.0", GitCommit:"a866cbe2e5bbaa01cfd5e969aa3e033f3282a8a2", GitTreeState:"clean", BuildDate:"2022-08-23T17:44:59Z", GoVersion:"go1.19", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.6", GitCommit:"6c23b67c202a4cfa7c76c3e1b370bd5f0e654f30", GitTreeState:"clean", BuildDate:"2022-11-09T17:13:23Z", GoVersion:"go1.18.6", Compiler:"gc", Platform:"linux/amd64"}
k8s-ci-robot commented 1 year ago

@owaisrehman: This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
longwuyuan commented 1 year ago

You can get more input if you add the kubectl describe output of the Ingress you have configured to achieve this, the ConfigMap in question, etc.

owaisrehman commented 1 year ago

Thanks for replying!

kubectl describe configmap

Name:         ingress-nginx-controller
Namespace:    ingress-nginx
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=ingress-nginx
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ingress-nginx
              app.kubernetes.io/version=0.48.1
              helm.sh/chart=ingress-nginx-3.34.0
Annotations:  <none>

Data
====
http-snippet:
----
upstream google.com {
  server google.com:443 max_fails=3 fail_timeout=2s;
  keepalive_timeout 300s;
  keepalive 10;
}
upstream github.com {
  server github.com:443 max_fails=3 fail_timeout=2s;
  keepalive_timeout 300s;
  keepalive 10;
}

log-format-escape-json:
----
true
log-format-upstream:
----
{"local_time": "$time_local","request_id": "$http_x_correlation_id", "request_time": $request_time,"Service-Addr": "$upstream_addr","upstream_status": "$upstream_status","request_length":"$request_length"}, "upstream_connect_time": "$upstream_connect_time",  "upstream_response_time": "$upstream_response_time", "realip": "$realip_remote_addr"

BinaryData
====

Events:
  Type    Reason  Age               From                      Message
  ----    ------  ----              ----                      -------
  Normal  UPDATE  5s (x3 over 16m)  nginx-ingress-controller  ConfigMap ingress-nginx/ingress-nginx-controller

kubectl describe ingress

Name:             backend-ingress
Labels:           <none>
Namespace:        <test-namespace>
Address:          <ip address>
Ingress Class:    <none>
Default backend:  <default>
Rules:
  Host                                              Path  Backends
  ----                                              ----  --------
  <dnsHostName>.com
                                                    /hello   echo:80 (10.244.0.13:8080)
Annotations:                                        kubernetes.io/ingress.class: nginx
                                                    nginx.ingress.kubernetes.io/configuration-snippet: error_log /etc/nginx/ngxconf.log debug;
                                                    nginx.ingress.kubernetes.io/proxy-next-upstream-tries: 1
                                                    nginx.ingress.kubernetes.io/proxy-read-timeout: 120
                                                    nginx.ingress.kubernetes.io/proxy-request-buffering: off
                                                    nginx.ingress.kubernetes.io/rewrite-target: /$1
                                                    nginx.ingress.kubernetes.io/server-snippet:
                                                      location ~* "^/test/" {

                                                         set $namespace      "test-namespace";
                                                         set $ingress_name   "test-ingress";
                                                         set $service_name   "test-svc";
                                                         set $service_port   "80";
                                                         set $location_path  "/test/";
                                                         set $global_rate_limit_exceeding n;

                                                         rewrite_by_lua_block {
                                                                 lua_ingress.rewrite({
                                                                         force_ssl_redirect = false,
                                                                         ssl_redirect = true,
                                                                         force_no_ssl_redirect = false,
                                                                         preserve_trailing_slash = false,
                                                                         use_port_in_redirects = false,
                                                                         global_throttle = { namespace = "", limit = 0, window_size = 0, key = { }, ignored_cidrs = { } },
                                                                 })
                                                                 balancer.rewrite()
                                                                 plugins.run()
                                                         }

                                                         # be careful with `access_by_lua_block` and `satisfy any` directives as satisfy any
                                                         # will always succeed when there's `access_by_lua_block` that does not have any lua code doing `ngx.exit(ngx.DECLINED)`
                                                         # other authentication method such as basic auth or external auth useless - all requests will be allowed.
                                                         #access_by_lua_block {
                                                         #}

                                                         header_filter_by_lua_block {
                                                                 lua_ingress.header()
                                                                 plugins.run()
                                                         }

                                                         body_filter_by_lua_block {
                                                                 plugins.run()
                                                         }

                                                         log_by_lua_block {
                                                                 balancer.log()

                                                                 monitor.call()

                                                                 plugins.run()
                                                         }

                                                         port_in_redirect off;

                                                         set $balancer_ewma_score -1;
                                                         set $proxy_upstream_name "test-namespace-test-svc-80";
                                                         set $proxy_host          $proxy_upstream_name;
                                                         set $pass_access_scheme  $scheme;

                                                         set $pass_server_port    $server_port;

                                                         set $best_http_host      $http_host;
                                                         set $pass_port           $pass_server_port;

                                                         set $proxy_alternative_upstream_name "";

                                                         auth_request        /_internal-auth;
                                                         auth_request_set    $auth_cookie $upstream_http_set_cookie;
                                                         add_header          Set-Cookie $auth_cookie;
                                                         # Following annotation is used to get response headers from auth request and set them in variables
                                                         auth_request_set $serviceurl $upstream_http_service_url;

                                                         client_max_body_size                    1m;

                                                         proxy_set_header Host                   $best_http_host;

                                                         # Pass the extracted client certificate to the backend

                                                         # Allow websocket connections
                                                         proxy_set_header                        Upgrade           $http_upgrade;

                                                         proxy_set_header                        Connection        $connection_upgrade;

                                                         proxy_set_header X-Request-ID           $req_id;
                                                         proxy_set_header X-Real-IP              $remote_addr;

                                                         proxy_set_header X-Forwarded-For        $remote_addr;

                                                         proxy_set_header X-Forwarded-Host       $best_http_host;
                                                         proxy_set_header X-Forwarded-Port       $pass_port;
                                                         proxy_set_header X-Forwarded-Proto      $pass_access_scheme;
                                                         proxy_set_header X-Forwarded-Scheme     $pass_access_scheme;

                                                         proxy_set_header X-Scheme               $pass_access_scheme;

                                                         # Pass the original X-Forwarded-For
                                                         proxy_set_header X-Original-Forwarded-For $http_x_forwarded_for;

                                                         # mitigate HTTPoxy Vulnerability
                                                         # https://www.nginx.com/blog/mitigating-the-httpoxy-vulnerability-with-nginx/
                                                         proxy_set_header Proxy                  "";

                                                         # Custom headers to proxied server

                                                         proxy_connect_timeout                   5s;
                                                         proxy_send_timeout                      60s;
                                                         proxy_read_timeout                      120s;

                                                         proxy_buffering                         off;
                                                         proxy_buffer_size                       4k;
                                                         proxy_buffers                           4 4k;

                                                         proxy_max_temp_file_size                1024m;

                                                         proxy_request_buffering                 off;
                                                         proxy_http_version                      1.1;

                                                         proxy_cookie_domain                     off;
                                                         proxy_cookie_path                       off;

                                                         # In case of errors try the next upstream server before returning an error
                                                         proxy_next_upstream                     error timeout;
                                                         proxy_next_upstream_timeout             0;
                                                         proxy_next_upstream_tries               1;

                                                         error_log /etc/nginx/ngxconf.log debug;

                                                         rewrite "(?i)/test/" /$1 break;
                                                         proxy_pass $serviceurl;

                                                         proxy_redirect                          off;

                                                       }
Events:
  Type    Reason  Age                From                      Message
  ----    ------  ----               ----                      -------
  Normal  Sync    36s (x7 over 41h)  nginx-ingress-controller  Scheduled for sync
longwuyuan commented 1 year ago

Thanks. You are limiting the info (intentionally or not), which narrows down the number of folks who can look at and work on this.

I was hinting at the whole picture, like:

You are seeking best-practice info, but such a complicated config and use case is not widely discussed or documented. So it will take a more helpful issue: a clear description with the small details, the current problem with the known config, and a description of the intended goal.

On the other hand, there is so much config here and no request/response data, so it will not be easy to commit to saying what values for keepalive, timeouts, etc. will fit well.

Personally, I am not even clear yet on the use case for so much customization. I or others could hopefully make some comments if there were more detail on what exactly the default controller does not provide for your use case.

owaisrehman commented 1 year ago

Sorry for limiting information unintentionally.

My use case is to send an auth request to an internal service and receive a response header service-url (as shown in the Ingress above) from the auth service. Based on the value of this header, I want to route traffic either to services in the same cluster (over HTTP) or to another cluster (over HTTPS). When sending requests to other clusters, I initially didn't add any upstream blocks and experienced a lot of connection loss with the following error: [error] 39#39: *32529186 peer closed connection in SSL handshake while SSL handshaking to upstream

After searching online, I came to understand that under load the SSL handshake takes a lot of time, and it's good practice to keep the underlying connection alive for some time.

This was the main reason for adding the upstream block.

So I added upstream blocks as mentioned in previous comments.
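
One thing I've since read (unverified, so worth flagging): the keepalive directive in an upstream block reportedly only takes effect when the proxied location also sets roughly the following. My server-snippet above sets Connection $connection_upgrade, which resolves to close for plain (non-WebSocket) requests, so it would defeat connection reuse:

proxy_http_version 1.1;
proxy_set_header Connection "";   # "" rather than $connection_upgrade so the connection can be reused
proxy_ssl_server_name on;         # send SNI so the remote server presents the right certificate
proxy_ssl_session_reuse on;       # on by default; session resumption cheapens new TLS handshakes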

But I still get the same error, although the number of errors is now relatively low.

One thing I noticed is that increasing the keepalive value reduces the number of errors I get. What I want to know is: what should the value of keepalive be in my case?

P.S. The reason for this much customization is to achieve the following functionality (a rough sketch for point 2 follows below):

  1. Dynamically route traffic to internal services or to a service in another cluster
  2. Return a custom response (response body and response codes other than what the auth-request module offers) from the auth service when the auth request fails
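
For point 2, a hypothetical sketch of what I mean (the @auth_failed name and the JSON body are made up; auth_request maps a failed subrequest to 401 or 403, which error_page can intercept):

location ~* "^/test/" {
    auth_request      /_internal-auth;
    auth_request_set  $serviceurl $upstream_http_service_url;
    error_page 401 403 = @auth_failed;   # send auth failures to the named location below
    proxy_pass $serviceurl;              # point 1: route by the header returned from the auth service
}
location @auth_failed {
    default_type application/json;
    return 403 '{"error": "authorization failed"}';
}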
longwuyuan commented 1 year ago

Thanks for the details. It helps to know the context.

But now I have to ask a dumb question (apologies). It's important to know explicitly whether you are fully aware of the annotations and other fields of the Ingress object that impact your goal; some are even directly related to what you are trying to achieve.

owaisrehman commented 1 year ago

No worries!

Yes, I am aware of the annotations that can be used on the Ingress object. But as I discussed earlier, there are some cases that are not covered by these annotations (brief description given below).

Correct me if I am wrong, but the above-mentioned changes cannot be achieved through annotations alone. That is why I added a server-snippet annotation to the Ingress with custom location blocks that I can control easily.

Anyway, if you know a way to achieve this through annotations, please let me know the details.

longwuyuan commented 1 year ago

Thanks. More clear now.

And so, apologies: I don't know of a better way, or of best practices. There is also a keepalive timeout visible as nginx.ingress.kubernetes.io/auth-keepalive-timeout: <Timeout>, but again I have not used it, and I am not convinced it is an option for your use case. But it's worth looking at and playing with.
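
There is a small family of these auth keepalive annotations in the docs; an untested example (the URL and values here are made up):

nginx.ingress.kubernetes.io/auth-url: http://auth-service.test-namespace.svc.cluster.local/verify
nginx.ingress.kubernetes.io/auth-keepalive: "10"
nginx.ingress.kubernetes.io/auth-keepalive-timeout: "60"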

owaisrehman commented 1 year ago

Thanks for this.

But the main question remains:

What are the consequences of making the keepalive value large? What other mitigations can I apply to counter this error? [error] 39#39: *32529186 peer closed connection in SSL handshake while SSL handshaking to upstream
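
From what I've read so far (unverified), the trade-off seems to be that keepalive is counted per worker process, so a large value mostly parks more idle sockets on both ends, and keepalive_timeout should stay below the remote server's own idle timeout so the peer doesn't close a connection just as it is being reused. Something like:

keepalive 32;              # idle connections cached per worker; larger values hold more idle sockets open
keepalive_requests 1000;   # recycle a connection after this many requests
keepalive_timeout 60s;     # keep this below the remote server's idle timeout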

longwuyuan commented 1 week ago

Hi. Reading this after so many months while trying to clean up on the project side.

The current thought is that, first and foremost, this statement is not supported:

Just to mention I want this nginx ingress controller to connect to other servers (in other clusters) over HTTPS.

The upstream mentioned in the backend spec of an Ingress is a service in the same namespace as the Ingress object. It cannot even be a service in another namespace on the same cluster. That was just to state the obvious.

The next aspect, after stating the obvious above, is that you are using snippets to set upstream names. The relevance here is that snippets are being deprecated soon, so this issue is no longer a topic of discussion. This issue has to be closed, as there is no more context.

Lastly, the far-fetched use case some users have introduced is Services of type ExternalName as backends in the Ingress resource. So now we are clear that regardless of whether a far-away upstream is configured via snippets or via ExternalName, there is no easy way to deal with that connection: snippets are out now, and ExternalName is a DNS resolution, so a brand-new connection is established to that far-away upstream outside of the K8s cluster. Thus there are no known best practices for timeout configs; it all comes down to trials and testing in each use case's environment.

Closing this issue for now, as it adds to the tally of open issues without tracking any action item for the project. The lack of resources has made the project focus on work like the Gateway API implementation, security, and deprecating non-Ingress-API features.

/close

k8s-ci-robot commented 1 week ago

@longwuyuan: Closing this issue.

In response to [this](https://github.com/kubernetes/ingress-nginx/issues/9479#issuecomment-2338052759) (the closing comment above). Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.