kubernetes / ingress-nginx

Ingress-NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

BUG: --enable-dynamic-configuration=true doesn't work properly #2225

Closed: dcherniv closed this issue 6 years ago

dcherniv commented 6 years ago

NGINX Ingress controller version: 0.12.0

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.5", GitCommit:"f01a2bf98249a4db383560443a59bed0c13575df", GitTreeState:"clean", BuildDate:"2018-03-19T15:59:24Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9+", GitVersion:"v1.9.2-gke.1", GitCommit:"4ce7af72d8d343ea2f7680348852db641ff573af", GitTreeState:"clean", BuildDate:"2018-01-31T22:30:55Z", GoVersion:"go1.9.2b4", Compiler:"gc", Platform:"linux/amd64"}

Environment: GCP

What happened: nginx's Lua balancer.lua crashes

What you expected to happen: the nginx controller to work

How to reproduce it (as minimally and precisely as possible): update the image from 0.10.2 to 0.12.0 and run nginx with the following options:

      containers:
      - args:
        - /nginx-ingress-controller
        - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
        - --configmap=$(POD_NAMESPACE)/nginx-configuration
        - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
        - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
        - --publish-service=$(POD_NAMESPACE)/ingress-nginx
        - --enable-dynamic-configuration=true

Anything else we need to know:

REDACTED - [REDACTED] - - [21/Mar/2018:02:11:20 +0000] "GET /REDACTED/REDACTED/healthz HTTP/1.1" 500 193 "-" "GoogleHC/1.0" 142 0.000 [REDACTED-service-80]  0 - -
2018/03/21 02:11:20 [error] 326#326: *92 failed to run balancer_by_lua*: /etc/nginx/lua/balancer.lua:31: attempt to index local 'backend' (a nil value)
stack traceback:
        /etc/nginx/lua/balancer.lua:31: in function 'balance'
        /etc/nginx/lua/balancer.lua:97: in function 'call'
        balancer_by_lua:2: in function <balancer_by_lua:1> while connecting to upstream, client: REDACTED, server: REDACTED, request: "GET /REDACTED-service/healthz HTTP/1.1", host: "REDACTED"
aledbf commented 6 years ago

@ElvinEfendi ping

aledbf commented 6 years ago

@dcherniv this happens when the ingress controller starts?

aledbf commented 6 years ago

@dcherniv can you post the pod logs?

ElvinEfendi commented 6 years ago

It looks like the backends configuration doesn't get POSTed to the Lua endpoint successfully. Without any further logs, my best guess is that this is related to https://github.com/kubernetes/ingress-nginx/pull/2210 and fixed in that PR. tl;dr: your pod probably has IPv6 enabled, so when the controller sends the POST request it gets (incorrectly) blocked by Nginx.

@dcherniv do you see "Unexpected error code: 403" in the logs?

@aledbf do you have a temporary image that includes https://github.com/kubernetes/ingress-nginx/pull/2210 so that @dcherniv can try it? Otherwise I can provide one under my Docker account.
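
For context on that 403: the /configuration endpoint is restricted to loopback, and with IPv6 enabled the client address is ::1 rather than 127.0.0.1. A sketch of the effect in Lua, purely for illustration (ingress-nginx actually enforces this with nginx allow/deny rules, visible in the generated config later in this thread):

    -- illustrative only: the loopback restriction on the /configuration
    -- location expressed as Lua; ingress-nginx really uses nginx's
    -- "allow 127.0.0.1; allow ::1; deny all;" directives
    local addr = ngx.var.remote_addr
    if addr ~= "127.0.0.1" and addr ~= "::1" then
      return ngx.exit(ngx.HTTP_FORBIDDEN)
    end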

ElvinEfendi commented 6 years ago

@dcherniv this happens when the ingress controller starts?

@aledbf this happens when Nginx/Lua is processing a request. We can definitely catch that case and log something more meaningful instead of letting it fail with an "attempt to index a nil value" error.
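
A minimal sketch of such a guard (the backends table, backend_name variable, and get_backend helper are illustrative names, not the actual balancer.lua internals):

    -- a sketch, not the actual balancer.lua: guard against missing backend
    -- configuration instead of crashing with
    -- "attempt to index local 'backend' (a nil value)"
    local function get_backend(backends, backend_name)
      local backend = backends[backend_name]
      if not backend then
        ngx.log(ngx.ERR, "no backend configuration found for: ", tostring(backend_name))
        return nil
      end
      return backend
    end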

aledbf commented 6 years ago

@aledbf do you have a temporary image that includes #2210 so that @dcherniv can try it? Otherwise I can provide one under my Docker account.

Current master: quay.io/aledbf/nginx-ingress-controller:0.346

aledbf commented 6 years ago

@aledbf this happens when Nginx/Lua is processing a request. We can definitely catch that case and log something more meaningful instead of letting it fail with an "attempt to index a nil value" error.

I understand the error. I am trying to find out if this happens before the first sync of the configuration. The "correct" way to ensure this does not happen is to add another call to nginx here to make sure the kubernetes probes only pass after the initial sync.
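
One way to sketch that idea: expose an extra endpoint whose Lua handler only returns 200 once the first backends payload has been stored, and have the probes hit it (the "backends" key and the endpoint wiring are assumptions, not the current implementation):

    -- hypothetical readiness handler, meant to run in a content_by_lua_block;
    -- assumes the first sync stores the payload under a "backends" key in the
    -- configuration_data shared dict declared in nginx.conf
    local function is_initialized()
      if ngx.shared.configuration_data:get("backends") then
        return ngx.exit(ngx.HTTP_OK)
      end
      return ngx.exit(ngx.HTTP_SERVICE_UNAVAILABLE)
    end

The Kubernetes readiness probe would then keep failing until the initial sync completes.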

mikksoone commented 6 years ago

I have the same issue with both 0.12 and the latest master (0.346). Pods are IPv4 and the issue does not go away after waiting for some time (referring to the "first" sync).

ElvinEfendi commented 6 years ago

@aledbf I like that idea! I'll try to address it sometime this week. In the meantime it would be great to get more logs here.

@mikksoone @dcherniv if you see the same issue even with the latest master, please provide some more logs. Enable --v=2, then:

1. Capture the logs from startup until one of the messages "dynamic reconfiguration succeeded", "skipping reload", or "falling back to reload, could not dynamically reconfigure" appears. Then send a request to your application to see if it works.
2. While ingress-nginx is running, increase the number of replicas for your application and capture the corresponding ingress-nginx logs; you should see something with "posting backends configuration" in them. Again provide the logs from there until one of the messages above appears, then send a request to your application to see if it works.

dcherniv commented 6 years ago

@aledbf @mikksoone Apologies for the delay. Here's the startup log right after the nginx controller restart. Of particular interest, I think, is the 403 on the POST to /configuration/backends. [screenshot of startup logs]

ElvinEfendi commented 6 years ago

@dcherniv have you tested with quay.io/aledbf/nginx-ingress-controller:0.346 (current master)? The logs you posted above confirm what I said at https://github.com/kubernetes/ingress-nginx/issues/2225#issuecomment-374821266, i.e. it should already be fixed in the master branch.

mikksoone commented 6 years ago

Logs from the previous run; will try --v=2 later today.

-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:    0.12.0
  Build:      git-0398c410
  Repository: https://github.com/aledbf/ingress-nginx
-------------------------------------------------------------------------------

W0321 12:31:13.510947       7 client_config.go:529] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0321 12:31:13.511138       7 main.go:181] Creating API client for https://*:443
I0321 12:31:13.520161       7 main.go:225] Running in Kubernetes Cluster version v1.7 (v1.7.10) - git (clean) commit bebdeb749f1fa3da9e1312c4b08e439c404b3136 - platform linux/amd64
I0321 12:31:13.522706       7 main.go:84] validated nginx/nginx-default-backend as the default backend
I0321 12:31:13.524951       7 main.go:105] service nginx/ingress-nginx validated as source of Ingress status
I0321 12:31:13.856143       7 stat_collector.go:77] starting new nginx stats collector for Ingress controller running in namespace  (class nginx)
I0321 12:31:13.856167       7 stat_collector.go:78] collector extracting information from port 18080
I0321 12:31:13.873841       7 nginx.go:281] starting Ingress controller
I0321 12:31:14.979687       7 event.go:218] Event(v1.ObjectReference{Kind:"Ingress"....
...
I0321 12:31:15.083937       7 nginx.go:302] starting NGINX process...
I0321 12:31:15.088448       7 leaderelection.go:174] attempting to acquire leader lease...
I0321 12:31:15.096478       7 status.go:196] new leader elected: ingress-nginx-3598309345-r1kx3
I0321 12:31:15.096719       7 controller.go:183] backend reload required
I0321 12:31:15.096773       7 stat_collector.go:34] changing prometheus collector from  to vts
I0321 12:31:15.262561       7 controller.go:192] ingress backend successfully reloaded...
I0321 12:31:16.035923       7 backend_ssl.go:174] updating local copy of ssl certificate ...
I0321 12:31:16.197907       7 backend_ssl.go:174] updating local copy of ssl certificate ...
2018/03/21 12:31:16 [warn] 35#35: *8 a client request body is buffered to a temporary file /var/lib/nginx/body/0000000001, client: ::1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
%v - [] - - [21/Mar/2018:12:31:16 +0000] "POST /configuration/backends HTTP/1.1" 201 5 "-" "Go-http-client/1.1" 12908 0.000 [-] - - - - b2ee530193b3db7966543e4fa54dcaf9
I0321 12:31:16.265917       7 controller.go:202] dynamic reconfiguration succeeded
...
I0321 12:31:17.191738       7 controller.go:183] backend reload required
I0321 12:31:17.438136       7 controller.go:192] ingress backend successfully reloaded...
2018/03/21 12:31:29 [error] 175#175: *71 failed to run balancer_by_lua*: /etc/nginx/lua/balancer.lua:31: attempt to index local 'backend' (a nil value)
stack traceback:
    /etc/nginx/lua/balancer.lua:31: in function 'balance'
    /etc/nginx/lua/balancer.lua:97: in function 'call'
    balancer_by_lua:2: in function <balancer_by_lua:1> while connecting to upstream, client: *, server: *, request: "GET /ping HTTP/1.1", host: "*"
ElvinEfendi commented 6 years ago

@mikksoone the issue you're seeing seems to be different. When you test later today, please also set error-log-level: info in the configmap. In addition, it would be useful to see the generated Nginx server block for the app under consideration, and the output of curl localhost:18080/configuration/backends from inside a running ingress-nginx pod.

dcherniv commented 6 years ago

@ElvinEfendi Latest master image works for me.

I0321 16:09:31.080649       9 controller.go:202] dynamic reconfiguration succeeded
::1 - [::1] - - [21/Mar/2018:16:09:31 +0000] "POST /configuration/backends HTTP/1.1" 201 5 "-" "Go-http-client/1.1" 37820 0.000 [-] - - - -
I0321 16:09:31.265580       9 controller.go:192] ingress backend successfully reloaded...
mbugeia commented 6 years ago

I was hitting the same bug and I can also confirm that it works with quay.io/aledbf/nginx-ingress-controller:0.346

mikksoone commented 6 years ago

curl localhost:18080/configuration/backends
nil

More logs:

-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:    0.12.0
  Build:      git-0398c410
  Repository: https://github.com/aledbf/ingress-nginx
-------------------------------------------------------------------------------

W0321 19:50:09.284853       7 client_config.go:529] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0321 19:50:09.285014       7 main.go:181] Creating API client for https://REDUCTED:443
I0321 19:50:09.285641       7 main.go:201] trying to discover Kubernetes version
I0321 19:50:09.294493       7 main.go:225] Running in Kubernetes Cluster version v1.7 (v1.7.10) - git (clean) commit bebdeb749f1fa3da9e1312c4b08e439c404b3136 - platform linux/amd64
I0321 19:50:09.297059       7 main.go:84] validated nginx/nginx-default-backend as the default backend
I0321 19:50:09.299614       7 main.go:105] service nginx/ingress-nginx validated as source of Ingress status
I0321 19:50:09.549228       7 stat_collector.go:77] starting new nginx stats collector for Ingress controller running in namespace  (class nginx)
I0321 19:50:09.549244       7 stat_collector.go:78] collector extracting information from port 18080
I0321 19:50:09.560763       7 nginx.go:281] starting Ingress controller
I0321 19:50:09.572054       7 store.go:404] adding configmap nginx/ingress-nginx to backend
I0321 19:50:10.677349       7 event.go:218] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"app-bff-staging", Name:"app-bff-ingress", UID:"5f491870-20bc-11e8-889c-0a91ad1abda6", APIVersion:"extensions", ResourceVersion:"39561205", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress app-bff-staging/app-bff-ingress
...
I0321 19:50:10.762101       7 store.go:614] running initial sync of secrets
I0321 19:50:10.765779       7 backend_ssl.go:68] adding secret app-bff-staging/app-bff-staging-tls to the local store
I0321 19:50:10.767969       7 nginx.go:302] starting NGINX process...
I0321 19:50:10.768770       7 controller.go:183] backend reload required
I0321 19:50:10.768804       7 stat_collector.go:34] changing prometheus collector from  to vts
I0321 19:50:10.769177       7 util.go:64] system fs.file-max=2097152
I0321 19:50:10.769227       7 nginx.go:560] maximum number of open file descriptors : 523264
I0321 19:50:10.775212       7 leaderelection.go:174] attempting to acquire leader lease...
I0321 19:50:10.792038       7 status.go:196] new leader elected: ingress-nginx-2742315815-3kwgn
I0321 19:50:10.903607       7 nginx.go:658] NGINX configuration diff
I0321 19:50:10.903784       7 nginx.go:659] --- /etc/nginx/nginx.conf   2018-03-21 03:31:50.000000000 +0000
+++ /tmp/new-nginx-cfg724125397 2018-03-21 19:50:10.893639217 +0000
@@ -1,6 +1,1744 @@
-# A very simple nginx configuration file that forces nginx to start.
+
+daemon off;
+
+worker_processes 4;
+
 pid /run/nginx.pid;

-events {}
-http {}
-daemon off;
\ No newline at end of file
+worker_rlimit_nofile 523264;
+
+worker_shutdown_timeout 10s ;
+
+events {
+    multi_accept        on;
+    worker_connections  16384;
+    use                 epoll;
+}
+
+http {
+    lua_package_cpath "/usr/local/lib/lua/?.so;/usr/lib/x86_64-linux-gnu/lua/5.1/?.so;;";
+    lua_package_path "/etc/nginx/lua/?.lua;/etc/nginx/lua/vendor/?.lua;/usr/local/lib/lua/?.lua;;";
+
+    lua_shared_dict configuration_data 5M;
+    lua_shared_dict round_robin_state 1M;
+    lua_shared_dict locks 512k;
+
+    init_by_lua_block {
+        require("resty.core")
+        collectgarbage("collect")
+
+        -- init modules
+        local ok, res
+
+        ok, res = pcall(require, "configuration")
+        if not ok then
+          error("require failed: " .. tostring(res))
+        else
+          configuration = res
+        end
+
+        ok, res = pcall(require, "balancer")
+        if not ok then
+          error("require failed: " .. tostring(res))
+        else
+          balancer = res
+        end
+    }
+
+    init_worker_by_lua_block {
+        balancer.init_worker()
+    }
+
+    real_ip_header      proxy_protocol;
+
+    real_ip_recursive   on;
+
+    set_real_ip_from    0.0.0.0/0;
+
+    geoip_country       /etc/nginx/geoip/GeoIP.dat;
+    geoip_city          /etc/nginx/geoip/GeoLiteCity.dat;
+    geoip_org           /etc/nginx/geoip/GeoIPASNum.dat;
+    geoip_proxy_recursive on;
+
+    vhost_traffic_status_zone shared:vhost_traffic_status:10m;
+    vhost_traffic_status_filter_by_set_key $server_name;
+
+    aio                 threads;
+    aio_write           on;
+
+    tcp_nopush          on;
+    tcp_nodelay         on;
+
+    log_subrequest      on;
+
+    reset_timedout_connection on;
+
+    keepalive_timeout  75s;
+    keepalive_requests 100;
+
+    client_header_buffer_size       1k;
+    client_header_timeout           60s;
+    large_client_header_buffers     4 8k;
+    client_body_buffer_size         8k;
+    client_body_timeout             60s;
+
+    http2_max_field_size            4k;
+    http2_max_header_size           16k;
+
+    types_hash_max_size             2048;
+    server_names_hash_max_size      1024;
+    server_names_hash_bucket_size   64;
+    map_hash_bucket_size            64;
+
+    proxy_headers_hash_max_size     512;
+    proxy_headers_hash_bucket_size  64;
+
+    variables_hash_bucket_size      128;
+    variables_hash_max_size         2048;
+
+    underscores_in_headers          off;
+    ignore_invalid_headers          on;
+
+    limit_req_status                503;
+
+    include /etc/nginx/mime.types;
+    default_type text/html;
+
+    gzip on;
+    gzip_comp_level 5;
+    gzip_http_version 1.1;
+    gzip_min_length 256;
+    gzip_types application/atom+xml application/javascript application/x-javascript application/json application/rss+xml application/vnd.ms-fontobject application/x-font-ttf application/x-web-app-manifest+json application/xhtml+xml application/xml font/opentype image/svg+xml image/x-icon text/css text/plain text/x-component;
+    gzip_proxied any;
+    gzip_vary on;
+
+    # Custom headers for response
+
+    server_tokens on;
+
+    # disable warnings
+    uninitialized_variable_warn off;
+
+    # Additional available variables:
+    # $namespace
+    # $ingress_name
+    # $service_name
+    log_format upstreaminfo '%v - [$the_real_ip] - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $request_length $request_time [$proxy_upstream_name] $upstream_addr $upstream_response_length $upstream_response_time $upstream_status $request_id';
+
+    map $request_uri $loggable {
+
+        default 1;
+    }
+
+    access_log /var/log/nginx/access.log upstreaminfo if=$loggable;
+
+    error_log  /var/log/nginx/error.log info;
+
+    resolver REDUCTED valid=30s;
+
+    # Retain the default nginx handling of requests without a "Connection" header
+    map $http_upgrade $connection_upgrade {
+        default          upgrade;
+        ''               close;
+    }
+
+    map $http_x_forwarded_for $the_real_ip {
+
+        # Get IP address from Proxy Protocol
+        default          $proxy_protocol_addr;
+
+    }
+
+    # trust http_x_forwarded_proto headers correctly indicate ssl offloading
+    map $http_x_forwarded_proto $pass_access_scheme {
+        default          $http_x_forwarded_proto;
+        ''               $scheme;
+    }
+
+    # validate $pass_access_scheme and $scheme are http to force a redirect
+    map "$scheme:$pass_access_scheme" $redirect_to_https {
+        default          0;
+        "http:http"      1;
+        "https:http"     1;
+    }
+
+    map $http_x_forwarded_port $pass_server_port {
+        default           $http_x_forwarded_port;
+        ''                $server_port;
+    }
+
+    map $pass_server_port $pass_port {
+        443              443;
+        default          $pass_server_port;
+    }
+
+    # Obtain best http host
+    map $http_host $this_host {
+        default          $http_host;
+        ''               $host;
+    }
+
+    map $http_x_forwarded_host $best_http_host {
+        default          $http_x_forwarded_host;
+        ''               $this_host;
+    }
+
+    server_name_in_redirect off;
+    port_in_redirect        off;
+
+    rewrite_log             on;
+
+    ssl_protocols TLSv1.2;
+
+    # turn on session caching to drastically improve performance
+
+    ssl_session_cache builtin:1000 shared:SSL:10m;
+    ssl_session_timeout 10m;
+
+    # allow configuring ssl session tickets
+    ssl_session_tickets on;
+
+    # slightly reduce the time-to-first-byte
+    ssl_buffer_size 4k;
+
+    # allow configuring custom ssl ciphers
+    ssl_ciphers 'ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256';
+    ssl_prefer_server_ciphers on;
+
+    ssl_ecdh_curve auto;
+
+    proxy_ssl_session_reuse on;
+
+    upstream upstream_balancer {
+        server 0.0.0.1; # placeholder
+
+        balancer_by_lua_block {
+          balancer.call()
+        }
+
+        keepalive 1000;
+    }
+
+    ## start server _
+    server {
+        server_name _ ;
+
+        listen 80 proxy_protocol default_server  backlog=511;
+
+        listen [::]:80 proxy_protocol default_server  backlog=511;
+
+        set $proxy_upstream_name "-";
+
+        listen 443 proxy_protocol  default_server  backlog=511 ssl http2;
+
+        listen [::]:443 proxy_protocol  default_server  backlog=511 ssl http2;
+
+        # PEM sha: 06070084cadda62b8096e43ff43f464dfc4c57a7
+        ssl_certificate                         /ingress-controller/ssl/default-fake-certificate.pem;
+        ssl_certificate_key                     /ingress-controller/ssl/default-fake-certificate.pem;
+
+        location / {
+
+            if ($scheme = https) {
+            more_set_headers                        "Strict-Transport-Security: max-age=15724800; includeSubDomains;";
+            }
+
+            access_log off;
+
+            port_in_redirect off;
+
+            set $proxy_upstream_name "upstream-default-backend";
+
+            set $namespace      "";
+            set $ingress_name   "";
+            set $service_name   "";
+
+            client_max_body_size                    "1m";
+
+            proxy_set_header Host                   $best_http_host;
+
+            # Pass the extracted client certificate to the backend
+
+            proxy_set_header ssl-client-cert        "";
+            proxy_set_header ssl-client-verify      "";
+            proxy_set_header ssl-client-dn          "";
+
+            # Allow websocket connections
+            proxy_set_header                        Upgrade           $http_upgrade;
+
+            proxy_set_header                        Connection        $connection_upgrade;
+
+            proxy_set_header X-Real-IP              $the_real_ip;
+
+            proxy_set_header X-Forwarded-For        $the_real_ip;
+
+            proxy_set_header X-Forwarded-Host       $best_http_host;
+            proxy_set_header X-Forwarded-Port       $pass_port;
+            proxy_set_header X-Forwarded-Proto      $pass_access_scheme;
+            proxy_set_header X-Original-URI         $request_uri;
+            proxy_set_header X-Scheme               $pass_access_scheme;
+
+            # Pass the original X-Forwarded-For
+            proxy_set_header X-Original-Forwarded-For $http_x_forwarded_for;
+
+            # mitigate HTTPoxy Vulnerability
+            # https://www.nginx.com/blog/mitigating-the-httpoxy-vulnerability-with-nginx/
+            proxy_set_header Proxy                  "";
+
+            # Custom headers to proxied server
+
+            proxy_connect_timeout                   5s;
+            proxy_send_timeout                      60s;
+            proxy_read_timeout                      60s;
+
+            proxy_buffering                         "off";
+            proxy_buffer_size                       "4k";
+            proxy_buffers                           4 "4k";
+            proxy_request_buffering                 "on";
+
+            proxy_http_version                      1.1;
+
+            proxy_cookie_domain                     off;
+            proxy_cookie_path                       off;
+
+            # In case of errors try the next upstream server before returning an error
+            proxy_next_upstream                     error timeout invalid_header http_502 http_503 http_504;
+
+            proxy_pass http://upstream_balancer;
+
+            proxy_redirect                          off;
+
+        }
+
+        # health checks in cloud providers require the use of port 80
+        location /healthz {
+            access_log off;
+            return 200;
+        }
+
+        # this is required to avoid error if nginx is being monitored
+        # with an external software (like sysdig)
+        location /nginx_status {
+            allow 127.0.0.1;
+            allow ::1;
+            deny all;
+
+            access_log off;
+            stub_status on;
+        }
+
+    }
+    ## end server _
+
+    ## start server REDUCTED
+    server {
+        server_name REDUCTED ;
+
+        listen 80 proxy_protocol;
+
+        listen [::]:80 proxy_protocol;
+
+        set $proxy_upstream_name "-";
+
+        listen 443 proxy_protocol  ssl http2;
+
+        listen [::]:443 proxy_protocol  ssl http2;
+
+        # PEM sha: 61ad3926d647a0de2c0fdad636f6c38e012a9dd3
+        ssl_certificate                         /ingress-controller/ssl/app-bff-staging-app-bff-staging-tls.pem;
+        ssl_certificate_key                     /ingress-controller/ssl/app-bff-staging-app-bff-staging-tls.pem;
+
+        location /.well-known/acme-challenge {
+
+            if ($scheme = https) {
+            more_set_headers                        "Strict-Transport-Security: max-age=15724800; includeSubDomains;";
+            }
+
+            port_in_redirect off;
+
+            set $proxy_upstream_name "kube-lego-kube-lego-nginx-8080";
+
+            set $namespace      "kube-lego";
+            set $ingress_name   "kube-lego-nginx";
+            set $service_name   "kube-lego-nginx";
+
+            client_max_body_size                    "1m";
+
+            proxy_set_header Host                   $best_http_host;
+
+            # Pass the extracted client certificate to the backend
+
+            proxy_set_header ssl-client-cert        "";
+            proxy_set_header ssl-client-verify      "";
+            proxy_set_header ssl-client-dn          "";
+
+            # Allow websocket connections
+            proxy_set_header                        Upgrade           $http_upgrade;
+
+            proxy_set_header                        Connection        $connection_upgrade;
+
+            proxy_set_header X-Real-IP              $the_real_ip;
+
+            proxy_set_header X-Forwarded-For        $the_real_ip;
+
+            proxy_set_header X-Forwarded-Host       $best_http_host;
+            proxy_set_header X-Forwarded-Port       $pass_port;
+            proxy_set_header X-Forwarded-Proto      $pass_access_scheme;
+            proxy_set_header X-Original-URI         $request_uri;
+            proxy_set_header X-Scheme               $pass_access_scheme;
+
+            # Pass the original X-Forwarded-For
+            proxy_set_header X-Original-Forwarded-For $http_x_forwarded_for;
+
+            # mitigate HTTPoxy Vulnerability
+            # https://www.nginx.com/blog/mitigating-the-httpoxy-vulnerability-with-nginx/
+            proxy_set_header Proxy                  "";
+
+            # Custom headers to proxied server
+
+            proxy_connect_timeout                   5s;
+            proxy_send_timeout                      60s;
+            proxy_read_timeout                      60s;
+
+            proxy_buffering                         "off";
+            proxy_buffer_size                       "4k";
+            proxy_buffers                           4 "4k";
+            proxy_request_buffering                 "on";
+
+            proxy_http_version                      1.1;
+
+            proxy_cookie_domain                     off;
+            proxy_cookie_path                       off;
+
+            # In case of errors try the next upstream server before returning an error
+            proxy_next_upstream                     error timeout invalid_header http_502 http_503 http_504;
+
+            proxy_pass http://upstream_balancer;
+
+            proxy_redirect                          off;
+
+        }
+
+        location / {
+
+            if ($scheme = https) {
+            more_set_headers                        "Strict-Transport-Security: max-age=15724800; includeSubDomains;";
+            }
+
+            port_in_redirect off;
+
+            set $proxy_upstream_name "app-bff-staging-app-bff-service-3000";
+
+            set $namespace      "app-bff-staging";
+            set $ingress_name   "app-bff-notls-ingress";
+            set $service_name   "app-bff-service";
+
+            # enforce ssl on server side
+            if ($redirect_to_https) {
+
+                return 308 https://$best_http_host$request_uri;
+
+            }
+
+            client_max_body_size                    "1m";
+
+            proxy_set_header Host                   $best_http_host;
+
+            # Pass the extracted client certificate to the backend
+
+            proxy_set_header ssl-client-cert        "";
+            proxy_set_header ssl-client-verify      "";
+            proxy_set_header ssl-client-dn          "";
+
+            # Allow websocket connections
+            proxy_set_header                        Upgrade           $http_upgrade;
+
+            proxy_set_header                        Connection        $connection_upgrade;
+
+            proxy_set_header X-Real-IP              $the_real_ip;
+
+            proxy_set_header X-Forwarded-For        $the_real_ip;
+
+            proxy_set_header X-Forwarded-Host       $best_http_host;
+            proxy_set_header X-Forwarded-Port       $pass_port;
+            proxy_set_header X-Forwarded-Proto      $pass_access_scheme;
+            proxy_set_header X-Original-URI         $request_uri;
+            proxy_set_header X-Scheme               $pass_access_scheme;
+
+            # Pass the original X-Forwarded-For
+            proxy_set_header X-Original-Forwarded-For $http_x_forwarded_for;
+
+            # mitigate HTTPoxy Vulnerability
+            # https://www.nginx.com/blog/mitigating-the-httpoxy-vulnerability-with-nginx/
+            proxy_set_header Proxy                  "";
+
+            # Custom headers to proxied server
+
+            proxy_connect_timeout                   5s;
+            proxy_send_timeout                      60s;
+            proxy_read_timeout                      60s;
+
+            proxy_buffering                         "off";
+            proxy_buffer_size                       "4k";
+            proxy_buffers                           4 "4k";
+            proxy_request_buffering                 "on";
+
+            proxy_http_version                      1.1;
+
+            proxy_cookie_domain                     off;
+            proxy_cookie_path                       off;
+
+            # In case of errors try the next upstream server before returning an error
+            proxy_next_upstream                     error timeout invalid_header http_502 http_503 http_504;
+
+            proxy_pass http://upstream_balancer;
+
+            proxy_redirect                          off;
+
+        }
+
+    }
+    ## end server REDUCTED
+
...
+    # default server, used for NGINX healthcheck and access to nginx stats
+    server {
+        # Use the port 18080 (random value just to avoid known ports) as default port for nginx.
+        # Changing this value requires a change in:
+        # https://github.com/kubernetes/ingress-nginx/blob/master/controllers/nginx/pkg/cmd/controller/nginx.go
+        listen 18080 default_server  backlog=511;
+        listen [::]:18080 default_server  backlog=511;
+        set $proxy_upstream_name "-";
+
+        location /healthz {
+            access_log off;
+            return 200;
+        }
+
+        location /nginx_status {
+            set $proxy_upstream_name "internal";
+
+            access_log off;
+            stub_status on;
+
+        }
+
+        location /configuration {
+            allow 127.0.0.1;
+
+            allow ::1;
+
+            deny all;
+            content_by_lua_block {
+              configuration.call()
+            }
+        }
+
+        location / {
+
+            set $proxy_upstream_name "upstream-default-backend";
+
+            proxy_pass          http://upstream_balancer;
+
+        }
+
+    }
+}
+
+stream {
+    log_format log_stream [$time_local] $protocol $status $bytes_sent $bytes_received $session_time;
+
+    access_log /var/log/nginx/access.log log_stream;
+
+    error_log  /var/log/nginx/error.log;
+
+    # TCP services
+
+    # UDP services
+
+}
+
I0321 20:06:44.696023       7 controller.go:192] ingress backend successfully reloaded...
I0321 20:06:45.584802       7 backend_ssl.go:174] updating local copy of ssl certificate app-bff-staging/app-bff-staging-tls with missing intermediate CA certs
I0321 20:06:45.697116       7 nginx.go:771] posting backends configuration: [{"name":"app-bff-staging-app-bff-service-3000","service":{"metadata":{"name":"app-bff-service","namespace":"app-bff-staging","selfLink":"/api/v1/namespaces/app-bff-staging/services/app-bff-service","uid":"5f2f79ae-20bc-11e8-a1f9-02d01e4a667a","resourceVersion":"39561187","creationTimestamp":"2018-03-05T21:30:05Z","annotations":{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{},\"name\":\"app-bff-service\",\"namespace\":\"app-bff-staging\"},\"spec\":{\"ports\":[{\"name\":\"backend\",\"port\":3000,\"protocol\":\"TCP\",\"targetPort\":3000}],\"selector\":{\"app\":\"app-bff\"},\"type\":\"ClusterIP\"}}\n"}},"spec":{"ports":[{"name":"backend","protocol":"TCP","port":3000,"targetPort":3000}],"selector":{"app":"app-bff"},"clusterIP":"REDUCTED","type":"ClusterIP","sessionAffinity":"None"},"status":{"loadBalancer":{}}},"port":3000,"secure":false,"secureCACert":{"secret":"","caFilename":"","pemSha":""},"sslPassthrough":false,"endpoints":[{"address":"REDUCTED","port":"3000","maxFails":0,"failTimeout":0,"target":{"kind":"Pod","namespace":"app-bff-staging","name":"app-bff-3-2881575126-jwllz","uid":"7643c83d-2cea-11e8-889c-0a91ad1abda6","resourceVersion":"42266422"}},{"address":"REDUCTED","port":"3000","maxFails":0,"failTimeout":0,"target":{"kind":"Pod","namespace":"app-bff-staging","name":"app-bff-stateless-3511195472-7p332","uid":"c4bbdd20-2cf2-11e8-889c-0a91ad1abda6","resourceVersion":"42276219"}},{"address":"REDUCTED","port":"3000","maxFails":0,"failTimeout":0,"target":{"kind":"Pod","namespace":"app-bff-staging","name":"app-bff-2-2331681453-cfwgn","uid":"65e3132c-2cea-11e8-889c-0a91ad1abda6","resourceVersion":"42266325"}},{"address":"REDUCTED","port":"3000","maxFails":0,"failTimeout":0,"target":{"kind":"Pod","namespace":"app-bff-staging","name":"app-bff-1-1759178922-bv2m0","uid":"c5898379-2cf2-11e8-889c-0a91ad1abda6","resourceVersion":"42276416"}}],"sessionAffinityConfig":{"name":"","cookieSessionAffinity":{"name":"","hash":""}}},{"name":"upstream-default-backend","service":{"metadata":{"name":"nginx-default-backend","namespace":"nginx","selfLink":"/api/v1/namespaces/nginx/services/nginx-default-backend","uid":"462f4e80-81ab-11e7-9380-0a36945459aa","resourceVersion":"42292594","creationTimestamp":"2017-08-15T11:17:07Z","labels":{"k8s-addon":"ingress-nginx.addons.k8s.io"},"annotations":{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{},\"labels\":{\"k8s-addon\":\"ingress-nginx.addons.k8s.io\"},\"name\":\"nginx-default-backend\",\"namespace\":\"nginx\"},\"spec\":{\"ports\":[{\"port\":80,\"targetPort\":\"http\"}],\"selector\":{\"app\":\"nginx-default-backend\"}}}\n"}},"spec":{"ports":[{"protocol":"TCP","port":80,"targetPort":"http"}],"selector":{"app":"nginx-default-backend"},"clusterIP":"REDUCTED","type":"ClusterIP","sessionAffinity":"None"},"status":{"loadBalancer":{}}},"port":0,"secure":false,"secureCACert":{"secret":"","caFilename":"","pemSha":""},"sslPassthrough":false,"endpoints":[{"address":"REDUCTED","port":"8080","maxFails":0,"failTimeout":0,"target":{"kind":"Pod","namespace":"nginx","name":"nginx-default-backend-3184178138-hj7kt","uid":"b593b707-c2e3-11e7-95a9-06760f4b832a","resourceVersion":"21728925"}}],"sessionAffinityConfig":{"name":"","cookieSessionAffinity":{"name":"","hash":""}}}, REDUCTED_A_LOT_OF_OTHER_SERVICES]
2018/03/21 20:06:45 [warn] 38#38: *5 a client request body is buffered to a temporary file /var/lib/nginx/body/0000000001, client: ::1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
 - [] - - [21/Mar/2018:20:06:45 +0000] "POST /configuration/backends HTTP/1.1" 201 5 "-" "Go-http-client/1.1" 12906 0.000 [-] - - - -
I0321 20:06:45.700599       7 controller.go:202] dynamic reconfiguration succeeded
I0321 20:06:46.606372       7 controller.go:183] backend reload required
I0321 20:06:46.606441       7 util.go:64] system fs.file-max=2097152
I0321 20:06:46.606449       7 nginx.go:560] maximum number of open file descriptors : 523264
I0321 20:06:46.811802       7 nginx.go:658] NGINX configuration diff
I0321 20:06:46.811823       7 nginx.go:659] --- /etc/nginx/nginx.conf   2018-03-21 20:06:44.633528996 +0000
...
I0321 20:06:46.958516       7 controller.go:192] ingress backend successfully reloaded...
2018/03/21 20:06:52 [error] 176#176: *51 failed to run balancer_by_lua*: /etc/nginx/lua/balancer.lua:31: attempt to index local 'backend' (a nil value)
stack traceback:
    /etc/nginx/lua/balancer.lua:31: in function 'balance'
    /etc/nginx/lua/balancer.lua:97: in function 'call'
    balancer_by_lua:2: in function <balancer_by_lua:1> while connecting to upstream, client: REDUCTED, server: REDUCTED, request: "GET ping HTTP/1.1", host: "REDUCTED"
ElvinEfendi commented 6 years ago

Thanks for the logs @mikksoone!

2018/03/21 20:06:45 [warn] 38#38: *5 a client request body is buffered to a temporary file /var/lib/nginx/body/0000000001, client: ::1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
 - [] - - [21/Mar/2018:20:06:45 +0000] "POST /configuration/backends HTTP/1.1" 201 5 "-" "Go-http-client/1.1" 12906 0.000 [-] - - - -

this stands out in the logs. I see that client_body_buffer_size is 8k whereas client_max_body_size is 1m for your vhosts. Based on the log entry above, it looks like Nginx writes the data into a temporary file (because the data being sent is over 8k) and therefore ngx.req.get_body_data() returns empty data. To confirm this theory, can you bump client_body_buffer_size up to 1m for the application under consideration using https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/annotations.md#client-body-buffer-size and test the feature again? Also, can you provide some stats about your cluster, i.e. how many services the ingress controller is front-ending and approximately how many replicas/pods?

Once this is confirmed we can discuss possible solutions. There is definitely a lot of redundant data we send to Lua, so we can eliminate a lot on the controller side before POSTing the data. On top of that, in Lua land we can make sure we read the request body from a file when needed, as sketched below.
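
For the Lua side, a minimal sketch of the standard OpenResty fallback for bodies that spill to disk (the surrounding request handler is assumed):

    -- when nginx buffers the request body to a temporary file,
    -- ngx.req.get_body_data() returns nil and the body must be read
    -- back from disk via ngx.req.get_body_file()
    ngx.req.read_body()
    local body = ngx.req.get_body_data()
    if not body then
      local file_name = ngx.req.get_body_file()
      if file_name then
        local f = io.open(file_name, "r")
        body = f:read("*a")
        f:close()
      end
    end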

aledbf commented 6 years ago

@ElvinEfendi what if we start adding items here https://github.com/kubernetes/ingress-nginx/issues/2231 so we know what's missing?

mikksoone commented 6 years ago

Yep, client_body_buffer_size: 1m fixes the issue. The cluster has around 10 services and 30 pods.

PS! Big thanks for the feature, our keepalives are now very happy.

rikatz commented 6 years ago

Using the image quay.io/aledbf/nginx-ingress-controller:0.346 I still have the problem here. I have an environment with some non-existent services/backends (user misconfiguration) and also some non-existent secrets.

Mar 21 17:55:22 ingress1 docker[14156]: 2018/03/21 20:55:22 [error] 7119#7119: *2766 failed to run balancer_by_lua*: /etc/nginx/lua/balancer.lua:31: attempt to index local 'backend' (a nil value)
Mar 21 17:55:22 ingress1 docker[14156]: stack traceback:
Mar 21 17:55:22 ingress1 docker[14156]:         /etc/nginx/lua/balancer.lua:31: in function 'balance'
Mar 21 17:55:22 ingress1 docker[14156]:         /etc/nginx/lua/balancer.lua:97: in function 'call'
Mar 21 17:55:22 ingress1 docker[14156]:         balancer_by_lua:2: in function <balancer_by_lua:1> while connecting to upstream, client: 10.10.10.10, server: system1.lab.local, request: "GET / HTTP/2.0", host: "system1.lab.local"
Mar 21 17:55:22 ingress1 docker[14156]: 10.10.10.10 - [10.10.10.10] - - [21/Mar/2018:20:55:22 +0000] "GET / HTTP/2.0" 500 186 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0" 332 0.000 [default-nginx-8080]  0 - -
Mar 21 17:55:22 ingress1 docker[14156]: 2018/03/21 20:55:22 [error] 7119#7119: *2766 shm_add_upstream::peer failed while logging request, client: 10.10.10.10, server: system1.lab.local, request: "GET / HTTP/2.0", host: "system1.lab.local"
Mar 21 17:55:22 ingress1 docker[14156]: 2018/03/21 20:55:22 [error] 7119#7119: *2766 handler::shm_add_upstream() failed while logging request, client: 10.10.10.10, server: system1.lab.local, request: "GET / HTTP/2.0", host: "system1.lab.local"
ElvinEfendi commented 6 years ago

@rikatz please provide more logs based on https://github.com/kubernetes/ingress-nginx/issues/2225#issuecomment-374953979 and https://github.com/kubernetes/ingress-nginx/issues/2225#issuecomment-374966071

aledbf commented 6 years ago

Closing. Please update to 0.13.0

rikatz commented 6 years ago

@aledbf @ElvinEfendi I was able to take a look at the posted JSON here, and it also seems related to the buffer size mentioned above.

The thing is that we have more than 700 Ingress objects (and secrets, services, etc.) and I don't think 10m is enough in this case.

In PR #2309 there's a mention of "# this should be equal to the configuration_data dict". As I've been pretty far from Ingress development for a while now, let me ask: what is the configuration_data dict, and is it configurable? :)

I'll try anyway to change this buffer to something huge like 100m (hopefully that's enough) directly in nginx.tmpl and at least check whether dynamic reconfiguration works now.

rikatz commented 6 years ago

Setting this to 100m is not enough, although the whole JSON is only 289k by now.

After the JSON is printed in the log, the following log lines appear:

Apr 17 08:59:52 ingress.lab docker[26907]: 2018/04/17 11:59:52 [warn] 2825#2825: *2539 a client request body is buffered to a temporary file /var/lib/nginx/body/0000000060, client: 127.0.0.1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
Apr 17 08:59:52 ingress.lab docker[26907]: I0417 11:59:52.462930       8 controller.go:195] dynamic reconfiguration succeeded
Apr 17 08:59:52 ingress.lab docker[26907]: 127.0.0.1 - [127.0.0.1] - - [17/Apr/2018:11:59:52 +0000] "POST /configuration/backends HTTP/1.1" 201 5 "-" "Go-http-client/1.1" 296065 0.003 [/-] - - - -
Apr 17 08:59:54 ingress.lab docker[26907]: E0417 11:59:54.574679       8 backend_ssl.go:145] unexpected error generating SSL certificate with full intermediate chain CA certs: Invalid certificate.
Apr 17 08:59:54 ingress.lab docker[26907]: E0417 11:59:54.574769       8 backend_ssl.go:145] unexpected error generating SSL certificate with full intermediate chain CA certs: Invalid certificate.
Apr 17 08:59:54 ingress.lab docker[26907]: E0417 11:59:54.574922       8 backend_ssl.go:145] unexpected error generating SSL certificate with full intermediate chain CA certs: Invalid certificate.

Are these SSL errors somehow related to dynamic reconfiguration?

aledbf commented 6 years ago

Are these SSL errors somehow related to dynamic reconfiguration?

No

ElvinEfendi commented 6 years ago

what is the configuration_data dict, and is this configurable?

@rikatz configuration_data is a shared Lua dictionary where we keep the list of backends with their properties (endpoints, etc.).
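
For illustration, this is roughly how such a shared dictionary is read and written; the dict itself and its 5M size are visible in the generated nginx.conf earlier in this thread, while the "backends" key and backends_json variable are assumptions:

    -- declared in nginx.conf as: lua_shared_dict configuration_data 5M;
    local dict = ngx.shared.configuration_data

    -- the /configuration endpoint stores the POSTed payload...
    -- (backends_json: the raw JSON body, assumed for this sketch)
    local ok, err = dict:set("backends", backends_json)
    if not ok then
      ngx.log(ngx.ERR, "failed to store backends: ", tostring(err))
    end

    -- ...and the balancer reads it back when routing requests
    local data = dict:get("backends")

Because a lua_shared_dict has a fixed size set at nginx startup, a backends payload larger than the dict simply cannot be stored, which is why the dict size and the request-body buffer sizes both matter for large clusters.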

I'll try anyway to change this buffer to something huge like 100m

I don't think this is configurable for that specific /configuration endpoint.

Apr 17 08:59:52 ingress.lab docker[26907]: 2018/04/17 11:59:52 [warn] 2825#2825: *2539 a client request body is buffered to a temporary file /var/lib/nginx/body/0000000060, client: 127.0.0.1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"

this tells me 10m is not enough.


We are working on a more general solution to make dynamic configuration work even when Nginx cannot buffer the whole payload in memory: https://github.com/Shopify/ingress/pull/44

We will make an upstream PR soon. In the meantime, can you start ingress-nginx with --v=2, copy the JSON payload it POSTs to the Lua endpoint, and measure its size? I'm curious how big it is.

rikatz commented 6 years ago

@ElvinEfendi I did this and sent it to @aledbf. Its size is 300k.

Edit: The payload contains sensitive data, so I cannot publish it here.

ahmettahasakar commented 6 years ago

I keep getting the following error:

unexpected error generating SSL certificate with full intermediate chain CA certs: Invalid certificate.

The certificate seems perfectly fine when verified with ssllabs.com. What is the reason for the invalid certificate warnings? If I cannot fix it, is there a way to disable them?


csabakollar commented 6 years ago

Having the same problem as @ahmettahasakar:

E0427 09:24:52.887180 9 backend_ssl.go:146] unexpected error generating SSL certificate with full intermediate chain CA certs: Invalid certificate.

ahmettahasakar commented 6 years ago

I figured it out. Some certificates apparently don't support it. You need to set --enable-ssl-chain-completion=false; then it stops.

csabakollar commented 6 years ago

Thanks @ahmettahasakar

ElvinEfendi commented 6 years ago

@ahmettahasakar @csabakollar can you give some more details about how this is all related to --enable-dynamic-configuration?

ahmettahasakar commented 6 years ago

It isn't. I saw @rikatz's message and wrote my solution in case someone needs it.

csabakollar commented 6 years ago

Google took me here and I saw @ahmettahasakar is having the same problem... pure coincidence :)

ElvinEfendi commented 6 years ago

@rikatz could you test quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.14.0 and let us know if you're still seeing this issue.

rikatz commented 6 years ago

@ElvinEfendi ok, will try next week :)

rikatz commented 6 years ago

@ElvinEfendi Now that's working :) Tested with 0.14.0

Thanks guys for the great job. Will also test the resty-waf :D