kubernetes / ingress-nginx

Ingress-NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io" #5401

Closed aduncmj closed 2 years ago

aduncmj commented 4 years ago

Hi all,

When I apply the Ingress configuration file ingress-myapp.yaml with the command kubectl apply -f ingress-myapp.yaml, I get an error. The complete error is as follows:

Error from server (InternalError): error when creating "ingress-myapp.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s: context deadline exceeded

This is my ingress:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-myapp
  namespace: default
  annotations: 
    kubernetes.io/ingress.class: "nginx"
spec:
  rules: 
  - host: myapp.magedu.com
    http:
      paths:
      - path: 
        backend: 
          serviceName: myapp
          servicePort: 80

Has anyone encountered this problem?
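
A first diagnostic step (a sketch, assuming the stock manifest's ingress-nginx namespace, admission service name, and controller label) is to confirm the admission Service has endpoints and the controller pod is actually running:

kubectl -n ingress-nginx get pods -l app.kubernetes.io/component=controller
kubectl -n ingress-nginx get svc ingress-nginx-controller-admission
kubectl -n ingress-nginx get endpoints ingress-nginx-controller-admission
kubectl -n ingress-nginx logs -l app.kubernetes.io/component=controller --tail=50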

rzuku commented 3 years ago

OK. It turned out it needs port 8443. Thank you.

thien-hoang commented 3 years ago

Solution: delete your ValidatingWebhookConfiguration

kubectl get -A ValidatingWebhookConfiguration
NAME
nginx-ingress-ingress-nginx-admission

kubectl delete -A ValidatingWebhookConfiguration nginx-ingress-ingress-nginx-admission

This also worked for me 👍🏻

rizwanzaheer commented 3 years ago

Solution: delete your ValidatingWebhookConfiguration

kubectl get -A ValidatingWebhookConfiguration
NAME
nginx-ingress-ingress-nginx-admission

kubectl delete -A ValidatingWebhookConfiguration nginx-ingress-ingress-nginx-admission

That also works for me

kazysgurskas commented 3 years ago

Solution: delete your ValidatingWebhookConfiguration kubectl get -A ValidatingWebhookConfiguration NAME nginx-ingress-ingress-nginx-admission kubectl delete -A ValidatingWebhookConfiguration nginx-ingress-ingress-nginx-admission

That also works for me

That's not a solution, that's destroying the functionality of the software that caused the root issue :)

As mentioned previously, the solution is to allow the admission webhook port 8443 from the master to the worker nodes. On private GKE clusters, the firewall rule should be named gke-<cluster_name>-<id>-master, with target tags gke-<cluster_name>-<id>-node, source range set to your master CIDR block, and TCP ports 10250 and 443 allowed by default.
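
A sketch of how that looks with gcloud; the rule name, cluster name, and id are placeholders you need to look up for your own cluster:

gcloud compute firewall-rules list --filter="name~gke-<cluster_name>-<id>-master"
gcloud compute firewall-rules update <rule-name> --allow tcp:10250,tcp:443,tcp:8443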

RobbieMcKinstry commented 3 years ago

Is this the officially supported Kubernetes ingress controller, a CNCF-owned IBM/RedHat/Google/Microsoft-funded project, or is this unmaintained? This issue breaks the "Hello World" ingress tutorial on the Kubernetes website, and the maintainers have closed this issue and refused to reopen it. While I understand that in open source, nobody owes me anything when they're volunteering their time, in this case they're not volunteering their time. This is a well-funded project with absent maintainers. It's quite unprofessional to have a tutorial on the website fail, and then close the issue addressing the problem.

aledbf commented 3 years ago

Is this the officially supported Kubernetes ingress controller,

Yes

a CNCF-owned IBM/RedHat/Google/Microsoft-funded project,

No

or is this unmaintained?

No

This issue breaks the "Hello World" ingress tutorial on the Kubernetes website, and the maintainers have closed this issue and refused to reopen it.

This is not true. If you check the thread, the issue is not related to ingress-nginx itself but to a networking issue; the master node cannot connect to the worker node(s), as the previous comment mentions.

While I understand that in open source, nobody owes me anything when they're volunteering their time, in this case, they're not volunteering their time.

Yes, I have been volunteering my time since I created ingress-nginx.

This is a well-funded project with absent maintainers.

This is not true. I've been unable to find sponsors for my time on the project.

It's quite unprofessional to have a tutorial on the website fail and then close the issue addressing the problem.

Not sure exactly what you are doing. From https://kind.sigs.k8s.io/docs/user/ingress/

cat <<EOF | kind create cluster --config=-
> kind: Cluster
> apiVersion: kind.x-k8s.io/v1alpha4
> nodes:
> - role: control-plane
>   kubeadmConfigPatches:
>   - |
>     kind: InitConfiguration
>     nodeRegistration:
>       kubeletExtraArgs:
>         node-labels: "ingress-ready=true"
>   extraPortMappings:
>   - containerPort: 80
>     hostPort: 80
>     protocol: TCP
>   - containerPort: 443
>     hostPort: 443
>     protocol: TCP
> EOF
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.19.1) 🖼
 ✓ Preparing nodes 📦  
 ✓ Writing configuration 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing CNI 🔌 
 ✓ Installing StorageClass 💾 
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Thanks for using kind! 😊
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/provider/kind/deploy.yaml
namespace/ingress-nginx created
serviceaccount/ingress-nginx created
configmap/ingress-nginx-controller created
clusterrole.rbac.authorization.k8s.io/ingress-nginx created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created
role.rbac.authorization.k8s.io/ingress-nginx created
rolebinding.rbac.authorization.k8s.io/ingress-nginx created
service/ingress-nginx-controller-admission created
service/ingress-nginx-controller created
deployment.apps/ingress-nginx-controller created
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created
serviceaccount/ingress-nginx-admission created
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
role.rbac.authorization.k8s.io/ingress-nginx-admission created
rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created
kubectl wait --namespace ingress-nginx \
>   --for=condition=ready pod \
>   --selector=app.kubernetes.io/component=controller \
>   --timeout=90s
pod/ingress-nginx-controller-6df69bd4f7-fv7lr condition met
kubectl apply -f https://kind.sigs.k8s.io/examples/ingress/usage.yaml
pod/foo-app created
service/foo-service created
pod/bar-app created
service/bar-service created
ingress.networking.k8s.io/example-ingress created
# should output "foo"
curl localhost/foo
foo
# should output "bar"
curl localhost/bar
bar

@RobbieMcKinstry not sure how you arrived at all those assumptions about the project. Can you share the source for that?

RobbieMcKinstry commented 3 years ago

@aledbf Multiple people in this thread reported having the same problem outside of the "networking issue" described above, myself included. Additionally, we've made clear that disabling the validating webhook is not a solution.

We've asked that you reopen the issue because those problems are not addressed by the proposed solution. The root of my "unprofessionalism" claim is that you've been explicitly asked to reopen this issue in August, and haven't replied in over three months or made any effort to resolve the users' problems. The first step to ameliorate this is to reopen the issue.

I empathize with the difficulty of finding a maintainer and running an OSS project. OSS work is hard to keep up with and there are few volunteers. However, there's no reason that an extremely well funded project like Kubernetes should have an official ingress controller with an uncertain maintainership status.

If the load is too much for one person to bear (and no one else is willing to step forward), perhaps the right move for users is to downgrade this ingress controller to unofficial status. It's a really unfortunate user experience to go through the official ingress tutorial on a fresh cluster, hit a bug, and wait three months for a response while a ton of other people have the same problem. By that point, I suspect many users have abandoned this controller in favor of another anyway.

aledbf commented 3 years ago

The first step to ameliorate this is to reopen the issue.

done.

aledbf commented 3 years ago

However, there's no reason that an extremely well funded project like Kubernetes should have an official ingress controller with an uncertain maintainership status.

Again, not sure why you have such an assumption

aledbf commented 3 years ago

If the load is too much for one person to bear (and no one else is willing to step forward), perhaps the right move for the user is to downgrade this ingress controller to unofficial status.

Maybe that is the way. How do you propose to do that?

aledbf commented 3 years ago

Multiple people in this thread reported having the same problem outside of the "networking issue" described above, myself included. Additionally, we've made clear that disabling the validating webhook is not a solution.

There is not a single comment in this thread like the one I posted, showing that this is not an ingress-nginx problem, or explaining how to reproduce it step by step (including the cluster creation).

Edit: the use of kind as the provisioner is intentional, to use documentation written by a different project. And yes, it is a single-node deployment, to show this is a firewall/networking problem.

aledbf commented 3 years ago

Reading my own comments, it sounds like I have no interest in this issue. I could edit my comments, or I can show that I cannot reproduce it:

export PROJECT_ID=XXXXXXX
export ZONE=us-west1-a
gcloud config set compute/zone $ZONE
gcloud beta container clusters create "${PROJECT_ID}" \
  --machine-type=n1-standard-1 \
  --zone=us-west1-a \
  --preemptible \
  --num-nodes=3 \
  --no-enable-basic-auth

From https://kubernetes.github.io/ingress-nginx/deploy/#gce-gke

kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole cluster-admin \
  --user $(gcloud config get-value account)

Create the firewall rules (if required) https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters#add_firewall_rules

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.41.2/deploy/static/provider/cloud/deploy.yaml
namespace/ingress-nginx created
serviceaccount/ingress-nginx created
configmap/ingress-nginx-controller created
clusterrole.rbac.authorization.k8s.io/ingress-nginx created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created
role.rbac.authorization.k8s.io/ingress-nginx created
rolebinding.rbac.authorization.k8s.io/ingress-nginx created
service/ingress-nginx-controller-admission created
service/ingress-nginx-controller created
deployment.apps/ingress-nginx-controller created
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created
serviceaccount/ingress-nginx-admission created
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
role.rbac.authorization.k8s.io/ingress-nginx-admission created
rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created
kubectl wait --namespace ingress-nginx \
>    --for=condition=ready pod \
>    --selector=app.kubernetes.io/component=controller \
>    --timeout=90s
pod/ingress-nginx-controller-67759f896-9bvv5 condition met
kubectl apply -f https://kind.sigs.k8s.io/examples/ingress/usage.yaml
pod/foo-app created
service/foo-service created
pod/bar-app created
service/bar-service created
ingress.networking.k8s.io/example-ingress created
kubectl get pods -A
NAMESPACE       NAME                                                          READY   STATUS      RESTARTS   AGE
default         bar-app                                                       1/1     Running     0          6s
default         foo-app                                                       1/1     Running     0          7s
ingress-nginx   ingress-nginx-admission-create-vx77x                          0/1     Completed   0          42s
ingress-nginx   ingress-nginx-admission-patch-5vv9r                           0/1     Completed   0          42s
ingress-nginx   ingress-nginx-controller-67759f896-9bvv5                      1/1     Running     0          45s
kube-system     event-exporter-gke-77cccd97c6-vtlrm                           2/2     Running     0          4m59s
kube-system     fluentd-gke-d9jxm                                             2/2     Running     0          2m39s
kube-system     fluentd-gke-dhplz                                             2/2     Running     0          3m13s
kube-system     fluentd-gke-scaler-54796dcbf7-hwls9                           1/1     Running     0          4m56s
kube-system     fluentd-gke-tlj2n                                             2/2     Running     0          2m6s
kube-system     gke-metrics-agent-q94w2                                       1/1     Running     0          4m52s
kube-system     gke-metrics-agent-rjzts                                       1/1     Running     0          4m43s
kube-system     gke-metrics-agent-sts8c                                       1/1     Running     0          4m42s
kube-system     kube-dns-7bb4975665-dzlvw                                     4/4     Running     0          4m59s
kube-system     kube-dns-7bb4975665-lrjv7                                     4/4     Running     0          4m28s
kube-system     kube-dns-autoscaler-645f7d66cf-bqnfm                          1/1     Running     0          4m54s
kube-system     kube-proxy-gke-ingress-nginx-k8s-default-pool-4aea3aa5-0lbc   1/1     Running     0          4m52s
kube-system     kube-proxy-gke-ingress-nginx-k8s-default-pool-4aea3aa5-42pm   1/1     Running     0          4m42s
kube-system     kube-proxy-gke-ingress-nginx-k8s-default-pool-4aea3aa5-jcxc   1/1     Running     0          4m43s
kube-system     l7-default-backend-678889f899-dbbch                           1/1     Running     0          5m
kube-system     metrics-server-v0.3.6-64655c969-xrt9h                         2/2     Running     0          4m27s
kube-system     prometheus-to-sd-h8zxv                                        1/1     Running     0          4m42s
kube-system     prometheus-to-sd-j64s6                                        1/1     Running     0          4m43s
kube-system     prometheus-to-sd-jh495                                        1/1     Running     0          4m52s
kube-system     stackdriver-metadata-agent-cluster-level-565b88964d-sdmh4     2/2     Running     1          4m6s
sleep 60
kubectl get ing -A
NAMESPACE   NAME              HOSTS   ADDRESS         PORTS   AGE
default     example-ingress   *       34.83.147.123   80      63s
curl 34.83.147.123
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

(expected, only /foo and /bar are mapped)

curl 34.83.147.123/bar
bar
curl 34.83.147.123/foo
foo
kubectl logs -f -n ingress-nginx   ingress-nginx-controller-67759f896-9bvv5 
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v0.41.2
  Build:         d8a93551e6e5798fc4af3eb910cef62ecddc8938
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.19.4

-------------------------------------------------------------------------------

I1128 16:04:29.108727       6 flags.go:205] "Watching for Ingress" class="nginx"
W1128 16:04:29.111652       6 flags.go:210] Ingresses with an empty class will also be processed by this Ingress controller
W1128 16:04:29.111970       6 client_config.go:608] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1128 16:04:29.112125       6 main.go:241] "Creating API client" host="https://10.111.240.1:443"
I1128 16:04:29.123281       6 main.go:285] "Running in Kubernetes cluster" major="1" minor="16+" git="v1.16.15-gke.4300" state="clean" commit="7ed5ddc0e67cb68296994f0b754cec45450d6a64" platform="linux/amd64"
I1128 16:04:29.366373       6 main.go:105] "SSL fake certificate created" file="/etc/ingress-controller/ssl/default-fake-certificate.pem"
I1128 16:04:29.380812       6 ssl.go:528] "loading tls certificate" path="/usr/local/certificates/cert" key="/usr/local/certificates/key"
I1128 16:04:29.416486       6 nginx.go:249] "Starting NGINX Ingress controller"
I1128 16:04:29.438180       6 event.go:282] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"ingress-nginx-controller", UID:"75861588-4018-40d0-8363-0e207b2195e2", APIVersion:"v1", ResourceVersion:"1840", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap ingress-nginx/ingress-nginx-controller
I1128 16:04:30.617176       6 nginx.go:291] "Starting NGINX process"
I1128 16:04:30.617468       6 leaderelection.go:243] attempting to acquire leader lease  ingress-nginx/ingress-controller-leader-nginx...
I1128 16:04:30.617795       6 nginx.go:311] "Starting validation webhook" address=":8443" certPath="/usr/local/certificates/cert" keyPath="/usr/local/certificates/key"
I1128 16:04:30.618051       6 controller.go:144] "Configuration changes detected, backend reload required"
I1128 16:04:30.633898       6 leaderelection.go:253] successfully acquired lease ingress-nginx/ingress-controller-leader-nginx
I1128 16:04:30.634382       6 status.go:84] "New leader elected" identity="ingress-nginx-controller-67759f896-9bvv5"
I1128 16:04:30.716550       6 controller.go:161] "Backend successfully reloaded"
I1128 16:04:30.716839       6 controller.go:172] "Initial sync, sleeping for 1 second"
I1128 16:04:30.717251       6 event.go:282] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ingress-nginx", Name:"ingress-nginx-controller-67759f896-9bvv5", UID:"69306a33-62bd-4619-9243-5cd19ad99eee", APIVersion:"v1", ResourceVersion:"1883", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
W1128 16:04:50.856834       6 controller.go:950] Service "default/foo-service" does not have any active Endpoint.
W1128 16:04:50.856867       6 controller.go:950] Service "default/bar-service" does not have any active Endpoint.
I1128 16:04:50.921678       6 main.go:112] "successfully validated configuration, accepting" ingress="example-ingress/default"
I1128 16:04:50.929275       6 event.go:282] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"example-ingress", UID:"3c3c4a56-c154-41d5-8fba-020025a7bdd5", APIVersion:"networking.k8s.io/v1beta1", ResourceVersion:"2110", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
W1128 16:04:50.936565       6 controller.go:950] Service "default/foo-service" does not have any active Endpoint.
W1128 16:04:50.936732       6 controller.go:950] Service "default/bar-service" does not have any active Endpoint.
I1128 16:04:50.998927       6 main.go:112] "successfully validated configuration, accepting" ingress="example-ingress/default"
I1128 16:04:51.003376       6 event.go:282] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"example-ingress", UID:"3c3c4a56-c154-41d5-8fba-020025a7bdd5", APIVersion:"networking.k8s.io/v1beta1", ResourceVersion:"2112", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
I1128 16:04:54.117314       6 controller.go:144] "Configuration changes detected, backend reload required"
I1128 16:04:54.245180       6 controller.go:161] "Backend successfully reloaded"
I1128 16:04:54.246042       6 event.go:282] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ingress-nginx", Name:"ingress-nginx-controller-67759f896-9bvv5", UID:"69306a33-62bd-4619-9243-5cd19ad99eee", APIVersion:"v1", ResourceVersion:"1883", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
I1128 16:05:30.639982       6 status.go:290] "updating Ingress status" namespace="default" ingress="example-ingress" currentValue=[] newValue=[{IP:34.83.147.123 Hostname:}]
I1128 16:05:30.648706       6 event.go:282] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"example-ingress", UID:"3c3c4a56-c154-41d5-8fba-020025a7bdd5", APIVersion:"networking.k8s.io/v1beta1", ResourceVersion:"2288", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
200.83.32.243 - - [28/Nov/2020:16:06:24 +0000] "GET /foo HTTP/1.1" 200 4 "-" "curl/7.68.0" 80 0.003 [default-foo-service-5678] [] 10.108.2.7:5678 4 0.002 200 2b71f5cf7f0477962f8e8cc3f2ff086d
200.83.32.243 - - [28/Nov/2020:16:06:28 +0000] "GET /bar HTTP/1.1" 200 4 "-" "curl/7.68.0" 80 0.002 [default-bar-service-5678] [] 10.108.2.8:5678 4 0.002 200 4233f276c73a7ceaf3521ad974c907ef
200.83.32.243 - - [28/Nov/2020:16:08:51 +0000] "GET /bar HTTP/1.1" 200 4 "-" "curl/7.68.0" 80 0.002 [default-bar-service-5678] [] 10.108.2.8:5678 4 0.002 200 aaa1fa13d768101fd39c463645929320
200.83.32.243 - - [28/Nov/2020:16:08:55 +0000] "GET /foo HTTP/1.1" 200 4 "-" "curl/7.68.0" 80 0.002 [default-foo-service-5678] [] 10.108.2.7:5678 4 0.002 200 5b66251e285ec799bb0ae4ad2217c8a4

From the log:

I1128 16:04:50.921678       6 main.go:112] "successfully validated configuration, accepting" ingress="example-ingress/default"

That means the API server reached the validation webhook running in the ingress-nginx pod.
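
The 443 in the webhook URL and the 8443 in the log fit together because the admission Service listens on 443 and forwards to the controller pod's named webhook port. Roughly, paraphrasing the upstream deploy manifest (exact labels and field values vary by version):

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller-admission
  namespace: ingress-nginx
spec:
  ports:
    - name: https-webhook
      port: 443
      targetPort: webhook   # named container port, 8443 on the controller pod
  selector:
    app.kubernetes.io/component: controller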

rkevin-arch commented 3 years ago

Hi! Thanks again for creating and maintaining this project. Whether this is a legit issue or not, passive aggressiveness is never a solution.

I do have the same issue on a baremetal cluster bootstrapped by kubeadm and using Calico as the CNI. There is no firewall between any of the nodes, so they should be able to freely talk to each other. It might be possible that kubeadm's default settings have some firewall rules that cause this issue. UPDATE: I had a different issue altogether. Please ignore.

I found a way to recreate the problem using minikube (which I recognize is experimental with multinode setups, but this might help digging deeper into the issue):

minikube start -n=2
helm install -n default ingress ingress-nginx/ingress-nginx
kubectl apply -f https://kind.sigs.k8s.io/examples/ingress/usage.yaml
rkevin@redshift:~$ minikube start -n=2
😄  minikube v1.15.1 on Arch rolling
✨  Automatically selected the virtualbox driver
👍  Starting control plane node minikube in cluster minikube
🔥  Creating virtualbox VM (CPUs=2, Memory=2200MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.19.4 on Docker 19.03.13 ...
🔎  Verifying Kubernetes components...
🌟  Enabled addons: storage-provisioner, default-storageclass

❗  Multi-node clusters are currently experimental and might exhibit unintended behavior.
📘  To track progress on multi-node clusters, see https://github.com/kubernetes/minikube/issues/7538.

👍  Starting node minikube-m02 in cluster minikube
🔥  Creating virtualbox VM (CPUs=2, Memory=2200MB, Disk=20000MB) ...
🌐  Found network options:
    ▪ NO_PROXY=192.168.99.144
🐳  Preparing Kubernetes v1.19.4 on Docker 19.03.13 ...
    ▪ env NO_PROXY=192.168.99.144
🔎  Verifying Kubernetes components...
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
rkevin@redshift:~$ helm install -n default ingress ingress-nginx/ingress-nginx
NAME: ingress
LAST DEPLOYED: Sat Nov 28 14:12:11 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status by running 'kubectl --namespace default get services -o wide -w ingress-ingress-nginx-controller'

An example Ingress that makes use of the controller:

  apiVersion: networking.k8s.io/v1beta1
  kind: Ingress
  metadata:
    annotations:
      kubernetes.io/ingress.class: nginx
    name: example
    namespace: foo
  spec:
    rules:
      - host: www.example.com
        http:
          paths:
            - backend:
                serviceName: exampleService
                servicePort: 80
              path: /
    # This section is only required if TLS is to be enabled for the Ingress
    tls:
        - hosts:
            - www.example.com
          secretName: example-tls

If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:

  apiVersion: v1
  kind: Secret
  metadata:
    name: example-tls
    namespace: foo
  data:
    tls.crt: <base64 encoded cert>
    tls.key: <base64 encoded key>
  type: kubernetes.io/tls
rkevin@redshift:~$ kubectl apply -f https://kind.sigs.k8s.io/examples/ingress/usage.yaml
pod/foo-app created
service/foo-service created
pod/bar-app created
service/bar-service created
Warning: networking.k8s.io/v1beta1 Ingress is deprecated in v1.19+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
Error from server (InternalError): error when creating "https://kind.sigs.k8s.io/examples/ingress/usage.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post "https://ingress-ingress-nginx-controller-admission.default.svc:443/networking/v1beta1/ingresses?timeout=10s": dial tcp 10.96.214.134:443: connect: connection refused

Interestingly enough, this is not a problem with kind. I got it to work with the following:

kind create cluster --config - <<EOF
> kind: Cluster
> apiVersion: kind.x-k8s.io/v1alpha4
> nodes:
>   - role: control-plane
>   - role: worker
> EOF
helm install -n default ingress ingress-nginx/ingress-nginx
kubectl apply -f https://kind.sigs.k8s.io/examples/ingress/usage.yaml

I can make a vagrant + kubeadm setup and see if that recreates this problem if you want. The baremetal cluster we have is fairly vanilla, so I can't think of a reason for it to fail if firewall rules are the culprit. UPDATE: I had a different issue altogether. Please ignore.

aledbf commented 3 years ago

Error from server (InternalError): error when creating "https://kind.sigs.k8s.io/examples/ingress/usage.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post "https://ingress-ingress-nginx-controller-admission.default.svc:443/networking/v1beta1/ingresses?timeout=10s": dial tcp 10.96.214.134:443: connect: connection refused

Did you execute that command just after the helm install? The command

kubectl wait --namespace ingress-nginx \
    --for=condition=ready pod \
    --selector=app.kubernetes.io/component=controller \
    --timeout=90s

waits for the creation of the SSL certificate used by the validation webhook (this usually takes ~60s). Only after the secret is created can the ingress controller start.
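
A sketch for confirming the certificate is in place before applying Ingresses; with the static deploy manifests the jobs and secret live in the ingress-nginx namespace and the secret is named ingress-nginx-admission, while a Helm release prefixes the names with the release (e.g. ingress-ingress-nginx-admission in default here):

kubectl -n ingress-nginx get jobs
kubectl -n ingress-nginx get secret ingress-nginx-admission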

aledbf commented 3 years ago

@rkevin-arch I can reproduce the minikube issue, but it seems related to the default CNI selected (kindnet)? Please check again using flannel:

minikube start -n=2 --driver=kvm2 --cni=flannel
😄  minikube v1.15.1 on Debian bullseye/sid
✨  Using the kvm2 driver based on user configuration
👍  Starting control plane node minikube in cluster minikube
🔥  Creating kvm2 VM (CPUs=2, Memory=3950MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.19.4 on Docker 19.03.13 ...
🔗  Configuring Flannel (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
🌟  Enabled addons: storage-provisioner, default-storageclass

❗  Multi-node clusters are currently experimental and might exhibit unintended behavior.
📘  To track progress on multi-node clusters, see https://github.com/kubernetes/minikube/issues/7538.

👍  Starting node minikube-m02 in cluster minikube
🔥  Creating kvm2 VM (CPUs=2, Memory=3950MB, Disk=20000MB) ...
🌐  Found network options:
    ▪ NO_PROXY=192.168.39.15
🐳  Preparing Kubernetes v1.19.4 on Docker 19.03.13 ...
    ▪ env NO_PROXY=192.168.39.15
🔎  Verifying Kubernetes components...
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
helm install -n default ingress ingress-nginx/ingress-nginx
NAME: ingress
LAST DEPLOYED: Sat Nov 28 20:49:11 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status by running 'kubectl --namespace default get services -o wide -w ingress-ingress-nginx-controller'

An example Ingress that makes use of the controller:

  apiVersion: networking.k8s.io/v1beta1
  kind: Ingress
  metadata:
    annotations:
      kubernetes.io/ingress.class: nginx
    name: example
    namespace: foo
  spec:
    rules:
      - host: www.example.com
        http:
          paths:
            - backend:
                serviceName: exampleService
                servicePort: 80
              path: /
    # This section is only required if TLS is to be enabled for the Ingress
    tls:
        - hosts:
            - www.example.com
          secretName: example-tls

If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:

  apiVersion: v1
  kind: Secret
  metadata:
    name: example-tls
    namespace: foo
  data:
    tls.crt: <base64 encoded cert>
    tls.key: <base64 encoded key>
  type: kubernetes.io/tls
kubectl apply -f https://kind.sigs.k8s.io/examples/ingress/usage.yaml
pod/foo-app created
service/foo-service created
pod/bar-app created
service/bar-service created
ingress.networking.k8s.io/example-ingress created
kubectl get pods -A -o wide
NAMESPACE     NAME                                                READY   STATUS    RESTARTS   AGE     IP               NODE           NOMINATED NODE   READINESS GATES
default       bar-app                                             1/1     Running   0          2m8s    10.244.1.4       minikube-m02   <none>           <none>
default       foo-app                                             1/1     Running   0          2m8s    10.244.1.5       minikube-m02   <none>           <none>
default       ingress-ingress-nginx-controller-8488fbdf45-czvvx   1/1     Running   0          2m51s   10.244.1.2       minikube-m02   <none>           <none>
kube-system   coredns-f9fd979d6-jrmzj                             1/1     Running   0          3m38s   10.88.0.2        minikube       <none>           <none>
kube-system   etcd-minikube                                       1/1     Running   0          3m46s   192.168.39.15    minikube       <none>           <none>
kube-system   kube-apiserver-minikube                             1/1     Running   0          3m46s   192.168.39.15    minikube       <none>           <none>
kube-system   kube-controller-manager-minikube                    1/1     Running   0          3m46s   192.168.39.15    minikube       <none>           <none>
kube-system   kube-flannel-ds-amd64-fzzrb                         1/1     Running   0          3m10s   192.168.39.237   minikube-m02   <none>           <none>
kube-system   kube-flannel-ds-amd64-jb9fg                         1/1     Running   0          3m38s   192.168.39.15    minikube       <none>           <none>
kube-system   kube-proxy-228k4                                    1/1     Running   0          3m38s   192.168.39.15    minikube       <none>           <none>
kube-system   kube-proxy-jrbz6                                    1/1     Running   0          3m10s   192.168.39.237   minikube-m02   <none>           <none>
kube-system   kube-scheduler-minikube                             1/1     Running   0          3m46s   192.168.39.15    minikube       <none>           <none>
kube-system   storage-provisioner                                 1/1     Running   1          3m52s   192.168.39.15    minikube       <none>           <none>

aledbf commented 3 years ago

@rkevin-arch just in case, I ran the same test with calico:

minikube start -n=2 --driver=kvm2 --cni=calico
😄  minikube v1.15.1 on Debian bullseye/sid
✨  Using the kvm2 driver based on user configuration
👍  Starting control plane node minikube in cluster minikube
🔥  Creating kvm2 VM (CPUs=2, Memory=3950MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.19.4 on Docker 19.03.13 ...
🔗  Configuring Calico (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
🌟  Enabled addons: storage-provisioner, default-storageclass

❗  Multi-node clusters are currently experimental and might exhibit unintended behavior.
📘  To track progress on multi-node clusters, see https://github.com/kubernetes/minikube/issues/7538.

👍  Starting node minikube-m02 in cluster minikube
🔥  Creating kvm2 VM (CPUs=2, Memory=3950MB, Disk=20000MB) ...
🌐  Found network options:
    ▪ NO_PROXY=192.168.39.236
🐳  Preparing Kubernetes v1.19.4 on Docker 19.03.13 ...
    ▪ env NO_PROXY=192.168.39.236
🔎  Verifying Kubernetes components...
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
helm install -n default ingress ingress-nginx/ingress-nginx
NAME: ingress
LAST DEPLOYED: Sat Nov 28 20:56:33 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status by running 'kubectl --namespace default get services -o wide -w ingress-ingress-nginx-controller'

An example Ingress that makes use of the controller:

  apiVersion: networking.k8s.io/v1beta1
  kind: Ingress
  metadata:
    annotations:
      kubernetes.io/ingress.class: nginx
    name: example
    namespace: foo
  spec:
    rules:
      - host: www.example.com
        http:
          paths:
            - backend:
                serviceName: exampleService
                servicePort: 80
              path: /
    # This section is only required if TLS is to be enabled for the Ingress
    tls:
        - hosts:
            - www.example.com
          secretName: example-tls

If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:

  apiVersion: v1
  kind: Secret
  metadata:
    name: example-tls
    namespace: foo
  data:
    tls.crt: <base64 encoded cert>
    tls.key: <base64 encoded key>
  type: kubernetes.io/tls
kubectl apply -f https://kind.sigs.k8s.io/examples/ingress/usage.yaml
pod/foo-app created
service/foo-service created
pod/bar-app created
service/bar-service created
ingress.networking.k8s.io/example-ingress created
rkevin-arch commented 3 years ago

Hmm, can confirm minikube start -n=2 --cni=calico works. I'll take a look at using vagrant + kubeadm to spawn a cluster with Calico and see if I can replicate the issue.

rkevin-arch commented 3 years ago

Whoops, I didn't realize the issue I have was completely unrelated to this one this entire time, even if deleting the ValidatingWebhookConfiguration does solve my issue. Sorry about that. Feel free to mark the stuff I said as off-topic.

(The issue I had was Error: admission webhook "validate.nginx.ingress.kubernetes.io" denied the request: rejecting admission review because the request does not contains an Ingress resource but networking.k8s.io/v1, Resource=ingresses with name jupyterhub in namespace staging-jhub. I'll dig elsewhere for a more permanent solution.)

aledbf commented 3 years ago

Error: admission webhook "validate.nginx.ingress.kubernetes.io" denied the request: rejecting admission review because the request does not contains an Ingress resource but networking.k8s.io/v1, Resource=ingresses with name jupyterhub in namespace staging-jhub

@rkevin-arch please make sure you are using the latest version, v0.41.2. There was a regression that denied validation of networking.k8s.io/v1 Ingresses.
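
A quick sketch for checking which controller image is actually deployed (namespace and deployment name assume the standard manifests; adjust for a Helm release):

kubectl -n ingress-nginx get deploy ingress-nginx-controller \
  -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'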

vroad commented 3 years ago

I run ingress-nginx on my DIY cluster and started seeing the issue after upgrading to the latest version (3.12.0). My cluster is based on typhoon but has many modifications.

I'll try running the latest ingress version with the latest typhoon (1.19.4).

typhoon has an nginx ingress addon, which can be installed with kubectl. I wonder whether the issue is reproducible with it or not.

vroad commented 3 years ago

I don't know what the rules.apiVersions value in the webhook YAML is for, but prometheus-operator uses '*' for rules.apiVersions, while ingress-nginx only has 'v1beta1'. So, is this just an issue with the YAML definition?

I installed kube-prometheus-stack in the cluster (which also uses admission webhooks) as well, and I don't have issues with it.

https://github.com/prometheus-community/helm-charts/blob/kube-prometheus-stack-12.3.0/charts/kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/validatingWebhookConfiguration.yaml#L20 https://github.com/kubernetes/ingress-nginx/blob/ingress-nginx-3.12.0/charts/ingress-nginx/templates/admission-webhooks/validating-webhook.yaml#L21

https://stackoverflow.com/questions/61616203/nginx-ingress-controller-failed-calling-webhook/62713105#62713105

That answer says the issue is caused by an older API version, but it seems the apiVersion has been updated to one without beta recently: "apiVersion: admissionregistration.k8s.io/v1". I'm on k8s 1.19.4, which is the newest released version, I believe.
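
For context, rules.apiVersions in a ValidatingWebhookConfiguration restricts which API versions of a resource get sent to the webhook at all. Roughly what the linked stanza looks like (paraphrased; the exact apiGroups/apiVersions depend on the chart version):

webhooks:
  - name: validate.nginx.ingress.kubernetes.io
    rules:
      - apiGroups:
          - networking.k8s.io
        apiVersions:
          - v1beta1          # prometheus-operator uses "*" here instead
        operations:
          - CREATE
          - UPDATE
        resources:
          - ingresses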

Peter-Steffek commented 3 years ago

@aduncmj I found this solution https://stackoverflow.com/questions/61365202/nginx-ingress-service-ingress-nginx-controller-admission-not-found

kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission

Killing the webhook does not solve the problem - you need the webhook for a working cluster.

dotbalo commented 3 years ago

First of all, you should not delete the ValidatingWebhookConfiguration; that is very, very much not recommended. Secondly, you need to update to the latest version of ingress and make sure that your ingress controller is not deployed on the k8s master. Finally, make sure that the ingress controller is in the Running state. Then you will not see any errors.

kazysgurskas commented 3 years ago

First of all, you should not delete the ValidatingWebhookConfiguration; that is very, very much not recommended. Secondly, you need to update to the latest version of ingress and make sure that your ingress controller is not deployed on the k8s master. Finally, make sure that the ingress controller is in the Running state. Then you will not see any errors.

That's not entirely true. It has been posted several times before that it's a networking issue, at least for the context deadline exceeded error (which is the original error posted in this issue). The first reply correctly addresses this, and the docs are explicit about it: https://kubernetes.github.io/ingress-nginx/deploy/:

For private clusters, you will need to either add an additional firewall rule that allows master nodes access to port 8443/tcp on worker nodes, or change the existing rule that allows access to ports 80/tcp, 443/tcp and 10254/tcp to also allow access to port 8443/tcp.

vroad commented 3 years ago

It turned out that my Helm chart values were incorrect. I had set hostNetwork: true, which effectively disables access to the admission webhook.

To be able to use admission webhooks with hostNetwork: true, you would need to open port 8443 on the node as well, I guess, but I don't think that's a good idea.

If what you need is just exposing ports 80 and 443 (but not 8443), you can use port mapping instead of hostNetwork. This way the admission webhook remains accessible only inside the cluster, which is better than exposing port 8443 on the node.

controller:
  hostPort:
    enabled: true
  kind: Deployment
  publishService:
    enabled: false
  replicaCount: 1
  service:
    type: ClusterIP
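
A usage sketch, assuming the values above are saved as values.yaml and reusing the release name and namespace from earlier in the thread:

helm upgrade --install ingress ingress-nginx/ingress-nginx -n default -f values.yaml
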
rzuku commented 3 years ago

@vroad you do not have to open port 8443, but you still need to redirect that port to a node port, like for other services. Right? ` ports:

rzuku commented 3 years ago

Anyway, on GCP the solution requires opening port 8443 from the master to the nodes, so it is not opened to the external world.

vroad commented 3 years ago

@rzuku I use calico on my DIY cluster, which is based on typhoon, and runs on AWS.

I don't know anything about GCP, but I don't have to configure security groups (or a NodePort) for port 8443 after disabling hostNetwork, because Calico handles the connection to the pod, and I'm not using things like Calico network policies for now.

rzuku commented 3 years ago

I have looked at my AWS EKS cluster, and the outcome is: barring other issues, a properly defined security group allowing 8443 from the control plane to the nodes should help.
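
A sketch with the AWS CLI; the security group IDs are placeholders you would look up in the EKS console or with aws eks describe-cluster:

aws ec2 authorize-security-group-ingress \
  --group-id <worker-node-sg-id> \
  --protocol tcp \
  --port 8443 \
  --source-group <control-plane-sg-id>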

renanrider commented 3 years ago

@aduncmj I found this solution https://stackoverflow.com/questions/61365202/nginx-ingress-service-ingress-nginx-controller-admission-not-found

kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission

This worked for me after hours of searching, thanks!

vroad commented 3 years ago

@renanrider As others have already pointed out, you should resolve the network issues rather than disabling webhooks. Disabling the admission webhook is a bad idea.

shinebayar-g commented 3 years ago

Hmm. I just hit this issue today and have been scratching my head for hours. I'm on an AWS EKS 1.18 cluster (not using the default AWS VPC CNI, using Cilium) and deployed the latest version 0.41.2 (as of today) using this yaml, following this documentation.

Values I changed are:

service.beta.kubernetes.io/aws-load-balancer-ssl-cert: my-aws-cert-arn
proxy-real-ip-cidr: my-vpc-cidr

Oh, I also changed service.beta.kubernetes.io/aws-load-balancer-type: elb to nlb, because the service.beta.kubernetes.io/aws-load-balancer-type annotation only supports nlb as a value, yet the default YAML had the value elb, which creates a Classic Load Balancer instead of a Network Load Balancer.

After successfully deploying ingress-nginx-controller, I just can't create any Ingress resource. Example configuration:

kind: Ingress
apiVersion: networking.k8s.io/v1beta1
metadata:
  name: test-nginx
  namespace: default
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
    - host: mydomain.com
      http:
        paths:
          - path: /
            pathType: ImplementationSpecific
            backend:
              serviceName: test-nginx
              servicePort: 80

I'm always getting the error:

Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s: context deadline exceeded

As everyone suggested, I've double-checked the security group configs; by default I already have an allow-all-traffic configuration within the EKS cluster.

(screenshot of the security group configuration)

What am I missing?

amlinux commented 3 years ago

Disclaimer: I'm not an AWS expert, so I can't say anything specific. From a k8s standpoint, the Kubernetes API server may not be part of the cluster itself, i.e. the machine where it runs may not be managed by Kubernetes. If you are using managed Kubernetes, it's most likely running outside. You need to figure out where it runs and allow traffic from that machine to the pods.

shinebayar-g commented 3 years ago

By default, EKS attaches a default security group to the master and all nodes, which allows all traffic between them. :(

amlinux commented 3 years ago

You need to check how the network is set up in EKS. There are differences between how traffic is routed to the cluster machines and how it is routed to the pods/services (such as ingress-nginx-controller-admission.ingress-nginx.svc:443) that run on top of them. Addresses for pods and services are allocated separately from the machines' IP addresses and need to be routed separately.

The master node needs to know how to route packets to service IPs and needs the relevant permissions on the firewall. If the service->pod address translation is performed on the master node itself, then the firewall needs to permit traffic from the master node to the pod IP ranges. There are many ways networking can be set up in Kubernetes, and unfortunately I know nothing about EKS.

shinebayar-g commented 3 years ago

Dang, that makes sense; hope some AWS expert notices this issue. By default EKS pods run on the same subnet as the nodes, which makes them routable within the VPC. But I'm using the Cilium CNI plugin and pods now have a 10.0.0.0/8 IP range. Maybe this could be causing some mess. Maybe not.

Is the admission webhook process running on port 8443 of the node or of the pod?

rzuku commented 3 years ago

I do not have the opportunity to check it on EKS right now; however, as mentioned before, I think your security group should allow traffic from the control plane to 8443 on the worker nodes, which I actually cannot see in your post.

revilwang commented 3 years ago

Dang, that makes sense; hope some AWS expert notices this issue. By default EKS pods run on the same subnet as the nodes, which makes them routable within the VPC. But I'm using the Cilium CNI plugin and pods now have a 10.0.0.0/8 IP range. Maybe this could be causing some mess. Maybe not.

Is the admission webhook process running on port 8443 of the node or of the pod?

I think we're encountering a similar issue

rzuku commented 3 years ago

Is the admission webhook process running on port 8443 of the node or of the pod?

When you look at its definition, it shows 443. However, it started to work when I opened 8443.
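
A sketch of how to see both sides of that mapping, assuming the standard namespace and object names: the Service exposes 443 and targets the named container port webhook, which is 8443 on the controller pod.

kubectl -n ingress-nginx get svc ingress-nginx-controller-admission \
  -o jsonpath='{.spec.ports[0].port} -> {.spec.ports[0].targetPort}{"\n"}'
kubectl -n ingress-nginx get deploy ingress-nginx-controller \
  -o jsonpath='{.spec.template.spec.containers[0].ports}{"\n"}'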

shinebayar-g commented 3 years ago

@revilwang I think you're right. When you google "EKS CNI admission webhook" there are tons of results. cert-manager.io even has an FAQ for this.

revilwang commented 3 years ago

@shinebayar-g Thanks for your work. The solution looks a little complicated, and with two CNI plugins running at the same time, I don't know, maybe with a bit of luck other issues are waiting 😢. But yeah, many similar network issues in EKS are caused by third-party CNI plugins; e.g., metrics-server cannot work either if the pod's IP is allocated by a third-party CNI plugin.

mau21mau commented 3 years ago

844

How did you do that? I'm running a private cluster on GCP and facing the same issue.

cmluciano commented 3 years ago

@mau21mau

Have you followed the instructions here? https://kubernetes.github.io/ingress-nginx/deploy/#gce-gke

cmluciano commented 3 years ago

/triage needs-information

sharkymcdongles commented 3 years ago

I encounter this in a non-private GKE cluster on chart ingress-nginx-3.11.0, app version 0.41.2:

I0120 15:04:05.459036       1 trace.go:116] Trace[928871849]: "Create" url:/apis/extensions/v1beta1/namespaces/x/ingresses (started: 2021-01-20 15:03:57.693499032 +0000 UTC m=+186354.183959444) (total time: 7.765514825s):
Trace[928871849]: [7.765514825s] [7.765009815s] END
I0120 15:04:05.459392       1 httplog.go:90] POST /apis/extensions/v1beta1/namespaces/x/ingresses?pretty=false: (7.766909845s) 500
goroutine 121127933 [running]:
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).recordStatus(0xc087fc8770, 0x1f4)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog/httplog.go:217 +0xc8
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).WriteHeader(0xc087fc8770, 0x1f4)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog/httplog.go:196 +0x35
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*baseTimeoutWriter).WriteHeader(0xc0238264e0, 0x1f4)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:228 +0xb2
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.(*auditResponseWriter).WriteHeader(0xc141ce7590, 0x1f4)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/audit.go:219 +0x63
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/metrics.(*ResponseWriterDelegator).WriteHeader(0xc09f8e4540, 0x1f4)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/metrics/metrics.go:504 +0x45
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters.(*deferredResponseWriter).Write(0xc0bfb93f40, 0xc11ae9a000, 0x21d, 0xb54da, 0x0, 0x0, 0x1)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go:202 +0x1e6
encoding/json.(*Encoder).Encode(0xc055017848, 0x45c8e40, 0xc09a7db7c0, 0x6, 0x0)
        /usr/local/go/src/encoding/json/stream.go:227 +0x1ca
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/json.(*Serializer).Encode(0xc0003fb540, 0x51a9d00, 0xc09a7db7c0, 0x51933c0, 0xc0bfb93f40, 0x37eec49, 0x6)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/json/json.go:331 +0x2e4
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/versioning.(*codec).Encode(0xc09f52e120, 0x51a9d00, 0xc09a7db7c0, 0x51933c0, 0xc0bfb93f40, 0x0, 0x0)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/versioning/versioning.go:215 +0x323
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters.SerializeObject(0x4679bfb, 0x10, 0x7f8a5c1b3f68, 0xc09f52e120, 0x51fb1a0, 0xc26577e850, 0xc0dc848e00, 0x1f4, 0x51a9d00, 0xc09a7db7c0)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go:96 +0x127
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters.WriteObjectNegotiated(0x51fe1a0, 0xc0121deae0, 0x51fe4e0, 0x785f2f0, 0x466982a, 0xa, 0x46643f7, 0x7, 0x51fb1a0, 0xc26577e850, ...)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go:251 +0x555
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters.ErrorNegotiated(0x5192ca0, 0xc09a7db680, 0x51fe1a0, 0xc0121deae0, 0x466982a, 0xa, 0x46643f7, 0x7, 0x51fb1a0, 0xc26577e850, ...)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go:270 +0x167
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers.(*RequestScope).err(...)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/rest.go:85
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers.createHandler.func1(0x51fb1a0, 0xc26577e850, 0xc0dc848e00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/create.go:170 +0x1b4e
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints.restfulCreateResource.func1(0xc09f8e44b0, 0xc087fc87e0)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/installer.go:1095 +0xe4
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/metrics.InstrumentRouteFunc.func1(0xc09f8e44b0, 0xc087fc87e0)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/metrics/metrics.go:372 +0x254
k8s.io/kubernetes/vendor/github.com/emicklei/go-restful.(*Container).dispatch(0xc000bd2090, 0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/emicklei/go-restful/container.go:288 +0xa4f
k8s.io/kubernetes/vendor/github.com/emicklei/go-restful.(*Container).Dispatch(...)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/emicklei/go-restful/container.go:199
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server.director.ServeHTTP(0x46739c9, 0xe, 0xc000bd2090, 0xc0006d6000, 0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/handler.go:146 +0x4d3
k8s.io/kubernetes/vendor/k8s.io/kube-aggregator/pkg/apiserver.(*proxyHandler).ServeHTTP(0xc008163730, 0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kube-aggregator/pkg/apiserver/handler_proxy.go:118 +0x161
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/mux.(*pathHandler).ServeHTTP(0xc089d1c880, 0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:248 +0x38a
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).ServeHTTP(0xc00c05d030, 0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:234 +0x84
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server.director.ServeHTTP(0x4676c64, 0xf, 0xc00c022fc0, 0xc00c05d030, 0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/handler.go:154 +0x6b1
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithAuthorization.func1(0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/authorization.go:64 +0x4f8
net/http.HandlerFunc.ServeHTTP(0xc00c049400, 0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /usr/local/go/src/net/http/server.go:2036 +0x44
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.WithMaxInFlightLimit.func1(0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/maxinflight.go:160 +0x5dc
net/http.HandlerFunc.ServeHTTP(0xc00c081920, 0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /usr/local/go/src/net/http/server.go:2036 +0x44
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithImpersonation.func1(0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/impersonation.go:50 +0x1fc9
net/http.HandlerFunc.ServeHTTP(0xc00c049440, 0x51fb0a0, 0xc26577e830, 0xc0dc848e00)
        /usr/local/go/src/net/http/server.go:2036 +0x44
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithAudit.func1(0x7f8a2c0e2e88, 0xc26577e828, 0xc0dc848d00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/audit.go:110 +0x4b9
net/http.HandlerFunc.ServeHTTP(0xc00c049480, 0x7f8a2c0e2e88, 0xc26577e828, 0xc0dc848d00)
        /usr/local/go/src/net/http/server.go:2036 +0x44
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithAuthentication.func1(0x7f8a2c0e2e88, 0xc26577e828, 0xc0dc848b00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/authentication.go:110 +0x6c3
net/http.HandlerFunc.ServeHTTP(0xc00c050fa0, 0x7f8a2c0e2e88, 0xc26577e828, 0xc0dc848b00)
        /usr/local/go/src/net/http/server.go:2036 +0x44
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP.func1(0xc13aa78660, 0xc00c092220, 0x5207fe0, 0xc26577e828, 0xc0dc848b00)
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:113 +0xd0
created by k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP
        /workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:99 +0x1cb

logging error output: "{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"Internal error occurred: failed calling webhook \\\"validate.nginx.ingress.kubernetes.io\\\": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s: EOF\",\"reason\":\"InternalError\",\"details\":{\"causes\":[{\"message\":\"failed calling webhook \\\"validate.nginx.ingress.kubernetes.io\\\": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s: EOF\"}]},\"code\":500}\n"
 [Swagger-Codegen/1.0-SNAPSHOT/java censoredip:48622]

The masters and VMs have full, unimpeded connectivity, yet I still see this for some reason. Is something in the call crashing the webhook before it can respond? Perhaps the request is causing some sort of race condition or something, I don't know. Seems a bit strange.

cmluciano commented 3 years ago

Can you please post more details about the environment? Things like the CNI, k8s version, etc.

sharkymcdongles commented 3 years ago

@cmluciano

It is using the Calico CNI on GKE, deployed and maintained by them via the advanced networking toggle.

Kubernetes Version is v1.16.9-gke.6

My other webhooks, e.g. cert-manager and sparkoperator, are fine. The nginx one also seems to work most of the time, but sometimes it fails with the above stack trace. Is it possible the way the request is made causes this issue? The endpoint we have deletes an Ingress and then recreates it afterward to align it with updates to our application ingress template.


hydrapolic commented 3 years ago

Got this in GCP on our testing environment (chart 3.23.0 / image 0.44.0 / k8s 1.17.14-gke.1600):

Error: Failed to update Ingress default/service because: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-cx-nginx-1-admission.default.svc:443/networking/v1beta1/ingresses?timeout=10s: x509: certificate is valid for ingress-nginx-controller-admission, ingress-nginx-controller-admission.default.svc, not ingress-nginx-cx-nginx-1-admission.default.svc

Our production runs on chart 3.13.0 / image 0.41.2 where this cannot be reproduced.

As a workaround:

kubectl get validatingwebhookconfigurations
kubectl delete validatingwebhookconfigurations ...
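
Before deleting anything, it can help to see which admission Service each webhook configuration actually points at, so only a stale one (left over from an old release name, as in the x509 error above) is removed; a sketch:

kubectl get validatingwebhookconfigurations \
  -o custom-columns='NAME:.metadata.name,NAMESPACE:.webhooks[0].clientConfig.service.namespace,SERVICE:.webhooks[0].clientConfig.service.name'
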
sharkymcdongles commented 3 years ago

I noticed that if at any stage it cannot connect, it won't retry. I think this may be the problem. Perhaps some sort of retry could be put in place to make the call a few times if the first attempt fails.

sharkymcdongles commented 3 years ago

I did more digging and it seems the problem is due to the number of Ingresses we have. We have 219, so I think when it validates, it checks the existing ones as well, causing it to fail intermittently when it cannot check all objects, and it has no built-in retries on failure.
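
If slow validation of a large number of Ingresses is the suspect, a less destructive mitigation than deleting the webhook is to raise its timeout (the API caps it at 30s); a sketch, assuming the webhook configuration is named ingress-nginx-admission as in the standard manifests:

kubectl patch validatingwebhookconfiguration ingress-nginx-admission \
  --type=json \
  -p='[{"op": "replace", "path": "/webhooks/0/timeoutSeconds", "value": 30}]'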