nginxinc / kubernetes-ingress

NGINX and NGINX Plus Ingress Controllers for Kubernetes
https://docs.nginx.com/nginx-ingress-controller
Apache License 2.0

Upgrade to version 2.2.0 or higher causes "Address family not supported by protocol" #2970

Closed: jppitout closed this issue 2 years ago

jppitout commented 2 years ago

Describe the bug: On Kubernetes clusters with IPv6 disabled, an upgrade to version 2.2.0 or higher causes: socket() [::]:80 failed (97: Address family not supported by protocol)

To Reproduce Steps to reproduce the behavior:

  1. Disable IPv6 on Kubernetes cluster
  2. Upgrade kubernetes-ingress to v2.2.0
  3. Pods go into CrashLoopBackOff
  4. Check pod logs: 2022/08/25 12:58:31 [emerg] 12#12: socket() [::]:80 failed (97: Address family not supported by protocol)
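The failure in step 4 can be checked outside of NGINX. The following is a minimal sketch, not part of the ingress controller: it tries to open and bind an IPv6 wildcard socket, which is roughly what NGINX does for a `listen [::]:80;` directive. The function name `ipv6_listen_error` is hypothetical, and the default port 0 (any free port) is an assumption chosen to avoid needing root privileges.

```python
import errno
import socket
from typing import Optional


def ipv6_listen_error(port: int = 0) -> Optional[str]:
    """Try to create and bind an IPv6 wildcard socket.

    Returns None on success, otherwise the symbolic errno name
    (for example "EAFNOSUPPORT" when the kernel has no IPv6 support).
    """
    try:
        # socket() is the call that fails in the reported log line.
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as sock:
            sock.bind(("::", port))
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


if __name__ == "__main__":
    result = ipv6_listen_error()
    if result is None:
        print("IPv6 listener OK")
    else:
        print(f"IPv6 listener failed: {result}")
```

Run inside an affected pod or directly on the node, a result of `EAFNOSUPPORT` would match the "Address family not supported by protocol" message above.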

Expected behavior: A way to cater for Kubernetes clusters without IPv6, either in the ConfigMap or through server-snippets. We currently cannot find a way to disable NGINX's IPv6 listeners.

Environment

github-actions[bot] commented 2 years ago

Hi @jppitout thanks for reporting!

Be sure to check out the docs while you wait for a human to take a look at this :slightly_smiling_face:

Cheers!

brianehlert commented 2 years ago

Beginning with the 2.2 release, a default IPv6 listener is created for nginx. This was to align with K8s making IPv6 a first-class citizen and giving administrators controls to define IPv4 address pools, IPv6 address pools, or both.

Whether or not a pod receives an IPv6 address is fully exposed through the K8s cluster configuration, and a pod does not receive an IPv6 address unless an IPv6 address pool is defined. In addition, you have to go back quite a few years, into unsupported distros, to find one without an IPv6 stack present. For those reasons we considered it safe to default to defining an IPv6 listener.

If I am understanding your scenario, you are not blocking IPv6 address assignment through the normal networking configuration files, but rather through some method that blocks the IPv6 stack from loading at all at the kernel level of the K8s node. Since pods share the kernel of the node they run on, this has the downstream impact of blocking the pods from loading their IPv6 stack. So when nginx attempts to start, the IPv6 stack is not present in the pod and the binding for the listener cannot be made.

@jppitout, Can you help me understand two things?

  1. How have you disabled IPv6 on your Nodes (in the image you use for your nodes)?
  2. Why have you disabled it the way that you did?

That said, and those questions asked, we will be introducing a way to give you more control over the listeners in an upcoming release, so that you can fully disable the IPv6 listener if you need to.

jppitout commented 2 years ago

@brianehlert thank you for your response.

Some prerequisite knowledge: VMware TKGi's implementation of Kubernetes is deployed using BOSH. The base OS images of the Kubernetes nodes are known as BOSH stemcells and are basically prepackaged OS images.

  1. IPv6 support on BOSH stemcells is baked into the kernel but disabled via a GRUB bootloader option, a module blacklist, and a sysctl setting.
  2. This is specific to VMware TKGi. It has been disabled upstream and we cannot easily enable it nor do we have reason to at this time.
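To see which of these mechanisms is active on a given node, a quick inspection of `/proc` can help. The sketch below is only a heuristic and makes several assumptions not stated in this thread: that the GRUB option is the standard Linux kernel parameter `ipv6.disable=1`, that the sysctl is `net.ipv6.conf.all.disable_ipv6`, and that a missing `/proc/net/if_inet6` indicates the IPv6 stack is not loaded. The function name `ipv6_status` and the `root` parameter (used to point at a test directory) are hypothetical.

```python
from pathlib import Path


def ipv6_status(root: str = "/") -> dict:
    """Collect hints about how IPv6 is disabled on a Linux host."""
    base = Path(root)
    cmdline = base / "proc/cmdline"
    sysctl = base / "proc/sys/net/ipv6/conf/all/disable_ipv6"

    cmdline_text = cmdline.read_text() if cmdline.is_file() else ""
    return {
        # Assumed boot parameter; the stemcell may use a different option.
        "boot_param_disables_ipv6": "ipv6.disable=1" in cmdline_text.split(),
        # This file is normally present when the IPv6 stack is loaded.
        "ipv6_stack_present": (base / "proc/net/if_inet6").exists(),
        # None means the sysctl tree itself is missing.
        "sysctl_disable_ipv6": (
            sysctl.read_text().strip() == "1" if sysctl.is_file() else None
        ),
    }


if __name__ == "__main__":
    for key, value in ipv6_status().items():
        print(f"{key}: {value}")
```

A node where `ipv6_stack_present` is false is the case in which creating an IPv6 socket fails outright, as opposed to a node that merely has no IPv6 addresses assigned.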

For now we'll stick to version 2.1.2 of the NGINX Ingress Controller. Thanks for clarifying!

brianehlert commented 2 years ago

@jppitout Thank you very much for that most excellent explanation. I didn't realize that. We have Tanzu customers that have not reported this. So I obviously need to learn more about the options and limitations there.

jppitout commented 2 years ago

@brianehlert No problem. I suspect that TKG, which uses Cluster API, will not have this problem; only TKGi, which uses BOSH, will.

jppitout commented 2 years ago

Hi @brianehlert

Any feedback on this? We'd like to upgrade to 2.4.0 for other bug fixes but cannot due to this bug.

UPDATE: Apologies, I see there was an argument added in 2.4.0: https://github.com/nginxinc/kubernetes-ingress/pull/3040

brianehlert commented 2 years ago

New command line option added in 2.4

jppitout commented 2 years ago

Hi @brianehlert,

With version 2.4.0 we're still seeing the error below in the logs, even with the -disable-ipv6=true argument set, and requests to the upstreams are returning 404s: 2022/10/11 07:04:53 [emerg] 12#12: socket() [::]:80 failed (97: Address family not supported by protocol)

On the pods themselves we're still seeing IPv6 listeners configured:

nginx@nginx-ingress-57c678c956-9bd2k:/etc/nginx$ grep -ir 'listen \[::\]:80;' *
conf.d/default-dex-k8s-authenticator.conf:      listen [::]:80;
conf.d/default-dex.conf:        listen [::]:80;
conf.d/vault-vault.conf:        listen [::]:80;
...
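The same check as the grep above can be scripted so that it reports a count per file and covers any port, not only 80. This is a minimal sketch: the function name `find_ipv6_listeners` is hypothetical, and it only scans `*.conf` files under the given directory, unlike the grep, which searches every file.

```python
import re
from pathlib import Path

# Matches directives such as "listen [::]:80;" or "listen [::]:443 ssl;".
IPV6_LISTEN = re.compile(r"^\s*listen\s+\[[0-9a-fA-F:]*\]:\d+", re.MULTILINE)


def find_ipv6_listeners(conf_dir: str) -> dict:
    """Map each *.conf file under conf_dir to its number of IPv6 listen directives."""
    hits = {}
    for path in sorted(Path(conf_dir).rglob("*.conf")):
        count = len(IPV6_LISTEN.findall(path.read_text(errors="replace")))
        if count:
            hits[str(path.relative_to(conf_dir))] = count
    return hits


if __name__ == "__main__":
    for name, count in find_ipv6_listeners("/etc/nginx").items():
        print(f"{name}: {count} IPv6 listen directive(s)")
```

An empty result after a configuration reload would suggest the generated configuration no longer contains IPv6 listeners. A non-empty result, as in the output above, means the generated files still contain them.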

Any ideas what could be causing this?