jupyterhub / zero-to-jupyterhub-k8s

Helm Chart & Documentation for deploying JupyterHub on Kubernetes
https://zero-to-jupyterhub.readthedocs.io

Regression for `singleuser.cloudMetadata.blockWithIptables` in z2jh 3.3.0 and 3.3.1 - workaround in 3.3.2 #3368

Open consideRatio opened 3 months ago

consideRatio commented 3 months ago

Current state

In 3.3.2 we now pin to alpine:3.18 in our network-tools image, which is used by an init container in user pods to block traffic to the cloud metadata server. In alpine:3.19, iptables works in "nf_tables" mode rather than "legacy" mode, and that has been seen to cause failures at least on GKE 1.27 nodes.

For now, we rely on pinning alpine to the old version, but ideally, if we can, we should do the same thing with modern dependencies and "nf_tables" instead.

Initial investigation leading to workaround

### `iptables` binary version updated

```shell
docker run -it --rm quay.io/jupyterhub/k8s-network-tools:3.2.1 iptables --version
iptables v1.8.9 (legacy)

docker run -it --rm quay.io/jupyterhub/k8s-network-tools:3.3.1 iptables --version
iptables v1.8.10 (nf_tables)
```

### Error logs with `iptables v1.8.10` (nf_tables)

```
Warning: Extension tcp revision 0 not supported, missing kernel module?
iptables v1.8.10 (nf_tables): RULE_APPEND failed (No such file or directory): rule in chain OUTPUT
```

### Dockerfile

https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/9e5dec64f65369fd9fbbb084b748f220f9e75ead/images/network-tools/Dockerfile#L1-L5

### Image command

https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/9e5dec64f65369fd9fbbb084b748f220f9e75ead/jupyterhub/files/hub/jupyterhub_config.py#L433-L445

## Analysis

- The error appeared or disappeared on the same k8s node depending on whether the image with the new or the old iptables was used.
- `iptables --help` emits the same thing in both versions, except for a single `(legacy)` / `(nf_tables)` difference.
- `iptables` installed in alpine 3.18 is (legacy), while it becomes (nf_tables) in alpine 3.19:

  ```
  docker run --rm iptables:3.18 iptables --version
  iptables v1.8.9 (legacy)

  docker run --rm iptables:3.19 iptables --version
  iptables v1.8.10 (nf_tables)
  ```

I figure the quick fix is to pin alpine to 3.18, which leaves us with a transition issue that we don't have to rush.
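For anyone wanting to check which backend a given image ships, the `(legacy)` / `(nf_tables)` suffix in the `iptables --version` output can be classified with a few lines of shell. This is only a rough sketch: `iptables_backend` is a hypothetical helper name, not something in the z2jh repo, and it assumes the version string keeps the format shown in the outputs above.

```shell
# Hypothetical helper (not part of z2jh): classify the iptables backend
# from a version string such as the output of `iptables --version`.
iptables_backend() {
  case "$1" in
    *"(nf_tables)"*) echo "nf_tables" ;;
    *"(legacy)"*)    echo "legacy" ;;
    *)               echo "unknown" ;;
  esac
}

# Usage with the version strings quoted in the investigation:
iptables_backend "iptables v1.8.9 (legacy)"
iptables_backend "iptables v1.8.10 (nf_tables)"
```

Inside a container this could be called as `iptables_backend "$(iptables --version)"`, for example to log the backend before the init container tries to append its rule.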
consideRatio commented 3 months ago

What is the long-term fix?

I'm not sure.