jupyterhub / zero-to-jupyterhub-k8s

Helm Chart & Documentation for deploying JupyterHub on Kubernetes
https://zero-to-jupyterhub.readthedocs.io

`hub` pod unable to establish connections to k8s api-server etc on port 6443 with Cilium #3202

Open Ph0tonic opened 1 year ago

Ph0tonic commented 1 year ago

Bug description

The default KubeSpawner is unable to spawn any user pod; it fails with a TimeoutError while attempting to create the PVC.

Expected behaviour

The hub should be able to spawn user pods.

Analysis

After some research, I identified that my problem was linked to the hub's network policy egress config. Here are a few Cilium logs of dropped packets:

```
xx drop (Policy denied) flow 0xdd60fbdd to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:53354 -> 148.187.17.13:6443 tcp ACK
xx drop (Policy denied) flow 0xf15f4f3e to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:53354 -> 148.187.17.13:6443 tcp ACK
xx drop (Policy denied) flow 0x409d160a to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:38278 -> 148.187.17.16:6443 tcp ACK
xx drop (Policy denied) flow 0x9f34c210 to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:38278 -> 148.187.17.16:6443 tcp ACK
xx drop (Policy denied) flow 0x2a3106d0 to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:53354 -> 148.187.17.13:6443 tcp ACK
xx drop (Policy denied) flow 0xecb1ac78 to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:53354 -> 148.187.17.13:6443 tcp ACK
xx drop (Policy denied) flow 0x46c3f486 to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:57970 -> 148.187.17.16:6443 tcp SYN
xx drop (Policy denied) flow 0x4cf3b758 to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:57970 -> 148.187.17.16:6443 tcp SYN
xx drop (Policy denied) flow 0xc3f5697 to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:53354 -> 148.187.17.13:6443 tcp ACK
xx drop (Policy denied) flow 0x7dd8720c to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:53354 -> 148.187.17.13:6443 tcp ACK
xx drop (Policy denied) flow 0x73b2516d to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:57978 -> 148.187.17.17:6443 tcp SYN
xx drop (Policy denied) flow 0xdad4447 to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:57978 -> 148.187.17.17:6443 tcp SYN
xx drop (Policy denied) flow 0x86e0bb3d to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:57980 -> 148.187.17.13:6443 tcp SYN
xx drop (Policy denied) flow 0xe8379e6 to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:38278 -> 148.187.17.16:6443 tcp ACK
xx drop (Policy denied) flow 0x1452fdbf to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:57980 -> 148.187.17.13:6443 tcp SYN
xx drop (Policy denied) flow 0x333532eb to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:38278 -> 148.187.17.16:6443 tcp ACK
xx drop (Policy denied) flow 0x6506036d to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:41318 -> 148.187.17.17:6443 tcp SYN
xx drop (Policy denied) flow 0xaafea84e to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:41318 -> 148.187.17.17:6443 tcp SYN
xx drop (Policy denied) flow 0x1637f8ef to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:41332 -> 148.187.17.17:6443 tcp SYN
xx drop (Policy denied) flow 0x7f75fa32 to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:53354 -> 148.187.17.13:6443 tcp ACK
xx drop (Policy denied) flow 0x20c2f3a8 to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:41332 -> 148.187.17.17:6443 tcp SYN
xx drop (Policy denied) flow 0x2339ea83 to endpoint 0, file bpf_lxc.c line 1181, , identity 36262->kube-apiserver: 10.42.5.108:53354 -> 148.187.17.13:6443 tcp ACK
```

I also determined that the destination addresses belong to the kube-apiserver, kube-proxy, and kube-controller-manager.

The problem lay in the egress rules, not the ingress rules, and I managed to find a fix:

```yaml
hub:
  networkPolicy:
    egress:
      - ports:
          - port: 6443
```

The issue is that the hub tries to reach the kube-apiserver to create the PVC, but the request is blocked by the egress configuration.

I am surprised that @vizeit did not have this issue in #3167.

Your personal set up

I am using the latest version of this Helm chart (v3.0.0) with Cilium.

consideRatio commented 1 year ago

Did you run into this in a GKE based cluster using Cilium via GCP's dataplane v2, or was this a cluster setup in another way?

Ph0tonic commented 1 year ago

Ok, I do not think it is a GKE-based cluster, sorry. I am not very familiar with the cluster setup, but what I found is that the runtime engine is containerd://1.6.15-k3s1 and Cilium is configured.

consideRatio commented 1 year ago

Ah, it's a k3s-based cluster. Then I think the main issue is that network policies are enforced at all (Cilium, Calico), and that access to the k8s internals is restricted there but not in other clusters.

vizeit commented 1 year ago

@Ph0tonic the existing core network policy takes care of kube-apiserver egress on GKE. I have been testing JupyterHub on GKE Autopilot for a few weeks now and have not seen any other issues so far. You can check the details in my post; note the K8sAPIServer part:

https://www.vizeit.com/troubleshooting-cilium-on-gke/

vizeit commented 1 year ago

I have not installed k3s to test this, but I think changing the server port to 443 should resolve the issue without any additional policy. I am including the reference links below:

[1] https://kubernetes.io/docs/concepts/security/controlling-access/#transport-security

[2] https://docs.k3s.io/cli/server#listeners
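
For reference, the relevant k3s setting appears to be the `--https-listen-port` server flag (see [2]). A sketch of setting it declaratively via k3s's config file, untested here:

```yaml
# /etc/rancher/k3s/config.yaml (a sketch, not tested in this thread; changing
# the API server port affects every kubeconfig that points at the cluster)
https-listen-port: 443
```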

Ph0tonic commented 1 year ago

Thanks @vizeit, I will have a look at these configurations and see if it fixes my problem.

Ph0tonic commented 1 year ago

So I looked at your links, but the difference between 443 and 6443 was not really clear to me. I found https://github.com/kubernetes/website/issues/30725, which clarifies this. From my understanding, 443 is meant to be used as the externally exposed port.

I see 2 possibilities:

  1. Add an egress rule for port 6443, as shown above.
  2. Add some documentation to clarify the need for this egress rule.
vizeit commented 1 year ago

@Ph0tonic Were you able to test with port 443 to confirm that it works with the existing core network policy?

bauerjs1 commented 1 year ago

I can reproduce this problem with Cilium on a bare-metal cluster. Disabling the hub NetPol in the Helm chart is my workaround so far.

Access to the API server from pods inside the cluster goes through https://kubernetes.default:443, and I can only curl that from within the jupyterhub container if the NetPol is disabled (and only then does JupyterHub work properly).

The kubernetes.default service has a ClusterIP of 10.233.0.1. The NetPol is quite hard to read since there are many overlapping rules, but looking at it in https://editor.networkpolicy.io/, I cannot find a rule that would allow traffic to this IP (unfortunately, I can't post the image).
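
A plausible explanation (my reading of how policy enforcement works, not verified on this cluster): egress policy is matched after the service's ClusterIP has been translated to a backend address, so a connection to https://kubernetes.default:443 has to be allowed as traffic to the API server's endpoint address on port 6443, which is consistent with the drop logs above. A minimal standalone sketch along those lines; the namespace and addresses are placeholders (take the real ones from `kubectl get endpoints kubernetes`):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: hub-apiserver-egress
  namespace: jupyterhub  # placeholder namespace
spec:
  podSelector:
    matchLabels:
      app: jupyterhub
      component: hub
  policyTypes:
    - Egress
  egress:
    - to:
        # placeholder CIDRs (the addresses observed earlier in this thread);
        # use the endpoint addresses of the `kubernetes` service, not its ClusterIP
        - ipBlock:
            cidr: 148.187.17.13/32
        - ipBlock:
            cidr: 148.187.17.16/32
        - ipBlock:
            cidr: 148.187.17.17/32
      ports:
        - port: 6443
          protocol: TCP
```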

Ph0tonic commented 9 months ago

Hi, sorry @vizeit for the late reply. I did not have the possibility or the rights to change the cluster config from 6443 to 443, so I could not test it.

The solution that works for me is the following config:

```yaml
hub:
  networkPolicy:
    egress:
      - ports:
          - port: 6443
```
vizeit commented 9 months ago

> I did not have the possibility or the rights to change the cluster config from 6443 to 443, so I could not test it.

@Ph0tonic no problem

lahwaacz commented 4 months ago

> Add some documentation to clarify the need for this egress rule.

Trying to clarify this:

The following egress rule mentioned by @Ph0tonic works, but it allows connections to any host on port 6443, not only the Kubernetes API:

```yaml
hub:
  networkPolicy:
    egress:
      - ports:
          - port: 6443
```

Alternatively, a CiliumNetworkPolicy can be used to allow traffic from the hub pod specifically to the Kubernetes API:

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-access-to-api-server
  namespace: jupyterhub-test
spec:
  egress:
  - toEntities:
    - kube-apiserver
  endpointSelector:
    matchLabels:
      app: jupyterhub
      component: hub
```

Also note that the same policy should be added for the image-puller and user-scheduler components, for which the chart does not specify any network policy; see the sketch below. This is especially important when you want to add a default deny-all policy for the namespace.
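
A sketch of what that could look like for those components, assuming the chart's usual `component` label values (verify them with `kubectl get pods --show-labels`):

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-pullers-and-scheduler-to-api-server
  namespace: jupyterhub-test
spec:
  egress:
  - toEntities:
    - kube-apiserver
  endpointSelector:
    matchLabels:
      app: jupyterhub
    matchExpressions:
      # assumed label values, not confirmed in this thread
      - key: component
        operator: In
        values:
          - user-scheduler
          - hook-image-puller
          - continuous-image-puller
```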