dask / helm-chart

Helm charts for Dask
https://helm.dask.org/

Network policies break daskhub #445

Open kcote-ncar opened 9 months ago

kcote-ncar commented 9 months ago

Describe the issue: It appears that the default network policies from the jupyterhub helm chart break communication with dask-gateway and the kube-apiserver.

I deployed daskhub with default values onto a vanilla K8s cluster with a CNI that supports network policies (cilium):

helm upgrade --install --create-namespace --namespace jhub01 jhub01 dask/daskhub

With this deployment, the jupyterhub pod will not spawn and I receive this output: [image]

Using hubble, I am able to see the packets are being dropped via network policy:

hubble observe -n jhub01 -t drop -f

Feb 16 18:05:30.839: jhub01/hub-fc455bdb8-2n7ct:34144 (ID:133898) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-hzwrj:8000 (ID:170445) Policy denied DROPPED (TCP Flags: SYN)
Feb 16 18:05:31.862: jhub01/hub-fc455bdb8-2n7ct:34144 (ID:133898) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-hzwrj:8000 (ID:170445) Policy denied DROPPED (TCP Flags: SYN)
Feb 16 18:05:33.910: jhub01/hub-fc455bdb8-2n7ct:34144 (ID:133898) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-hzwrj:8000 (ID:170445) Policy denied DROPPED (TCP Flags: SYN)
Feb 16 18:05:35.043: jhub01/hub-fc455bdb8-2n7ct:38918 (ID:133898) <> XXX.XXX.XXX.148:6443 (kube-apiserver) Policy denied DROPPED (TCP Flags: SYN)
Feb 16 18:05:36.086: jhub01/hub-fc455bdb8-2n7ct:38918 (ID:133898) <> XXX.XXX.XXX.148:6443 (kube-apiserver) Policy denied DROPPED (TCP Flags: SYN)

If I allow access to the kube-apiserver (reference ticket below), the pod will then spawn but I still get drops for dask-gateway communication:

Feb 16 19:40:58.002: jhub01/jupyter-test:34604 (ID:146419) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-hzwrj:8000 (ID:170445) Policy denied DROPPED (TCP Flags: SYN)
Feb 16 19:41:00.498: jhub01/jupyter-test:53158 (ID:146419) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-hzwrj:8000 (ID:170445) Policy denied DROPPED (TCP Flags: SYN)
Feb 16 19:41:06.130: jhub01/jupyter-test:34604 (ID:146419) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-hzwrj:8000 (ID:170445) Policy denied DROPPED (TCP Flags: SYN)
Feb 16 19:41:07.936: jhub01/hub-5fd4dbdb78-gmnvw:58384 (ID:133898) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-hzwrj:8000 (ID:170445) Policy denied DROPPED (TCP Flags: SYN)
Feb 16 19:41:08.950: jhub01/hub-5fd4dbdb78-gmnvw:58384 (ID:133898) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-hzwrj:8000 (ID:170445) Policy denied DROPPED (TCP Flags: SYN)
Feb 16 19:41:10.998: jhub01/hub-5fd4dbdb78-gmnvw:58384 (ID:133898) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-hzwrj:8000 (ID:170445) Policy denied DROPPED (TCP Flags: SYN)
Feb 16 19:41:14.914: jhub01/jupyter-test:43114 (ID:146419) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-hzwrj:8000 (ID:170445) Policy denied DROPPED (TCP Flags: SYN)
Feb 16 19:41:15.030: jhub01/hub-5fd4dbdb78-gmnvw:58384 (ID:133898) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-hzwrj:8000 (ID:170445) Policy denied DROPPED (TCP Flags: SYN)
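For reference, the kube-apiserver allowance mentioned above might look roughly like this in a daskhub values file. This is only a sketch: it assumes the API server listens on port 6443, as in the drops logged earlier, and it uses a port-only rule, so it permits egress from the hub pod to any destination on that port.

```yaml
# Sketch: extra egress rule for the hub pod's NetworkPolicy.
# Assumption: the kube-apiserver is reachable on TCP 6443 (taken from the hubble output).
jupyterhub:
  hub:
    networkPolicy:
      egress:
        - ports:
            - port: 6443   # no "to:" selector, so any destination on this port is allowed
```

This does not address the drops toward the traefik pod on port 8000, which are governed by separate egress rules.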

Here is the list of network policies defined for the whole cluster:

kubectl get networkpolicies.networking.k8s.io -A

NAMESPACE   NAME         POD-SELECTOR                                                  AGE
jhub01      hub          app=jupyterhub,component=hub,release=jhub01                   22h
jhub01      proxy        app=jupyterhub,component=proxy,release=jhub01                 22h
jhub01      singleuser   app=jupyterhub,component=singleuser-server,release=jhub01     22h

Everything works when I deploy this network policy into the namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-ingress-egress
spec:
  podSelector: {}
  egress:
  - {}
  ingress:
  - {}
  policyTypes:
  - Egress
  - Ingress

Anything else we need to know?:

Bare Metal
- K8s Server Version: v1.29.1
- CRI-O Version: v1.29.1
- Cilium Version: v1.15.1

This issue is related and is why we are seeing drops for the kube-apiserver:

If I allow access to the kube-apiserver I then hit this issue:

I think that the daskhub chart should deploy network policies to allow the jupyterhub pod to communicate with dask-gateway. Or perhaps something about the correct network policies should be documented since the default values don't allow dask-gateway communication?

Environment:

Ph0tonic commented 8 months ago

Hi, I can confirm this issue. I managed to make it work with the following netpol, which is a bit more restrictive:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dask-network-policy
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: dask-gateway
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: dask-gateway
    - from:
        - podSelector:
            matchLabels:
              app: jupyterhub
  egress:
    - ports:
        - port: 6443
        - port: 53
          protocol: UDP
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: dask-gateway
    - to:
        - podSelector:
            matchLabels:
              app: jupyterhub

kcote-ncar commented 8 months ago

@Ph0tonic Thank you for confirming and providing a more restrictive netpol. Unfortunately, when I apply that netpol on a generic deployment I am still getting packet drops for the hub pod to the traefik pod on port 8000.

Mar 15 12:00:07.234: jhub01/hub-6cdf59cc94-x4zmf:54686 (ID:154794) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-fghxd:8000 (ID:135336) Policy denied DROPPED (TCP Flags: SYN)
Mar 15 12:00:08.278: jhub01/hub-6cdf59cc94-x4zmf:54686 (ID:154794) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-fghxd:8000 (ID:135336) Policy denied DROPPED (TCP Flags: SYN)

If I allow egress port 8000 (via the 6443 workaround) for the hub pod then I receive blocks from the proxy pod to the traefik pod:

Mar 15 13:11:07.854: jhub01/proxy-84cd6496dc-64jvb:47066 (ID:138248) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-cqkbg:8000 (ID:135336) Policy denied DROPPED (TCP Flags: SYN)
Mar 15 13:11:08.886: jhub01/proxy-84cd6496dc-64jvb:47066 (ID:138248) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-cqkbg:8000 (ID:135336) Policy denied DROPPED (TCP Flags: SYN)

If I modify the proxy networkpolicy and allow egress for port 8000 then I receive blocks from the singleuser pod to the traefik pod:

Mar 15 13:14:07.052: jhub01/jupyter-test:49324 (ID:132715) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-cqkbg:8000 (ID:135336) Policy denied DROPPED (TCP Flags: SYN)
Mar 15 13:14:08.109: jhub01/jupyter-test:49324 (ID:132715) <> jhub01/traefik-jhub01-dask-gateway-7665b69c66-cqkbg:8000 (ID:135336) Policy denied DROPPED (TCP Flags: SYN)

I then modified the singleuser networkpolicy to allow egress for port 8000 and everything works.

Since this chart is designed to deploy two sub-charts, I think it should address the networkpolicies to allow correct communications between the two deployments. The root issue is with the jupyterhub egress networkpolicies. I tried to address this with a helm values file:

jupyterhub:
  hub:
    networkPolicy:
      egress:
        - ports:
            - port: 6443
            - port: 8000
  chp:
    networkPolicy:
      egress:
        - ports:
            - port: 8000
  singleuser:
    networkPolicy:
      egress:
        - ports:
            - port: 8000

However, this chart doesn't understand the chp: part which is for the proxy pod that is defined in the jupyterhub chart. If the values file lacks the chp: part then it addresses the hub and singleuser pods but the proxy networkpolicy still needs manual modification.
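A minimal sketch of the nesting the jupyterhub sub-chart appears to expect for the proxy pod, with `chp` placed under `proxy` rather than at the top level. The port-only rule is carried over unchanged from the values above; treat the exact key path as something to check against the jupyterhub chart's values reference.

```yaml
# Sketch: same port-only egress rule as above, but nested under proxy.chp.
# Assumption: the jupyterhub chart reads the proxy pod's policy from proxy.chp.networkPolicy.
jupyterhub:
  proxy:
    chp:
      networkPolicy:
        egress:
          - ports:
              - port: 8000   # traefik port from the dropped-packet logs
```

A rule without a `to:` selector allows port 8000 egress to any destination, so a pod selector for dask-gateway would be a tighter choice.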

Ph0tonic commented 8 months ago

Oh, sorry, indeed I only mentioned the config for dask-gateway. Here is my config for the JupyterHub Helm chart:

values:
  proxy:
    chp:
      networkPolicy:
        egress:
          - to:
              - podSelector:
                  matchLabels:
                    app.kubernetes.io/name: dask-gateway
            ports:
              - port: 8000
  singleuser:
    networkPolicy:
      egress:
        - to:
            - podSelector:
                matchLabels:
                  app.kubernetes.io/name: dask-gateway
          ports:
            - port: 8000
  hub:
    networkPolicy:
      egress:
        - ports:
            - port: 6443
        - to:
            - podSelector:
                matchLabels:
                  app.kubernetes.io/name: dask-gateway
          ports:
            - port: 8000

And the additional config as previously provided:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dask-network-policy
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: dask-gateway
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: dask-gateway
    - from:
        - podSelector:
            matchLabels:
              app: jupyterhub
  egress:
    - ports:
        - port: 6443
        - port: 53
          protocol: UDP
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: dask-gateway
    - to:
        - podSelector:
            matchLabels:
              app: jupyterhub

However, I agree that it would be nice to have this config as the default.

kcote-ncar commented 8 months ago

@Ph0tonic Thank you! With the values you provided I have managed to make dask-gateway function with jupyterhub.

For others that may come upon this issue, these values work for the daskhub chart:

jupyterhub:
  proxy:
    chp:
      networkPolicy:
        egress:
          - to:
              - podSelector:
                  matchLabels:
                    app.kubernetes.io/name: dask-gateway
            ports:
              - port: 8000
  singleuser:
    networkPolicy:
      egress:
        - to:
            - podSelector:
                matchLabels:
                  app.kubernetes.io/name: dask-gateway
          ports:
            - port: 8000
  hub:
    networkPolicy:
      egress:
        - ports:
            - port: 6443
        - to:
            - podSelector:
                matchLabels:
                  app.kubernetes.io/name: dask-gateway
          ports:
            - port: 8000

I do think these values should be defined by default when deploying the daskhub chart so dask-gateway works with jupyterhub.