loft-sh / vcluster

vCluster - Create fully functional virtual Kubernetes clusters - Each vcluster runs inside a namespace of the underlying k8s cluster. It's cheaper than creating separate full-blown clusters and it offers better multi-tenancy and isolation than regular namespaces.
https://www.vcluster.com
Apache License 2.0

Egress NetworkPolicy blocks DNS requests? #390

Closed · bbaumgartl closed 2 years ago

bbaumgartl commented 2 years ago

What happened?

After applying an egress network policy to pods, they can no longer send requests to the vcluster CoreDNS.

This cannot be mitigated by allowing port 53 (UDP and TCP) in the egress rule.

Other ports, like 80 and 443, work.

It seems this is not a port problem but rather something related to the internal CoreDNS routing, because changing `/etc/resolv.conf` inside the container to `nameserver 1.1.1.1` works (for external domains).

What did you expect to happen?

DNS requests to the internal CoreDNS should not be blocked, or should at least be allowable via an egress rule.

How can we reproduce it (as minimally and precisely as possible)?

Create a vcluster with this values.yml:

```yaml
sync:
  networkpolicies:
    enabled: true
```
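
For completeness, a minimal sketch of the create step; the name and namespace are placeholders, and `-f` (`--extra-values`) is assumed to pass the extra chart values in this vcluster version:

```console
$ vcluster create my-vcluster -n my-vcluster -f values.yml
```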

Create a test container:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test
  labels:
    component: test
spec:
  containers:
  - name: test
    image: debian
    command: ['/bin/sh', '-c']
    args:
      - |
        apt update
        apt install -y dnsutils
        sleep infinity
```

Exec into the test container and test DNS with `dig test` or `dig google.de`.
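
Concretely, the checks can also be run from outside the pod (pod name as above; before the policy is applied, both lookups should succeed):

```console
$ kubectl exec -it test -- dig test
$ kubectl exec -it test -- dig google.de
```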

Add the network policy and test again:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector:
    matchLabels:
      component: test
  policyTypes:
  - Egress
  egress:
  - to:
    ports:
    - port: 53
      protocol: TCP
    - port: 53
      protocol: UDP
```
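
With the policy in place, the same lookups hang and eventually fail, roughly like this (output abbreviated):

```console
$ kubectl exec -it test -- dig google.de
;; connection timed out; no servers could be reached
```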

Anything else we need to know?

Network policies work on the host cluster.

The CNI is Canal.

We tried several different egress network policies, without success.

Host cluster Kubernetes version

```console
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"archive", BuildDate:"2022-01-27T18:26:18Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:19:12Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}
```

Host cluster Kubernetes distribution

```
Kubernetes v1.23.3
```

vcluster version

```
vcluster 0.6.0
```

Vcluster Kubernetes distribution (k3s (default), k8s, k0s)

```
k3s
```

OS and Arch

```
OS: Talos v0.14
Arch: x86_64
```
FabianKramm commented 2 years ago

@bbaumgartl thanks for creating this issue! My guess is that the problem comes from us exposing CoreDNS on port 1053 instead of 53 (see https://github.com/loft-sh/vcluster/blob/main/manifests/coredns/coredns.yaml#L136). Does it work if you allow port 1053?
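
For reference, a sketch of the policy from the report extended to also allow port 1053. This is untested; note that most CNIs evaluate policies against the post-DNAT destination port, which would explain why allowing only 53 is not enough here:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector:
    matchLabels:
      component: test
  policyTypes:
  - Egress
  egress:
  - to:
    ports:
    - port: 53
      protocol: TCP
    - port: 53
      protocol: UDP
    # vcluster's CoreDNS pods listen on 1053, so this is the rule that matters
    - port: 1053
      protocol: TCP
    - port: 1053
      protocol: UDP
```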

bbaumgartl commented 2 years ago

I was looking at the service and didn't notice that the port is different for the deployment/pods.

Out of curiosity, is there a reason the deployment uses a different port?

FabianKramm commented 2 years ago

@bbaumgartl yes, the reason is that this lets us run CoreDNS as non-root without any capabilities, but for that to work it cannot listen on any privileged port (below 1024).
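
An illustrative sketch of that pattern (not the actual vcluster manifest, which is linked above): dropping all capabilities removes NET_BIND_SERVICE, so the container cannot bind privileged ports and CoreDNS listens on 1053, while the Service keeps the conventional port 53 for clients:

```yaml
# Container spec fragment: an unprivileged CoreDNS must bind a port >= 1024
securityContext:
  runAsNonRoot: true
  capabilities:
    drop: ["ALL"]        # includes NET_BIND_SERVICE, ruling out ports below 1024
ports:
- containerPort: 1053
  name: dns
  protocol: UDP
---
# Service fragment: clients still use port 53; traffic is forwarded to 1053
apiVersion: v1
kind: Service
metadata:
  name: kube-dns         # hypothetical name, as DNS appears inside the cluster
spec:
  ports:
  - name: dns
    port: 53
    targetPort: 1053
    protocol: UDP
```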