When enabling NodeLocal DNSCache DNS requests are being blocked

liorfranko commented 4 years ago

After enabling the NodeLocal DNSCache feature, requests from pods to kube-dns SVC are getting blocked by Calico Policy. The NodeLocal DNS pods are deployed exactly with the same labels as the coredns pods

Steps to Reproduce (for bugs)

Deploy a K8S cluster.
Enforce the cluster using Calico GlobalNetworkPolicy
Enable NodeLocal DNSCache

Your Environment

Calico version: 3.14.1
Kubernetes: 1.17.4
Operating System and version: Centos/Ubuntu https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/

liorfranko commented 4 years ago

Does anyone have any issues/successes when working with Calico security policy and NodeLocal DNSCache? I know that the NodeLocal DNSCache makes changes to the iptables, maybe there is a conflict with Calico?

caseydavenport commented 4 years ago

@liorfranko I'm not super familiar with the node-local DNS cache. How is it deployed? Is it a host-networked pod? What requests specifically are you seeing being blocked?

You could also try removing your policy or creating an "allow all" policy to make sure it's policy that is blocking the requests and not something else.

liorfranko commented 4 years ago

Alongside the regular KubeDNS SVC, there is a daemonset of DNS pods on each node. During the deployment on the NodeLocal DNSCache, the pod manipulate the iptables of the nodes and "hijack" the DNS queries. It then, either respond from a local cache or queries the KubeDNS SVC on behalf of the pod.

I know that hijacking the DNS queries, is a security breach, but it's a K8S official feature.

caseydavenport commented 4 years ago

@liorfranko can you find the exact rule that is blocking the traffic? e.g., with itpables-save -c to view which rules are / are not getting hit?

Enforce the cluster using Calico GlobalNetworkPolicy

Could you also share the GNP that you created? It's possible that this is "working as expected" if your policy selects the local DNS pods and doesn't allow the necessaray traffic.

liorfranko commented 4 years ago

This is the configuration that works:

[root@raor-kmb01 liorf]# kubectl -n kube-system describe svc kube-dns
Name:              kube-dns
Namespace:         kube-system
Labels:            addonmanager.kubernetes.io/mode=EnsureExists
                   k8s-app=kube-dns
                   kubernetes.io/cluster-service=true
                   kubernetes.io/name=CoreDNS
                   security_role=dc
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"addonmanager.kubernetes.io/mode":"EnsureExists","k8s-app":"kub...
Selector:          k8s-app=kube-dns
Type:              ClusterIP
IP:                10.48.60.4
Port:              dns  53/UDP
TargetPort:        53/UDP
Endpoints:         10.87.209.76:53,10.87.212.38:53,10.87.212.62:53 + 2 more...
Port:              dns-tcp  53/TCP
TargetPort:        53/TCP
Endpoints:         10.87.209.76:53,10.87.212.38:53,10.87.212.62:53 + 2 more...
Session Affinity:  None
Events:            <none>
[root@raor-kmb01 liorf]#

[root@raor-kmb01 liorf]# kubectl -n kube-system get pods -o wide -l security_role=dc
NAME                       READY   STATUS    RESTARTS   AGE   IP             NODE          NOMINATED NODE   READINESS GATES
coredns-84b58dd875-k8p2s   1/1     Running   0          28d   10.87.209.76   rapr-knb403   <none>           <none>
coredns-84b58dd875-sd5vv   1/1     Running   0          28d   10.87.221.7    rapr-knb402   <none>           <none>
coredns-84b58dd875-xr8pb   1/1     Running   0          28d   10.87.212.62   rapr-knb404   <none>           <none>
coredns-84b58dd875-z2vnx   1/1     Running   0          28d   10.87.221.23   rapr-knb402   <none>           <none>
coredns-84b58dd875-zkss4   1/1     Running   0          28d   10.87.212.38   rapr-knb404   <none>           <none>
[root@raor-kmb01 liorf]#

[root@raor-kmb01 liorf]# calicoctl get gnp allow-cluster-dns-egress -o yaml
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  creationTimestamp: "2020-04-06T15:49:01Z"
  name: allow-cluster-dns-egress
  resourceVersion: "289083783"
  uid: 2275f35a-781e-11ea-8d3f-6c96cfdd9a83
spec:
  egress:
  - action: Allow
    destination:
      ports:
      - 53
      selector: security_role == 'dc'
    protocol: UDP
    source: {}
  - action: Allow
    destination:
      ports:
      - 53
      selector: security_role == 'dc'
    protocol: TCP
    source: {}
  order: 110
  types:
  - Egress
[root@raor-kmb01 liorf]# calicoctl get gnp allow-cluster-dns-ingress -o yaml
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  creationTimestamp: "2020-04-06T15:49:01Z"
  name: allow-cluster-dns-ingress
  resourceVersion: "289083784"
  uid: 2282ced2-781e-11ea-8d3f-6c96cfdd9a83
spec:
  ingress:
  - action: Allow
    destination:
      ports:
      - 53
      selector: security_role == 'dc'
    protocol: UDP
    source: {}
  - action: Allow
    destination:
      ports:
      - 53
      selector: security_role == 'dc'
    protocol: TCP
    source: {}
  order: 120
  selector: security_role == 'dc'
  types:
  - Ingress

Here are the iptables rules:

:cali-po-_5m-r2tA7lULiAKgDJYp - [0:0]
[0:0] -A cali-po-_5m-r2tA7lULiAKgDJYp -p udp -m comment --comment "cali:TlFil7fjAMGcv5Q5" -m set --match-set cali40s:8ossrPQLjDgMAY-ksqdt5w_ dst -m multiport --dports 53 -j MARK --set-xmark 0x10000/0x10000
[0:0] -A cali-po-_5m-r2tA7lULiAKgDJYp -m comment --comment "cali:gIhEgQMSynKo-7bZ" -m mark --mark 0x10000/0x10000 -j RETURN
[0:0] -A cali-po-_5m-r2tA7lULiAKgDJYp -p tcp -m comment --comment "cali:cHcDw9oz_M13WRhm" -m set --match-set cali40s:8ossrPQLjDgMAY-ksqdt5w_ dst -m multiport --dports 53 -j MARK --set-xmark 0x10000/0x10000
[0:0] -A cali-po-_5m-r2tA7lULiAKgDJYp -m comment --comment "cali:ByOGEfuwNe7-t3Rh" -m mark --mark 0x10000/0x10000 -j RETURN
:cali-po-_5m-r2tA7lULiAKgDJYp - [0:0]
[0:0] -A cali-po-_5m-r2tA7lULiAKgDJYp -p udp -m comment --comment "cali:TlFil7fjAMGcv5Q5" -m set --match-set cali40s:8ossrPQLjDgMAY-ksqdt5w_ dst -m multiport --dports 53 -j MARK --set-xmark 0x10000/0x10000
[0:0] -A cali-po-_5m-r2tA7lULiAKgDJYp -m comment --comment "cali:gIhEgQMSynKo-7bZ" -m mark --mark 0x10000/0x10000 -j RETURN
[0:0] -A cali-po-_5m-r2tA7lULiAKgDJYp -p tcp -m comment --comment "cali:cHcDw9oz_M13WRhm" -m set --match-set cali40s:8ossrPQLjDgMAY-ksqdt5w_ dst -m multiport --dports 53 -j MARK --set-xmark 0x10000/0x10000
[0:0] -A cali-po-_5m-r2tA7lULiAKgDJYp -m comment --comment "cali:ByOGEfuwNe7-t3Rh" -m mark --mark 0x10000/0x10000 -j RETURN
:cali-po-_5m-r2tA7lULiAKgDJYp - [0:0]

This is works perfect before the NodeLocal DNSCache.

Adding the NodeLocal DNSCache, just add a cache pod on each node.

[root@raor-kmb01 liorf]# kubectl -n kube-system get pods -o wide -l security_role=dc
NAME                       READY   STATUS    RESTARTS   AGE   IP             NODE          NOMINATED NODE   READINESS GATES
coredns-84b58dd875-k8p2s   1/1     Running   0          28d   10.87.209.76   rapr-knb403   <none>           <none>
coredns-84b58dd875-sd5vv   1/1     Running   0          28d   10.87.221.7    rapr-knb402   <none>           <none>
coredns-84b58dd875-xr8pb   1/1     Running   0          28d   10.87.212.62   rapr-knb404   <none>           <none>
coredns-84b58dd875-z2vnx   1/1     Running   0          28d   10.87.221.23   rapr-knb402   <none>           <none>
coredns-84b58dd875-zkss4   1/1     Running   0          28d   10.87.212.38   rapr-knb404   <none>           <none>
node-local-dns-57z67       1/1     Running   0          18m   10.48.56.10    rapr-knb402   <none>           <none>
node-local-dns-76mqt       1/1     Running   0          18m   10.48.57.11    raor-kmb02    <none>           <none>
node-local-dns-crv7z       1/1     Running   0          18m   10.48.56.9     rapr-knb401   <none>           <none>
node-local-dns-gkz4p       1/1     Running   0          18m   10.48.56.11    rapr-knb403   <none>           <none>
node-local-dns-p86bs       1/1     Running   0          18m   10.48.57.12    raor-kmb01    <none>           <none>
node-local-dns-pr47t       1/1     Running   0          18m   10.48.57.10    raor-kmb03    <none>           <none>
node-local-dns-rhxsp       1/1     Running   0          18m   10.48.56.8     rapr-knb400   <none>           <none>
node-local-dns-w5srg       1/1     Running   0          18m   10.48.56.12    rapr-knb404   <none>           <none>

Now, each node-local-dns pod "hijack" the DNS requests and either respond from cache or send them to the kube-dns SVC and then respond.

The blocked traffic is from the application pods to the kube-dns SVC ip.

dostalradim commented 3 years ago

Same issue happend to me. But I have no GlobalNetworkPolicy.

dostalradim commented 3 years ago

Noo I am sorry, my mistake. I change kube-proxy mode last week a did not change config of node local dns.

Sorry

caseydavenport commented 3 years ago

I think there are some incompatibilities with the node-local DNS cache's use of NOTRACK iptables rules, so keeping this issue to track making Calico work with node-local DNS.

kfirfer commented 3 years ago

I have GlobalNetworkPolicy and the problem is happens to me aswell This is the policies I have used:

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: ingress-k8s-masters
spec:
  selector: has(node-role.kubernetes.io/master)
  # This rule allows ingress to the Kubernetes API server.
  ingress:
  - action: Allow
    protocol: TCP
    destination:
      ports:
      # kube API server
      - 6443
      - 8443
      - 53 # DNS
      - 9100 # prometheus-node-exporter
      # metrics-server
      - 443
      - 9443
  - action: Allow
    protocol: UDP
    destination:
      ports:
      - 53 # DNS
  - action: Allow
    destination:
      nets:
      - 127.0.0.1/32
  - action: Allow
    protocol: TCP
    source:
      selector: has(node-role.kubernetes.io/master)
    destination:
      ports:
      - 2380
      - 10250

fasaxc commented 3 years ago

Reported to k8s: https://github.com/kubernetes/kubernetes/issues/98758

fasaxc commented 3 years ago

@kfirfer we're missing part of the puzzle here:

You have a global network policy but what are you applying that policy to? Are you using host endpoints?
What other policy do you have in place? Do you have any egress policy? One of the effects of notrack is that response packets will not be auto-allowed so you need to make your rules symmetric. If you have a rule that "allows dest port 53" the you need an egress rule that allows "from source port 53".

kfirfer commented 3 years ago

@fasaxc

Yes Im using automatic host endpoints:

vostro@dev101:~/code/kfirfer/helm-charts$ calicoctl get heps -owide
NAME             NODE    INTERFACE   IPS                              PROFILES                      
nuc01-auto-hep   nuc01   *           192.168.200.101,172.16.207.64    projectcalico-default-allow   
nuc02-auto-hep   nuc02   *           192.168.200.102,172.16.137.128   projectcalico-default-allow   
nuc03-auto-hep   nuc03   *           192.168.200.103,172.16.206.194   projectcalico-default-allow

I dont have egress GlobalNetworkPolicy rules This is my GlobalNetworkPolicy:

vostro@dev101:~/code/kfirfer/helm-charts$ kubectl get globalnetworkpolicies.crd.projectcalico.org default.ingress-k8s-masters -o yaml
apiVersion: crd.projectcalico.org/v1
kind: GlobalNetworkPolicy
metadata:
  annotations:
    projectcalico.org/metadata: '{"uid":"e6cfb3f2-ccbb-4302-9c1a-c49d11b3d22f","creationTimestamp":"2021-02-09T22:57:57Z"}'
  creationTimestamp: "2021-02-09T22:57:58Z"
  generation: 1
  managedFields:
  - apiVersion: crd.projectcalico.org/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:projectcalico.org/metadata: {}
      f:spec:
        .: {}
        f:ingress: {}
        f:selector: {}
        f:types: {}
    manager: Go-http-client
    operation: Update
    time: "2021-02-09T22:57:58Z"
  name: default.ingress-k8s-masters
  resourceVersion: "10777643"
  uid: e6cfb3f2-ccbb-4302-9c1a-c49d11b3d22f
spec:
  ingress:
  - action: Allow
    destination:
      ports:
      - 6443
      - 8443
      - 53
      - 9100
      - 443
      - 9443
      - 7472
    protocol: TCP
    source: {}
  - action: Allow
    destination:
      ports:
      - 53
    protocol: UDP
    source: {}
  - action: Allow
    destination:
      nets:
      - 127.0.0.1/32
    source: {}
  - action: Allow
    destination:
      ports:
      - 2380
      - 10249
      - 10250
      - 10251
    protocol: TCP
    source:
      selector: has(node-role.kubernetes.io/master)
  selector: has(node-role.kubernetes.io/master)
  types:
  - Ingress

bjetal commented 3 years ago

I ran into the same issue. It appears to me to be a problem because nodelocaldns does some unusual things with its network, in particular using a "link local" IP address instead of the normal Pod IP address. This was actually causing two separate issues:

The DNS traffic is being blocked from other pods.
The container health check was being blocked and causing it to fail.

Logging of the dropped packets showed that in both cases the link local IP address was showing up as the destination (and in the case of the health check, the source as well). To work around this, I added the following to a GlobalNetworkPolicy with default selector:

  - action: Allow
    metadata:
      annotations:
        traffic: nodelocaldns UDP DNS
    protocol: UDP
    destination:
      ports: [53]
      nets: [169.254.25.10/32]
  - action: Allow
    metadata:
      annotations:
        traffic: nodelocaldns TCP DNS
    protocol: TCP
    destination:
      ports: [53]
      nets: [169.254.25.10/32]
  - action: Allow
    metadata:
      annotations:
        traffic: nodelocaldns internal/health check
    destination:
      nets: [169.254.25.10/32]
    source:
      nets: [169.254.25.10/32]

169.254.25.10 is the the link local IP being used by nodelocaldns in my installation (and I believe the default).

bjetal commented 3 years ago

I realized as I Iooked further into Calico that this is a problem at least in part because the nodelocaldns pod runs with hostNetwork: true). This prevents Calico from generating pod specific firewall rules for it.

bjetal commented 3 years ago

I came up with another, possibly better solution for this issue: set the FELIX_CHAININSERTMODE to Append. Nodelocaldns installs its own IPTables rules, but they get hidden by Calico when it keeps its rules first in the filter table's INPUT chain.

NOTE: this does not fix the problem with the health check. My preference there is some way of allowing all traffic on the loopback interface to pass.

caseydavenport commented 8 months ago

For what it's worth, policy that attempts to apply to DNS traffic needs to work differently with node local DNS.

IIRC from the last time I looked at this:

egress rules applied to pods will see the kube-dns Service cluster IP as the destination.
Ingress rules applied to kube-dns pods will see the Node's IP as the source.
HostEndpoint policy needs to be careful to both allow pods to access the local cache endpoint, as well as allow host->kube-dns traffic.

projectcalico / calico

When enabling NodeLocal DNSCache DNS requests are being blocked #3795

Steps to Reproduce (for bugs)

Your Environment