aws / aws-network-policy-agent

Apache License 2.0

Allowing ICMP in network policy #90

Open ankenyr opened 11 months ago

ankenyr commented 11 months ago

Currently, ICMP is not allowed as a protocol in a network policy, although it appears to be at least partially implemented and planned for:

https://github.com/aws/aws-network-policy-agent/blob/d764cafebca031cadd28eaa962e9af524edb2298/pkg/utils/utils.go#L21 https://github.com/aws/aws-network-policy-agent/blob/d764cafebca031cadd28eaa962e9af524edb2298/pkg/ebpf/events/events.go#L118

However, when attempting to use protocol ICMP in the following way:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  namespace: cnitest
  name: demo
spec:
  podSelector:
    matchLabels: {}
  policyTypes:
  - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 8.8.8.8/32
      ports:
        - protocol: ICMP

It results in the error message: The NetworkPolicy "demo-app-deny-all" is invalid: spec.egress[0].ports[0].protocol: Unsupported value: "ICMP": supported values: "TCP", "UDP", "SCTP"

This is an issue for any network using IPv6. RFC 4890 states that certain ICMPv6 messages should always be allowed while others may be filtered.

As soon as a policy is applied, it blocks all ICMP traffic unless the protocol is left unspecified, and even then only to the networks or pods that are targeted. This makes it impossible to allow ICMP within the cluster while restricting direct communication to specific ports and protocols.

achevuru commented 11 months ago

ICMP isn't supported in the Kubernetes NetworkPolicy resource, as the above error indicates. For IPv6, we allow ICMPv6 by default to permit Neighbor Discovery (NS/NA) packets.

ankenyr commented 11 months ago

While it may not be listed in the K8s NetworkPolicy spec, that does not mean it cannot be supported. Is this something that AWS would expose support for? It seems like most things are already in place to allow this. My understanding is that most CNIs take the base K8s network policy as a starting point and then add to it.

achevuru commented 11 months ago

VPC CNI's Network Policy implementation is based on the upstream Kubernetes NetworkPolicy resource, and unless the upstream resource is expanded to support other protocols, a user will not be able to specify them (as you saw from the validation error above). For now, they would have to leave the protocol field empty and limit the policy to IPs and ports for protocols that aren't currently supported. Supporting another protocol in Network Policy Agent is a minor task, as long as a user is allowed to specify it in the upstream resource.

Other implementations might support it if they have their own custom resource (CRD) support, and they can expose any custom features through that.
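To illustrate the workaround described above: applied to the original example, leaving out the ports/protocol section entirely would allow all protocols (including ICMP) to the target CIDR. A sketch:

```yaml
# Sketch of the suggested workaround: omit the "ports" section so the
# egress rule is limited only by IP. All protocols and ports to
# 8.8.8.8/32, including ICMP, are then allowed.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  namespace: cnitest
  name: demo
spec:
  podSelector:
    matchLabels: {}
  policyTypes:
  - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 8.8.8.8/32
```

The trade-off, as noted later in this thread, is that this is broader than intended: you cannot allow ICMP alone, only "everything to this CIDR".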

sidewinder12s commented 11 months ago

@achevuru Are you aware of any KEPs around expansion of NetworkPolicy support for this and related items?

achevuru commented 11 months ago

@sidewinder12s No, I'm not aware of any plans to include ICMP in the upstream NP resource.

atilsensalduz commented 10 months ago

Hi everyone, I came across the following deny log in my environment, and I'm a bit puzzled because ICMP packets shouldn't be blocked by the network policy. Any insights on how this might have happened? πŸ™ƒ

Node: ip-10-0-0-44.ec2.internal;SIP: 172.40.159.252;SPORT: 0;DIP: 10.0.128.28;DPORT: 0;PROTOCOL: ICMP;PolicyVerdict: DENY


achevuru commented 10 months ago

@atilsensalduz Network policy is essentially an allow list (i.e., you can't explicitly allow ICMP flows). So, if your network policy only allows, say, TCP, that implies we block every other port and protocol combination.

I'd recommend checking out the walkthrough of the sample policy for reference - https://kubernetes.io/docs/concepts/services-networking/network-policies/#networkpolicy-resource
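As an illustration of the allow-list behavior (a hypothetical policy, not one from this thread): once a pod is selected by a policy whose rule names a specific protocol and port, everything else, including ICMP, is implicitly denied:

```yaml
# Hypothetical example: a pod selected by this policy accepts only
# TCP on port 443 inbound. ICMP (and every other protocol/port
# combination) is implicitly denied, producing DENY verdicts like
# the log line quoted above.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-https-only
spec:
  podSelector:
    matchLabels:
      app: demo
  policyTypes:
  - Ingress
  ingress:
    - ports:
        - protocol: TCP
          port: 443
```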

atilsensalduz commented 10 months ago

Got it! I'll remove the protocol and port section from my network rule to allow ICMP traffic. It's a bit of a head-scratcher why my service wants to ping another service; both are based on Java Spring. Has anyone encountered this requirement before? πŸ˜…

jdoylei commented 10 months ago

@achevuru - You mentioned:

For IPv6, we allow ICMPv6 by default to allow Neighbor Discovery (NS/NA) packets.

When you say ICMPv6 is allowed by default, does that mean when you have a pod isolated by an ingress Kubernetes NetworkPolicy, aws-network-policy-agent will allow ICMPv6 packets into the pod, without any special rules in the NetworkPolicy?

Why doesn't aws-network-policy-agent follow the same rule for IPv4 and ICMP(v4)?

Is it possible that this could be changed, so that the agent allows ICMP for IPv4? Or at least a setting exposed that allows changing it?

In our EKS cluster, we have pod issues that correspond to creating an ingress NetworkPolicy, and the aws-network-policy-agent logging ICMP DENY verdicts. The same pod activity is successful, and we see ICMP ACCEPT verdicts, if we either:

With ICMP denied:

{"level":"info","ts":"2023-11-17T16:16:13.490Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.1xx.xx.xx","Src Port":59050,"Dest IP":"10.1xx.xx.xy","Dest Port":443,"Proto":"TCP","Verdict":"ACCEPT"}
{"level":"info","ts":"2023-11-17T16:16:13.674Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"169.254.3.1","Src Port":0,"Dest IP":"10.1xx.xx.xx","Dest Port":0,"Proto":"ICMP","Verdict":"DENY"}

With ICMP accepted:

{"level":"info","ts":"2023-11-17T16:26:07.471Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.1xx.xx.xx","Src Port":56638,"Dest IP":"10.1xx.xx.xy","Dest Port":443,"Proto":"TCP","Verdict":"ACCEPT"}
{"level":"info","ts":"2023-11-17T16:26:07.675Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"169.254.3.1","Src Port":0,"Dest IP":"10.1xx.xx.xx","Dest Port":0,"Proto":"ICMP","Verdict":"ACCEPT"} 

We're not sure what the source IP 169.254.3.1 is, but we're thinking maybe that is the node's "gateway router" that's used internally by the VPC CNI?

Without the ICMP ACCEPT, the application's TCP connections out of the pod are established but at some point later are unable to receive further data. This is actually a pod running widely-used software that includes its own recommended NetworkPolicy - Argo CD's argocd-repo-server - and the TCP connections are for something pretty widespread as well - doing a "git fetch" on a remote repository. So it seems like this has the potential to broadly affect lots of users, assuming it's not just a quirk of our environment.

After trying to learn a bit more about ICMP, e.g. Understanding ICMP and why you shouldn't just block it outright, I see ICMP is much more than just the "ICMP ECHO" message used by ping, and can impact applications being able to use the network at all. This might explain e.g. why @atilsensalduz is seeing ICMP traffic even though it's not explicitly used by their app, and why our pod issues arise even though we don't explicitly use ICMP either.

It seems like, unless the agent gives the user the ability to add more fine-grained ICMP accept rules (like I think the Calico and Antrea CNIs do), ICMP ought to be allowed by default, to ensure that normal IPv4 network activity doesn't break.
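For comparison, Calico's own policy CRD does expose fine-grained ICMP matching. A sketch of what that looks like (assuming the projectcalico.org/v3 API; untested here, and not something the upstream NetworkPolicy resource can express):

```yaml
# Sketch of a Calico-native policy allowing only ICMP Echo Requests
# inbound. The "icmp" stanza is Calico-specific; names and namespace
# here are placeholders.
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-icmp-echo
  namespace: example
spec:
  selector: all()
  types:
  - Ingress
  ingress:
  - action: Allow
    protocol: ICMP
    icmp:
      type: 8  # Echo Request (ping)
```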

The workaround, e.g. the following ingress NetworkPolicy, will work, but it's not as good as the agent giving a more-helpful default. The workaround is too broad (all protocols and ports), too fragile (depends on the correct ICMP source IP address), and counter-intuitive (you need to add this ingress policy specifically for pods that actually need egress / outbound connections):

    - from:
        - ipBlock:
            cidr: 169.254.3.1/32

Thanks for your time and consideration!