projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0

Support NAT64 for node egress #5043

Open johngmyers opened 2 years ago

johngmyers commented 2 years ago

Provide NAT64 for IPv6-only containers running on nodes that are dual-stack.

Expected Behavior

Calico should be configurable with an IPv6 CIDR, which must be a /96 or a longer prefix (that is, a /96 or a smaller address block). Traffic egressing to a destination address inside that CIDR should be statefully NATed to the IPv4 address held in the lower 32 bits of that destination address.
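To illustrate the address mapping described above, here is a minimal sketch using only Python's standard `ipaddress` module. The function name `nat64_target` and the example prefix `2001:db8::/96` are hypothetical and chosen for illustration; this is not Calico code and says nothing about how Calico would implement it.

```python
import ipaddress


def nat64_target(dst, prefix):
    """Return the IPv4 address embedded in dst if dst lies inside prefix.

    Hypothetical helper: returns None when dst is outside the NAT64
    prefix, meaning the packet would not be translated by this rule.
    """
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen < 96:
        # The proposal requires a /96 or a longer prefix.
        raise ValueError("NAT64 prefix must be a /96 or longer")
    addr = ipaddress.IPv6Address(dst)
    if addr not in net:
        return None
    # The IPv4 destination is the lower 32 bits of the IPv6 destination.
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


print(nat64_target("2001:db8::c000:201", "2001:db8::/96"))
```

The sketch covers only the destination mapping. A real stateful NAT64 would also rewrite the source address and track connection state.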

Current Behavior

As far as I am aware, there is currently no NAT64 support.

Possible Solution

This is similar to the masquerading for a Kubernetes overlay network, except the sense of the CIDR is inverted: for an overlay network everything outside the CIDR is NATed and for NAT64 everything inside the CIDR is NATed.
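The inverted sense of the CIDR can be sketched as two predicates. Both function names are hypothetical, and the overlay case is deliberately simplified to a single pod CIDR, which is an assumption made for illustration rather than a description of Calico's actual masquerade rules.

```python
import ipaddress


def overlay_needs_masquerade(dst, pod_cidr):
    # Overlay masquerade (simplified): NAT when the destination is
    # OUTSIDE the pod CIDR.
    return ipaddress.ip_address(dst) not in ipaddress.ip_network(pod_cidr)


def nat64_needs_translation(dst, nat64_cidr):
    # NAT64: translate when the destination is INSIDE the NAT64 CIDR.
    return ipaddress.ip_address(dst) in ipaddress.ip_network(nat64_cidr)
```

The two checks differ only in `not in` versus `in`, which is why the existing masquerade logic looks like a plausible starting point.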

The change of protocol family also adds requirements beyond same-family NAT. These are documented in RFC 6145 (the IP/ICMP translation algorithm, since obsoleted by RFC 7915); stateful NAT64 itself is specified in RFC 6146.

Context

The expected context is a Kubernetes cluster with an IPv6-only pod network and dual-stack host network. This configuration allows pods to egress to IPv4 destinations without having to assign the pods scarce IPv4 addresses.

Performing the NAT64 at the node level avoids the cost of transiting an external NAT64 device when the pod needing egress is scheduled on a node with IPv4 connectivity to the destination. If every pod that needs IPv4 egress can be scheduled on such a node, there is no need for an external NAT64 device in the first place.

Allowing the CIDR to be narrower than a /96 (a longer prefix) would be helpful in the case where the node has a private IPv4 address, for example 10.1.2.3. Setting a NAT64 prefix of, for example, 2001:db8::a00:0/104, which covers exactly the IPv4 range 10.0.0.0/8, would allow the node to NAT traffic to IPv4 destinations it can reach, yet pass traffic to IPv4 destinations it cannot reach on to an external NAT64 device.
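The relationship between a longer-than-/96 NAT64 prefix and the IPv4 range it covers can be checked with a short sketch. The helper name `ipv4_range_of` is hypothetical, and the code assumes the IPv4 address occupies the lower 32 bits, as in the proposal above.

```python
import ipaddress


def ipv4_range_of(nat64_prefix):
    """Return the IPv4 network covered by a NAT64 prefix of /96 or longer."""
    net = ipaddress.IPv6Network(nat64_prefix)
    if net.prefixlen < 96:
        raise ValueError("NAT64 prefix must be a /96 or longer")
    low32 = int(net.network_address) & 0xFFFFFFFF
    # Prefix bits beyond the first 96 select a sub-range of the IPv4 space.
    return ipaddress.IPv4Network((low32, net.prefixlen - 96))


local = ipaddress.IPv6Network("2001:db8::a00:0/104")
print(ipv4_range_of("2001:db8::a00:0/104"))

# Embeds 10.1.2.3: inside the /104, so the node would translate it.
print(ipaddress.IPv6Address("2001:db8::a01:203") in local)

# Embeds 192.0.2.1: outside the /104, so it would be left for an
# external NAT64 device.
print(ipaddress.IPv6Address("2001:db8::c000:201") in local)
```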

Your Environment

caseydavenport commented 2 years ago

Yep, this is something that has come up before as well IIRC. I think it's worth considering how we might implement this.