freifunk-gluon / gluon

a modular framework for creating OpenWrt-based firmwares for wireless mesh nodes
https://gluon.readthedocs.io

RFC: Provide CLAT on Routers (464XLAT) #1408

Open mweinelt opened 6 years ago

mweinelt commented 6 years ago

I started playing with NAT64/DNS64 a few days ago and it looks quite promising and is mostly pleasant to use, but support for 464XLAT, especially a CLAT implementation in Gluon, would go a long way in supporting such a setup.

What is NAT64/DNS64?

NAT64 (https://tools.ietf.org/html/rfc6146) does network address translation between IPv4 addresses and the well-known IPv4/IPv6 translation prefix 64:ff9b::/96. DNS64 (https://tools.ietf.org/html/rfc6147) is used to provide AAAA records with mapped addresses for names that only have A records.
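To make the address mapping concrete, here is a minimal sketch of how an IPv4 address can be embedded into (and recovered from) the well-known prefix. The function names `embed_ipv4` and `extract_ipv4` are made up for illustration, and the sketch only handles /96 prefixes, where the IPv4 address occupies the last 32 bits.

```python
import ipaddress

# Well-known NAT64 prefix mentioned above.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def embed_ipv4(ipv4, prefix=WELL_KNOWN_PREFIX):
    """Map an IPv4 address into a /96 NAT64 prefix (hypothetical helper)."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    # With a /96 prefix the IPv4 address is simply the low 32 bits.
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def extract_ipv4(ipv6, prefix=WELL_KNOWN_PREFIX):
    """Recover the IPv4 address from a mapped IPv6 address (hypothetical helper)."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v6 = ipaddress.IPv6Address(ipv6)
    if v6 not in prefix:
        raise ValueError("address is not inside the NAT64 prefix")
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)


if __name__ == "__main__":
    mapped = embed_ipv4("8.8.8.8")
    print(mapped, "->", extract_ipv4(str(mapped)))
```

The round trip is lossless, which is what lets the NAT64 gateway recover the real IPv4 destination from the IPv6 packet.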

Clients receive only IPv6 addressing and are given a DNS server that performs DNS64 translation.
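As a rough sketch of what the DNS64 server does with a lookup: real AAAA records are passed through untouched, and only names without any AAAA record get synthesized ones. `synthesize_aaaa` is a hypothetical helper, not the API of any real resolver, and it takes the already-resolved records as plain lists of strings.

```python
import ipaddress

# Well-known NAT64 prefix; a deployment could use its own prefix instead.
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_aaaa(a_records, aaaa_records):
    """Return the AAAA answer a DNS64 resolver would give (hypothetical helper).

    a_records / aaaa_records are lists of address strings for one name.
    """
    if aaaa_records:
        # Native IPv6 exists: DNS64 must not synthesize anything.
        return list(aaaa_records)
    base = int(NAT64_PREFIX.network_address)
    # No AAAA: map every A record into the NAT64 prefix.
    return [
        str(ipaddress.IPv6Address(base | int(ipaddress.IPv4Address(a))))
        for a in a_records
    ]


if __name__ == "__main__":
    print(synthesize_aaaa(["192.0.2.1"], []))
    print(synthesize_aaaa(["192.0.2.1"], ["2001:db8::1"]))
```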

Example

Problems

Two problems with that approach are widely known:

Hardcoded IPv4 literals

When protocols use hardcoded IPv4 addresses and do not rely on DNS resolution, the addresses cannot be mapped. As no IPv4 addressing exists, the IPv4 network is unreachable.

Rewriting DNSSEC-secured DNS records

When a domain name is secured by DNSSEC, but provides no IPv6 AAAA record, translation is required to make a connection. Problems arise at DNSSEC-validating resolvers, which will notice the tampering and reject the response, ultimately breaking connectivity.

What are the benefits of this setup?

By saving on the DHCP roundtrips and relying on the much quicker RS/RA mechanism we would very likely improve on the time until internet connectivity can be confirmed by the OS (like Android does).

What is 464XLAT?

Breaking DNSSEC is a dealbreaker, and to some users lack of direct IPv4 connectivity (think ping 8.8.8.8) might at least be awkward - enter 464XLAT.

As can be derived from its name, 464XLAT (https://tools.ietf.org/html/rfc6877) provides a translation mechanism that allows IPv4 to IPv4 connectivity over an IPv6-only network. Gateways provide NAT64 (PLAT; provider-side translator) and nodes could provide the mechanism to translate from IPv4 to IPv6 (CLAT; customer-side translator). Clients could also be in possession of a CLAT implementation, but we really cannot rely on that.

DNS64 would be replaced by the CLAT on nodes, which transparently translates packets between IPv4 and their well-known IPv6 mapped address. Nodes would need to provide local-only DHCP services, allowing end devices to use IPv4 sockets, which would otherwise be impossible if they only received IPv6 addressing.
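A minimal sketch of the address side of such a stateless CLAT, assuming the node owns a dedicated /96 for client sources. The prefix `2001:db8:c1a7::/96` (documentation space) and the function names are invented for illustration; header translation, checksums and fragmentation are left out entirely.

```python
import ipaddress

PLAT_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")        # well-known NAT64 prefix
CLAT_PREFIX = ipaddress.IPv6Network("2001:db8:c1a7::/96")  # hypothetical per-node prefix


def map_into(prefix, ipv4):
    """Place an IPv4 address in the low 32 bits of a /96 prefix."""
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def map_out_of(prefix, ipv6):
    """Recover the IPv4 address from the low 32 bits of a mapped address."""
    v6 = ipaddress.IPv6Address(ipv6)
    if v6 not in prefix:
        raise ValueError(f"{v6} is not inside {prefix}")
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)


def clat_outbound(src4, dst4):
    """Client -> mesh: IPv4 (src, dst) becomes IPv6 (src, dst)."""
    return map_into(CLAT_PREFIX, src4), map_into(PLAT_PREFIX, dst4)


def clat_inbound(src6, dst6):
    """Mesh -> client: reverse the mapping for reply packets."""
    return map_out_of(PLAT_PREFIX, src6), map_out_of(CLAT_PREFIX, dst6)


if __name__ == "__main__":
    src6, dst6 = clat_outbound("192.0.0.2", "8.8.8.8")
    print(src6, "->", dst6)
    print(clat_inbound(str(dst6), str(src6)))
```

Because the destination ends up in the NAT64 prefix without any DNS involvement, IPv4 literals such as `ping 8.8.8.8` keep working and DNSSEC-signed answers are never rewritten.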

The Jool website offers a far more detailed explanation than this: http://www.jool.mx/en/464xlat.html

Possible Implementations

Subset of https://en.wikipedia.org/wiki/IPv6_transition_mechanism#Implementations

PLAT (NAT64)

Running on gateways or adjacent routers.

Linux:

OpenBSD:

CLAT

Running on nodes

A-Kasper commented 6 years ago

I like this idea. It would also help to reduce meshwide v6 multicast traffic.

I'm not sure about NAT solutions on the client side. Do you have a clue about memory consumption?

christf commented 6 years ago

@CodeFetch is working on this already. You might want to get in touch with him on that one.

christf commented 6 years ago

Oh... and I am working on incorporating his approach into the current babel firmware together with ddhcp. I now have an image that contains the necessary pieces even for the tiny target. The next step is to configure and run that stuff. Edit: There were talks by Vincent, Daniel and myself at WCW that described the status and relevant components for this approach.

CodeFetch commented 6 years ago

@mweinelt We already have simple CLAT and PLAT packages. Maybe you want to play around with them: https://github.com/freifunk-ffm/gluon/tree/christf_next/package/gluon-xlat464-clat https://github.com/freifunk-ffm/gluon/tree/christf_next/package/gluon-xlat464-plat

What I'm working on (NAT426) is superior to ordinary CLATs because it is stateful: it can fake gateways and is ARP-aware of its local neighbours, so we can allow client-to-client communication without assigning clients different subnets etc. (NAT426 has two interfaces: one point-to-point IPv6 device and one IPv4 ethernet device, which you put into a bridge with the client interface.) If you don't need client-to-client communication, NAT426 will optionally be able to reply to DHCP requests, assigning clients the same IPv4 address and dynamically faking gateways in the 192.0.0.0/29 range (IPv4 Service Continuity Prefix).
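To illustrate the "dynamically faking gateways" part, here is a rough sketch of handing out addresses from 192.0.0.0/29 to whatever real gateways are currently known. This is not the NAT426 code, which has not been published; the class `FakeGatewayPool` and the one-address-per-gateway policy are assumptions made purely for illustration.

```python
import ipaddress

# IPv4 Service Continuity Prefix mentioned above.
SERVICE_CONTINUITY_PREFIX = ipaddress.IPv4Network("192.0.0.0/29")


class FakeGatewayPool:
    """Hands out stable fake gateway addresses (hypothetical helper)."""

    def __init__(self):
        # hosts() skips the network and broadcast address: 192.0.0.1 .. 192.0.0.6
        self._free = list(SERVICE_CONTINUITY_PREFIX.hosts())
        self._assigned = {}

    def address_for(self, gateway_id):
        """Return the fake IPv4 address presented to clients for a gateway."""
        if gateway_id not in self._assigned:
            if not self._free:
                raise RuntimeError("no free addresses left in 192.0.0.0/29")
            self._assigned[gateway_id] = self._free.pop(0)
        return self._assigned[gateway_id]

    def release(self, gateway_id):
        """Return a gateway's address to the pool once it disappears."""
        self._free.append(self._assigned.pop(gateway_id))


if __name__ == "__main__":
    pool = FakeGatewayPool()
    print(pool.address_for("gw-a"), pool.address_for("gw-b"))
```

A /29 leaves only six usable addresses, so a scheme like this caps the number of gateways that can be presented to clients at once.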

The problem with Jool is that it is broken in some way, as it "breaks netfilters assumptions", and there is no way to configure it properly with OpenWrt in conjunction with NAT44 without network namespaces (which we don't have yet): https://github.com/NICMx/Jool/issues/140

The problem with Tayga is that it is an obsolete userspace program. We have the same problem with context switches there as we have with fastd.

clatd is a perl script, OpenBSD is ISC licensed...

There is also NAT46 by ayourtch, which is being used by the gluon-xlat464-plat and -clat packages: https://github.com/ayourtch/nat46/tree/master/nat46/modules I built the plat package by combining the NAT46 with a NAT66 so I get a NAT64.

For NAT426 I used the OpenWrt SIIT module as inspiration: https://github.com/openwrt/packages/tree/master/net/siit/src

christf commented 6 years ago

@CodeFetch where can I find your code?

CodeFetch commented 6 years ago

@christf I'm going to release it when it's working as expected and cleaned up. The complexity increased considerably once DDHCP was taken into account. Unfortunately my time is very limited at the moment, as I have to finish a project which I need for a living and recently moved to another city.

rubo77 commented 6 years ago

I could imagine, that such a solution would destroy the IPv4 interconnectivity between clients...

Anyway: don't we have a running solution to the problem now with ddhcpd, so that we don't need another workaround with NAT any more?

CodeFetch commented 6 years ago

@rubo77 I don't think so. NAT426 is not really a workaround. Babel was not originally designed to handle multiple IPv4 gateways behind a Babel node that don't speak Babel themselves. It works with IPv6 only because all nodes are in the same subnet, while with IPv4 we practically have different subnets. This draft https://tools.ietf.org/id/draft-ietf-homenet-babel-profile-06.html deals with some of these problems, but as far as I know the solutions are not implemented yet.

NAT426 does the following: It has a layer 2 IPv4 interface for clients and a layer 3 IPv6 interface for babel. It does all the ARP neighbour discovery on the L2 interface and simulates IPv4 gateways and other IPv4 clients as if they were connected to each other on layer 2. Furthermore it will be able to do connection tracking (but that is not the initial goal).

Thus you can configure it to simulate gateways with constant IPv4 addresses while in the background selecting another (presumably better) gateway for new TCP/UDP connections. This is a kind of connection tracking that allows not only seamless roaming, but also leases that never run out from the client's point of view while their preferred gateway changes. This is what some commercial solutions used by mobile communication systems do, and it allows us to exit IPv4 traffic locally and abolish some of the differences between nodes and supernodes. I'd love it if babel could do something like that natively, but it is designed to only alter routes.

Let me try to explain it more clearly: Ordinarily we don't have global IPv4 subnets for clients/gateways. Assume we have two IPv4 gateways. Without source routing we can't decide which one the client traffic will be routed to. Both gateways are announced as a 0.0.0.0/0 route. This breaks the clients' connections every time the metrics change. With source routing we could define routes for every client so that their traffic is routed to the correct gateway, but we would end up with a whole lot of routes. NAT426 ensures that traffic is routed to the correct gateway and furthermore allows new connections to use another gateway when the metric changes, while old connections still use the old gateway.
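The per-connection behaviour described here can be sketched roughly as follows. This is not the NAT426 implementation; `GatewaySelector`, the flow tuple layout and the "lower metric is better" convention are assumptions made purely for illustration.

```python
class GatewaySelector:
    """Pins each flow to the gateway that was best when the flow started."""

    def __init__(self, metrics):
        # gateway name -> route metric, lower is better (assumed convention)
        self.metrics = dict(metrics)
        # flow tuple -> gateway chosen when the flow was first seen
        self.flows = {}

    def best_gateway(self):
        """Gateway with the lowest current metric."""
        return min(self.metrics, key=self.metrics.get)

    def gateway_for(self, flow):
        """Existing flows keep their gateway, new flows get the current best."""
        if flow not in self.flows:
            self.flows[flow] = self.best_gateway()
        return self.flows[flow]

    def forget(self, flow):
        """Drop state for a finished flow (e.g. after FIN/RST or a timeout)."""
        self.flows.pop(flow, None)


if __name__ == "__main__":
    selector = GatewaySelector({"gw1": 10, "gw2": 20})
    old_flow = ("192.0.0.2", 40000, "198.51.100.7", 443, "tcp")
    print(selector.gateway_for(old_flow))   # chosen while gw1 is best

    selector.metrics["gw1"] = 50            # metrics change, gw2 is now best
    new_flow = ("192.0.0.2", 40001, "198.51.100.7", 443, "tcp")
    print(selector.gateway_for(new_flow))   # new connection uses gw2
    print(selector.gateway_for(old_flow))   # old connection stays on gw1
```

The state kept is one table entry per active flow on the node, instead of one source-specific route per client propagated through the whole mesh.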

rotanid commented 4 years ago

related pull request: #1808