NICMx / Jool

SIIT and NAT64 for Linux
GNU General Public License v2.0

Add Device Driver mode #140

Open ydahhrk opened 9 years ago

ydahhrk commented 9 years ago

2018-11-25 Update

Hello. If you came here from the survey, you'll notice that this thread is rather large, has evolved, and often branches wildly off-topic. So here's a quick summary of what Device Driver mode is:

Basically, Device Driver Jool will be an alternative to Netfilter Jool and iptables Jool. Your translator will look like a network interface (jool0 in the snippet below):

user@T:~$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether 1c:1b:0d:62:7a:42 brd ff:ff:ff:ff:ff:ff
3: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether 98:de:d0:80:b8:4d brd ff:ff:ff:ff:ff:ff
4: jool0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 64:64:64:64:64:64 brd ff:ff:ff:ff:ff:ff

It will behave similarly to loopback; it will look like an interface, but will in fact be a virtual one. An IPv6 packet routed towards it will be bounced back as an IPv4 packet, and vice-versa. You will send traffic to it by means of Linux's routing table rather than iptables rules.
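To make the "routed towards it, bounced back" idea concrete, here is a minimal Python sketch of the flow. It is a model, not Jool code: the routing table entries, the `lookup()` and `translate_6to4()` helpers and the choice of the RFC 6052 well-known prefix `64:ff9b::/96` as the translation prefix are all assumptions made for illustration.

```python
import ipaddress

# Hypothetical translation prefix: the RFC 6052 well-known prefix.
POOL6 = ipaddress.ip_network("64:ff9b::/96")

# Hypothetical routing table: (destination prefix, output device).
ROUTES = [
    (POOL6, "jool0"),                                # traffic to be translated
    (ipaddress.ip_network("2001:db8::/32"), "eth0"),  # IPv6 side
    (ipaddress.ip_network("0.0.0.0/0"), "eth1"),      # IPv4 side
]


def lookup(dst):
    """Pick the output device by longest-prefix match, as a routing table does."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, dev) for net, dev in ROUTES
               if net.version == addr.version and addr in net]
    if not matches:
        return None  # no route
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def translate_6to4(dst6):
    """Recover the IPv4 address embedded in the low 32 bits of a /96 prefix."""
    addr = ipaddress.IPv6Address(dst6)
    if addr not in POOL6:
        raise ValueError("destination is not inside the translation prefix")
    return str(ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF))


dst6 = "64:ff9b::c000:205"
first_hop = lookup(dst6)        # the route sends the IPv6 packet into jool0
dst4 = translate_6to4(dst6)     # jool0 bounces it back as an IPv4 packet
second_hop = lookup(dst4)       # the IPv4 packet is then routed normally
print(first_hop, dst4, second_hop)
```

The point is that no iptables rule is involved: the only thing steering the packet into the translator is an ordinary route whose output device happens to be `jool0`.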

The setup will probably be the most intuitive one for some people. The only drawback that I can think of is that, if you set it up on a translator meant to forward traffic, the machine will end up subtracting 3 (instead of 1) from the packet's TTL/Hop Limit field: one by Linux (when the packet is forwarded from eth0 to jool0), another one by Jool itself, and a last one by Linux again (when the packet is forwarded from jool0 to eth1).
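The TTL arithmetic can be sketched as follows. This is a toy model: the function name and the `device_driver_mode` flag are hypothetical, and the assumption that each step drops a packet arriving with TTL 1 or less (as a regular forwarder does) is mine, not something taken from Jool's code.

```python
def ttl_after_translator(ttl, device_driver_mode=True):
    """Return the TTL/Hop Limit a forwarded packet leaves the box with.

    Returns None if the packet would expire inside the translator.
    """
    if device_driver_mode:
        # The three decrements described above, in order.
        steps = [
            "Linux forwards eth0 -> jool0",
            "Jool translates",
            "Linux forwards jool0 -> eth1",
        ]
    else:
        # Other modes: the whole machine counts as a single hop.
        steps = ["translator as a single hop"]

    for _ in steps:
        if ttl <= 1:
            return None  # assumed: expired, the packet is dropped here
        ttl -= 1
    return ttl


print(ttl_after_translator(64))                            # 61
print(ttl_after_translator(64, device_driver_mode=False))  # 63
```

In practice this mostly matters for tools that rely on exact hop counts, such as traceroute, where the translator would show up as several hops.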

And that's all, really. If that didn't already trigger chemistry in your brain, you probably don't need it.

Progress: Though I've tried to start this feature twice already, each attempt was quickly made obsolete by a rapidly evolving main branch. It's not practical to merge. I would have to start over from the beginning.


Original post

(As you will see, I still haven't finished writing this. I would, however, like this in the public domain in case someone has something interesting to say. I will come back and analyse this further once I've finished a lot of post-release and planning paperwork I need to flush from my desk.)

Being in the middle of Netfilter, we break Netfilter's assumptions.

As far as I can tell, the people who preceded me decided it would make sense for Jool to be a Netfilter/iptables module, because it's similar to NAT, and NAT is an iptables module.

Personally, I feel like we've hit a wall when it comes to pushing Netfilter's versatility, and we should find a way to more elegantly merge Jool with the kernel.

We seem to have the following options:

  1. Become a network (pseudo-)device driver (ie. look like an interface).
  2. Move over to userspace (follow Tayga's steps).
  3. Become an iptables module.
  4. Remain a Netfilter module and find workarounds for our compliance issues.

Both 1) and 2) appear to solve all of the following current annoyances:

  1. Filtering. Because the iptables documentation discourages filtering on mangle, I'm reluctant to ask users to do so (even though I don't know what the problem with mangle filtering is, other than it looking somewhat counter-intuitive).
    Because Jool would look like an interface (1) or some userspace daemon (2), packets would not skip either the INPUT or the FORWARD chain, and therefore they would be filtered normally.
    This was already fixed using namespaces.
  2. Host-Based Edge Translation. 1) and 2) will naturally let the kernel know a route towards the RFC6052 prefix/EAM records/etc, so packets will survive ingress filtering.
    Currently, Jool cannot post a packet for local reception because it switches the layer-3 protocol of the packet. Linux goes "This is an IPv6 packet, but it came from an IPv4-only interface. Dropping."
    This can maybe currently be forced to work, but I don't think it's going to be pretty.
    This was already implemented using namespaces.
  3. --minMTU6. We can't ask the kernel to fragment to a particular size; ip_fragment() infers the MTU from the cached route, which is not --minMTU6-sensitive (though whether that's not better than --minMTU6 is still to be looked into - another TODO).
    I decided to start deferring fragmentation to the kernel because the code is tricky to get right by ourselves and atrocious to learn and maintain.
    If we left Netfilter we would be free from the kernel's fragment representation and would be able to do it a lot easier.
    (though it would be best if the kernel exported a fragmentation function which received MTU as an argument, but that's not going to happen, particularly for old kernels.)
  4. Perhaps we would get rid of the need for two separate IPv4 addresses in stateful NAT64 mode. Not sure on this one; I need to think this through more thoroughly - TODO pool4 port ranges fix this.
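For the --minMTU6 item, this is roughly what "a fragmentation function which received MTU as an argument" would have to compute on the IPv6 side. It is a sketch in Python rather than kernel C, the function name is made up, and it assumes no extension headers other than the Fragment header; it only works out fragment sizes and does not touch real packets.

```python
IPV6_HEADER_LEN = 40      # fixed IPv6 header
FRAGMENT_HEADER_LEN = 8   # IPv6 Fragment extension header
IPV6_MIN_MTU = 1280       # smallest MTU an IPv6 link may have


def fragment_payload_sizes(payload_len, mtu):
    """Split payload_len bytes into per-fragment payload sizes for this MTU."""
    if mtu < IPV6_MIN_MTU:
        raise ValueError("IPv6 requires an MTU of at least 1280")

    # The packet fits as it is: no fragmentation needed.
    if IPV6_HEADER_LEN + payload_len <= mtu:
        return [payload_len]

    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IPV6_HEADER_LEN - FRAGMENT_HEADER_LEN) // 8 * 8

    sizes = []
    remaining = payload_len
    while remaining > per_fragment:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)
    return sizes


print(fragment_payload_sizes(3000, 1280))  # [1232, 1232, 536]
print(fragment_payload_sizes(3000, 1500))  # [1448, 1448, 104]
```

The MTU here is an explicit parameter, which is the whole point: the caller could pass the --minMTU6 value instead of whatever the cached route says.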

Less important but still worth mentioning:

  1. blacklist would be able to stop returning loopback and other evil addresses since, being far from pre-routing, Jool would naturally stop seeing these packets.

In my opinion, 1) is the most elegant option. This is because Host-Based Edge Translation forces the other options to include a dummy interface (so processes have an IPv4 address to snap themselves to). If an interface is necessary no matter the configuration, it would be cleanest if Jool itself "were" the interface.

Perhaps by adopting 2) we would attract new users who would not trust their kernels to us. On the other hand, it looks like a lot more work (I do not know to what extent Jool is married to kernel-only routines). It's also bound to make Jool somewhat slower, since packets need to be copied whenever they get in or out of kernelspace.

Other than perhaps getting rid of the pools, I think there's not much to be gained from 3). Though we will look more like NAT, we will probably face roughly the same limitations as a Netfilter module (or perhaps more, since I'm not sure how NF_HOOK_THRESH() would behave when called from an iptables module).

3) and 4) sound like the most performance-friendly options (since there's less routing and no copying), and I feel like their symmetry with the kernel's NATting would make them the most elegant solutions in the eyes of the kernel devs (which is important if we ever want to push Jool into Linux). I'm just guessing wildly, though. Perhaps they want to keep Netfilter free of any more hacks and would prefer one of the other options - TODO ask them.

Due to lack of experience, we're currently not aware of any roadblocks we might run into. More planning is necessary - TODO.

Criticism (of this post) and more ideas are welcome.

Omardyab commented 2 years ago

Would this work for MAP-E or DS-lite transition mechanism?

ydahhrk commented 2 years ago

@Omardyab No idea.

CodeFetch commented 2 years ago

@Omardyab From the client perspective it would. There is an out-of-tree device driver implementation of NAT46. https://github.com/ayourtch/nat46/blob/master/nat46/modules/README

ydahhrk commented 9 months ago

WIP: https://github.com/ydahhrk/joolif