genodelabs / genode

Genode OS Framework
https://genode.org/

Virtual NAT #114

Closed nfeske closed 7 years ago

nfeske commented 12 years ago

For sharing one physical network interface among multiple applications, Genode comes with a component called nic_bridge, which implements proxy ARP. Through this component, each application receives a distinct (virtual) network interface that is visible to the real network. That is, each application obtains its own IP address via a DHCP request at the local network. An alternative approach would be a component that implements NAT on Genode's NIC session interface. This way, the whole Genode system would use only one IP address visible to the local network. By stacking multiple NAT and nic_bridge components together, we could even form complex virtual networks inside a single Genode system.

The implementation of the virtual NAT could follow the lines of the existing nic_bridge component. For parsing network packets, there are already some handy utilities available (at os/include/net/).

blitz commented 12 years ago

Please don't break the Internet. :) NAT is evil. If there is a good use case for sharing an IP, then the IMHO better way would be a service where an application can register itself to receive certain traffic (e.g. proto=TCP, srcport=1234, dstport=80 for an HTTP connection).
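To make the registration idea concrete, here is a minimal sketch of such a service's lookup table. The names `Filter` and `TrafficRegistry` are hypothetical and only illustrate the proposal; nothing like this exists in Genode as far as this thread shows. A port left as `None` is assumed to match any port.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Filter:
    """Hypothetical traffic filter an application registers with the service."""
    proto: str
    srcport: Optional[int] = None  # None matches any port (assumption)
    dstport: Optional[int] = None

    def matches(self, proto, srcport, dstport):
        return (self.proto == proto
                and self.srcport in (None, srcport)
                and self.dstport in (None, dstport))


class TrafficRegistry:
    """Maps registered filters to application names; first match wins."""

    def __init__(self):
        self._entries = []  # list of (Filter, application name)

    def register(self, flt, app):
        self._entries.append((flt, app))

    def lookup(self, proto, srcport, dstport):
        for flt, app in self._entries:
            if flt.matches(proto, srcport, dstport):
                return app
        return None  # no application asked for this traffic
```

An application would register `Filter("TCP", srcport=1234, dstport=80)` and the service would hand it only packets for which `lookup` returns its name.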

skalk commented 12 years ago

Well, as long as we live in environments with an IPv4 LAN subnet, IP addresses get exhausted quickly when each application owns its own IP address. But I agree, with recursive NATing things get out of control. And mostly we're sitting behind a NAT router already.

I think your proposal sounds feasible. Of course, such a component has to implement some general stuff like: providing to the client-applications so they can be left untouched (still using their own TCP/IP stack), acting as DHCP client to the outer network, responding to ARP requests, and potentially ICMP. Alternatively, the policy of that component describes which client is responsible for these lower-level protocols. I think the policy you've already described (e.g. dstport=...) is needed regardless of whether full NAT or such a kind of "lightweight NAT" is implemented, at least as long as incoming connections are desired. Otherwise, the NAT component wouldn't know where to forward incoming packets to (e.g. a web server running as a client app).

blitz commented 12 years ago

TCP/IP stacks need to use unique srcports. So you can either

ICMP handling is split between TCP/IP stack and service. Regarding ARP: Ideally the applications are unaware of the L1 protocol (be it 802.3/Ethernet, 802.11, ...). This would imply changing the TCP/IP stack as well.

There are some plans for a multiserver-OS friendly TCP/IP stack in the back of my head. But this won't happen this year in any case.

skalk commented 12 years ago

I think trusting the client to choose the right src-ports is no option. At least one has to inspect whether the client uses the right src-ports. Moreover, I don't think translating the src-ports transparently can be compared with the complexity of a fully featured NAT implementation (with handling different MTUs, IP fragmentation, etc.).

With respect to your thoughts of changing the TCP/IP stack in general: I think it would be desirable to use it on top of the nic-bridge (ARP-Proxy) or lwNAT (my projectname for: lightweight NAT) without the need of linking it with different TCP/IP libraries. My idea is something like the following:

There might be some corner cases I don't see right now, but I think it's feasible to implement such a component without ending up with something much more complex than the already existing ARP proxy aka nic-bridge. Some very simple parsing routines for the mentioned network layers already exist in the nic-bridge, and should be sufficient.

Although I am champing at the bit to put lwNAT into practice, my priorities are different right now. But I think this is a cool, self-contained, and concise project for people who want to get in touch with Genode. @blitz: if nobody cares about it till then, we'll meet in a year and make a multiserver-OS network hack competition ;)

blitz commented 12 years ago

@skalk I still find the Genode Hiking week/weekend somewhere in the mountains a good idea. Hiking, hacking, relaxing. There would be lots of time to discuss stuff like this. ;-)

ghost commented 12 years ago

A hiking and hacking weekend in the mountains would be awesome! I loved the idea when we talked about it at Grand Place in Brussels :)

idgy commented 12 years ago

I want to like this thread =)

chelmuth commented 12 years ago

+1 from me too ;-)

Logout22 commented 9 years ago

My diploma thesis treated this topic (as I already mentioned on the list). chelmuth asked me to link my source code here, so here goes nothing: https://github.com/Logout22/buildrump.sh

I reckon the project will need a complete rewrite to fulfil Genode standards, but maybe one idea or another from the code is useful on the way there.

Cheers!

NobodyIII commented 9 years ago

What's the status on this? It would be good to see this implemented.

nfeske commented 9 years ago

As far as I know, no one has started any development yet.

Logout22 commented 9 years ago

NobodyIII: You are free to go, I am happy to provide counselling. Regarding my personal schedule I am still months away from even starting work, sorry about that.

m-stein commented 8 years ago

I'm currently working on a virtual NAT component. My first goal is to do simple port forwarding. Therefore, I created a scenario with a NIC bridge that has two clients, an HTTP client and my NAT component. The NAT has one client, an HTTP server. So the HTTP client and the NAT are in the "public" net and the HTTP server is in a "private" subnet, reachable only through the NAT IP + port 80. I'm building the NAT based on a copy of the NIC bridge implementation.

chelmuth commented 8 years ago

I like your approach as it needs no "outer" networking configuration and runs just on Genode. But, please be aware that nic_bridge is an aged component and needs significant reworking at various places - mind the current issue reported on the mailing list. So, don't let the current implementation guide you into false directions ;-)

m-stein commented 8 years ago

@chelmuth Thank you for the hint.

m-stein commented 8 years ago

The NAT component now reads routing rules from its config and forwards TCP packets from the public to the private net accordingly. The HTTP server in the private net answers with an ARP request for the HTTP client. This is the next issue to solve.

m-stein commented 8 years ago

My small HTTP test works with the NAT component doing port forwarding on public requests and source NAT on replies from the subnet. I've also cleaned up the code a little bit and removed the redundant declaration of Mac_address in include/net and include/nic_session respectively. Next issues:

skalk commented 8 years ago

@m-stein: did you test the nic_bridge once again after removing the Mac_address? Originally, the Mac_address in nic_session represented the byte array only, whereas the extended version in net/netaddress.h defines copy constructors, assignment and equality operators, and, as a template, works not only for MAC addresses but for IPv* addresses too.

m-stein commented 8 years ago

As I'm using the nic_bridge component in my NAT test, yes, but test-nicbridge_static also still works. By now, I found no technical reason for not using the Net MAC address in the NIC session. But maybe I've missed a thing?

m-stein commented 8 years ago

I've merged the back-end code of the uplink and the other interfaces (Session Components) in the NAT as far as possible. Most notably, the configuration interface, the routing decisions, and the packet modification are now generic over all interfaces.

From its config, the NAT component now takes and applies an IP routing table in the following form:

<route ip_addr="0.0.0.0"      netmask="0.0.0.0"       interface="uplink"     gateway="10.0.5.1" />
<route ip_addr="10.0.2.0"     netmask="255.255.255.0" interface="uplink"     gateway="0.0.0.0"  />
<route ip_addr="192.168.1.18" netmask="255.255.0.0"   interface="http_srv_1" gateway="0.0.0.0"  />

If the gateway attribute is not given (0.0.0.0), an affected packet is sent directly to its destination IP.
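The lookup this table implies can be sketched as follows. This is an illustration only, not the component's code: the comment does not say how overlapping routes are ranked, so the sketch assumes the most specific netmask wins, and `next_hop` is a made-up helper name.

```python
import ipaddress

# (ip_addr, netmask, interface, gateway), copied from the table above
ROUTES = [
    ("0.0.0.0",      "0.0.0.0",       "uplink",     "10.0.5.1"),
    ("10.0.2.0",     "255.255.255.0", "uplink",     "0.0.0.0"),
    ("192.168.1.18", "255.255.0.0",   "http_srv_1", "0.0.0.0"),
]


def to_int(addr):
    return int(ipaddress.IPv4Address(addr))


def next_hop(dst, routes=ROUTES):
    """Return (interface, next-hop IP) for dst, or None if no route matches."""
    best = None
    for ip_addr, netmask, interface, gateway in routes:
        mask = to_int(netmask)
        if to_int(dst) & mask != to_int(ip_addr) & mask:
            continue  # dst is outside this route's network
        # Assumption: the most specific (numerically largest) netmask wins.
        if best is None or mask > best[0]:
            best = (mask, interface, gateway)
    if best is None:
        return None
    _, interface, gateway = best
    # Gateway 0.0.0.0 means "not given": deliver directly to the destination.
    return interface, (dst if gateway == "0.0.0.0" else gateway)
```

With these routes, a packet to 10.0.2.7 leaves via "uplink" directly, while a packet to an address covered only by the 0.0.0.0 route is handed to the gateway 10.0.5.1.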

The available interfaces are configured as follows:

<policy label="uplink"      proxy="0"                 nat_ip_addr="10.0.2.55"   ip_addr="192.168.1.72" />
<policy label="http_srv_1"  proxy="1" proxy_ports="0" nat_ip_addr="192.168.1.1" ip_addr="192.168.1.18" />

The interface name "uplink" is reserved for the only session client that the NAT creates. The other interface names refer to the NIC session labels.

The NAT needs a proxy_ports attribute for each interface with proxy="1". For outgoing traffic from an interface with proxy="1", the NAT component does source NAT. The number of proxy source ports that an interface can occupy at once is limited through its proxy_ports attribute to avoid denial of service attacks. All ports configured for port forwarding get removed from the proxy port allocator to avoid clashes.

Furthermore, each interface has a nat_ip_addr attribute that tells the NAT which identity to use when talking to the interface.
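The per-interface port limit described above could look roughly like this. It is a simplified sketch: the class name `PortAllocator`, the port range, and the first-free allocation order are assumptions, not taken from the NAT component.

```python
class PortAllocator:
    """Pool of NAT source ports shared by all interfaces (illustrative)."""

    def __init__(self, first, last, forwarded_ports=()):
        forwarded = set(forwarded_ports)
        # Ports configured for port forwarding never enter the pool.
        self._free = [p for p in range(first, last + 1) if p not in forwarded]
        self._used = {}  # interface label -> set of allocated ports

    def alloc(self, label, quota):
        """Hand out a port, or None if the quota or the pool is exhausted."""
        used = self._used.setdefault(label, set())
        if len(used) >= quota or not self._free:
            return None
        port = self._free.pop(0)
        used.add(port)
        return port

    def free(self, label, port):
        """Return a port to the pool once its proxy link is gone."""
        self._used[label].remove(port)
        self._free.append(port)
```

Here `quota` plays the role of the proxy_ports value: an interface with proxy_ports="0" can never obtain a proxy port, and no single interface can drain the pool.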

m-stein commented 8 years ago

The NAT keeps track of proxified links through objects that occupy resources at the guarded RAM allocator of the corresponding NIC session component as well as the NAT port allocator.

These proxy-link objects now get destructed as soon as they are not needed anymore. For proxified TCP, the NAT keeps track of FIN packets and corresponding ACK packets. As soon as both sides of the link have sent a FIN and got ACKed, the NAT starts a timeout of two times the round trip time, after which the proxy-link object gets removed. The round trip time is currently set statically through the NAT config.
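The teardown rule can be sketched as a small state tracker. This is a simplification under stated assumptions: the caller decides whether a packet acknowledges the peer's FIN (a real implementation would compare sequence numbers), and `TcpProxyLink` with its "client"/"server" side names is hypothetical.

```python
class TcpProxyLink:
    """Tracks FIN/ACK from both sides of a proxied TCP link (illustrative)."""

    def __init__(self, rtt):
        self.rtt = rtt  # statically configured round trip time
        self.fin_seen = {"client": False, "server": False}
        self.fin_acked = {"client": False, "server": False}
        self.expires_at = None  # set once both FINs are acknowledged

    def on_packet(self, sender, fin, ack_of_peer_fin, now):
        peer = "server" if sender == "client" else "client"
        if fin:
            self.fin_seen[sender] = True
        # An ACK only counts if the peer has actually sent its FIN.
        if ack_of_peer_fin and self.fin_seen[peer]:
            self.fin_acked[peer] = True
        if self.expires_at is None and all(self.fin_acked.values()):
            self.expires_at = now + 2 * self.rtt

    def expired(self, now):
        """True once the link object may be removed."""
        return self.expires_at is not None and now >= self.expires_at
```

Keeping the object for two round trip times after the last ACK leaves room for retransmitted FINs or ACKs to still be translated.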

skalk commented 8 years ago

IMO it would be preferable to define routing rules per session and not globally. For example, <route>...</route> rules might be subentries of <policy></policy> tags. Currently, I work on a TOR scenario, where I have a TOR component and a VM sharing an IP subnet, whereby the VM shall only connect to the TOR component, but the TOR component can also send to the outer world ("uplink" of the NAT component). With the above configuration, it is not possible to isolate the VM accordingly, or otherwise the NAT component wouldn't be able to access the uplink. A component-wise routing policy would allow building such scenarios. Alternatively, if we fear getting too many routing rules that are only duplicates, we could introduce a default-route analogous to the init configuration.

I've some additional remarks regarding the configuration definitions:

<policy label="uplink">
  <route dst="192.168.1.18/16" src="192.168.1.1" label="http_srv"  />
</policy>

<policy label="http_srv" ports="0">
  <route dst="10.0.2.0/24" src="10.0.2.55" label="uplink"  />
  <route dst="0.0.0.0/0" src="10.0.2.55" gateway="10.0.5.1" label="uplink" />
</policy>

m-stein commented 8 years ago

I've moved the routing rules into the policy tags, made the gateway attribute optional and replaced ip_addr/netmask by dst="x.x.x.x/x". Instead of proxy and proxy_ports attribute there is now only the proxy attribute. If set, the source ports of the interface are proxified (except for packets whose source port is a "port forwarding" port). The value of the proxy attribute is the maximum number of NAT ports the interface may occupy.
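For readers unfamiliar with the dst="x.x.x.x/x" notation, the following sketch shows how such a value maps to the former ip_addr/netmask pair, using Python's standard `ipaddress` module. The helper names `parse_dst` and `dst_matches` are made up for this illustration and say nothing about how the NAT component parses its config.

```python
import ipaddress


def parse_dst(dst):
    """Split a dst="x.x.x.x/x" value into (network address, prefix length)."""
    # strict=False tolerates host bits in the address, e.g. "192.168.1.18/16".
    net = ipaddress.ip_network(dst, strict=False)
    return str(net.network_address), net.prefixlen


def dst_matches(dst, ip):
    """True if the IP address falls into the network described by dst."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(dst, strict=False)
```

For example, dst="10.0.2.0/24" is equivalent to ip_addr="10.0.2.0" with netmask="255.255.255.0", and dst="0.0.0.0/0" matches every address.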

skalk commented 8 years ago

@m-stein great, thank you for adapting the configuration!

m-stein commented 8 years ago

A NAT config now may look as follows

            <policy label="uplink" src="10.0.2.55" ip_addr="192.168.1.72" port="8080">
                <route dst="192.168.1.18/32" label="http_srv_1" />
                <route dst="192.168.1.72/32" label="http_srv_2" />
                <route dst="10.0.2.55/32">
                    <tcp-port nr="80"   label="http_srv_1"/>
                    <tcp-port nr="8080" label="http_srv_2"/>
                    <tcp-port nr="2345" label="http_clnt_3"/>
                    <udp-port nr="80"   label="http_srv_1"/>
                </route>
            </policy>
            <policy label="http_srv_1" src="192.168.1.1" proxy="3" ip_addr="192.168.1.18">
                <route dst="10.0.0.0/19"   label="uplink" gateway="10.0.6.1" />
                <route dst="10.0.2.128/25" label="uplink" gateway="10.0.3.1" />
                <route dst="10.0.2.0/24"   label="uplink">
                    <udp-port nr="1234" label="http_clnt_3"/>
                </route>
            </policy>
            <policy label="http_srv_2" src="192.168.1.1" proxy="3" ip_addr="192.168.1.72">
                <route dst="10.0.2.0/24" label="uplink" />
            </policy>
            <policy label="http_clnt_3" src="100.200.0.1" proxy="3" ip_addr="100.200.0.128">
                <route dst="10.0.6.0/23" label="uplink" gateway="10.0.4.1" />
                <route dst="10.0.2.0/24" label="uplink" />
            </policy>

The label attribute in the route tag is optional. If it is not given and the route tag has no port subtags, the route has no effect. If a route has port subtags and a label attribute, the label of the route tag is the default destination for all ports that don't match a port subtag. If the route tag has no label attribute but port subtags, only the given ports get forwarded.
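The three label rules above can be condensed into one small function. The dictionary layout and the name `resolve` are assumptions made for this sketch; it only covers the step after a packet has already matched a route's dst.

```python
def resolve(route, proto, port):
    """Return the destination label for a packet that matched `route`, or None.

    `route` is a dict {"label": str or None, "ports": {(proto, nr): label}},
    a hypothetical in-memory form of a <route> tag and its port subtags.
    """
    ports = route.get("ports", {})
    if (proto, port) in ports:
        return ports[(proto, port)]  # a port subtag takes precedence
    # No matching subtag: fall back to the route's own label, if it has one.
    # Without a label this yields None, i.e. the packet is not forwarded.
    return route.get("label")


# The 10.0.2.55/32 route of the "uplink" policy above: subtags only, no label.
UPLINK_ROUTE = {
    "label": None,
    "ports": {
        ("tcp", 80): "http_srv_1",
        ("tcp", 8080): "http_srv_2",
        ("tcp", 2345): "http_clnt_3",
        ("udp", 80): "http_srv_1",
    },
}
```

For `UPLINK_ROUTE`, TCP port 8080 resolves to "http_srv_2", whereas a port without a subtag resolves to nothing because the route has no default label.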

The next goal is to handle ARP broadcasts correctly according to the routing rules.

m-stein commented 8 years ago

I was able to remove some bugs that occurred sporadically, removed some deprecated stuff like the policy attribute ip_addr, and rebased to staging. Next, I'll extend the tests to UDP transfers and try to enable a netperf-nat test.

m-stein commented 8 years ago

The NAT can now be used as UDP proxy. Therefore, it stores a UDP-proxy-state for two times the round trip time. This way, client and server can hold a "UDP connection" open by frequently sending keep-alive messages.
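A rough sketch of such a timed UDP state follows. It assumes that every packet on the link restarts the timeout, which the comment implies but does not state explicitly, and `UdpProxyState` is a hypothetical name.

```python
class UdpProxyState:
    """Proxy state that lives for two round trip times (illustrative)."""

    def __init__(self, rtt, now):
        self.rtt = rtt
        self.expires_at = now + 2 * rtt

    def on_packet(self, now):
        """Refresh the state; returns False if it had already expired."""
        if now >= self.expires_at:
            return False  # too late, the NAT has dropped the state
        # Assumption: any packet, including a keep-alive, restarts the timer.
        self.expires_at = now + 2 * self.rtt
        return True
```

As long as keep-alives arrive less than two round trip times apart, the state never expires and replies keep finding their way back to the client.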

m-stein commented 8 years ago

The next step is to provide two attributes "via" and "to" in the ip routes and port routes (currently there's only a "via"). The "via" attribute tells the NAT which IP address to use when asking for the destination MAC address for a matching packet (ARP), whereas the "to" attribute defines the destination IP address that is directly installed at matching packets.

skalk commented 8 years ago

@m-stein: the commit 13df330 that changes the MAC address class at least breaks the dde-ipxe NIC driver. It does not compile anymore.

skalk commented 8 years ago

@m-stein: fixup for above problem (only looked at dde_ipxe!!) is here: 6b00c9bcce2c6c6da691d53f7bad4bc6e4a872c0