oakestra / oakestra-net

Networking component of Oakestra
Apache License 2.0

Containerd and DNS on closed firewalls #178

Open smnzlnsk opened 1 week ago

smnzlnsk commented 1 week ago

Short

With the current configuration in Oakestra, DNS inside deployed containers works only if the container can reach the DNS server configured inside the container, via the bridge and the host. If the host machine cannot reach that DNS server because of a closed firewall policy, the container cannot resolve any hostnames, i.e. DNS is broken. This also means that the container completely disregards the host machine's working DNS configuration, if there is one.

Proposal

Rewrite DNS requests that originate from containers and arrive at the Oakestra bridge, using iptables DNAT + SNAT, so that they use the host machine's DNS configuration instead. The goal is for DNS requests (all traffic arriving at the Oakestra bridge with destination port 53) to be resolved by the host machine's DNS configuration.

For example: the container DNS is 8.8.8.8, but the host machine cannot reach 8.8.8.8 and uses the local ISP DNS instead. The container queries 8.8.8.8:53; the query leaves through the container bridge and arrives at the host machine firewall. The firewall uses DNAT to rewrite the destination to 127.0.0.53:53, which is usually the systemd-resolved stub resolver, and SNAT to rewrite the source to the host machine. The DNS query is then resolved like any DNS query from the host machine to the local ISP DNS, and the response is translated back through the SNAT and DNAT rules so that it reaches the container as if 8.8.8.8 had responded. Whether the connection to the resolved A record then works is out of scope for now. We want working DNS regardless of the DNS server configured in the container image.

Ratio

In general, this should make DNS work for containers regardless of the DNS server configured in the image, as long as the host machine has a working DNS configuration.

Impact

net-manager

Development time

one day

Status

in development

Checklist

Malyuk-A commented 1 week ago

As further proof, here are 3 tcpdumps that you can inspect via Wireshark. The zip contains 3 files (GitHub does not allow these files to be attached directly, but as a zip it works): pings.zip

These are the important bits:

This is a ping with a DNS resolution on the host VM. Everything works; the destination nameserver is the one specified on the host (normal behaviour). [image]

The image used for docker and containerd is the same, and its nameserver is 8.8.8.8. The same thing happens when the ping is run in a docker container: docker seems to map 8.8.8.8 to the proper host nameserver. [image]

When the ping is run in the containerd container, the original 8.8.8.8 nameserver is used and cannot be reached. [image]

Malyuk-A commented 1 week ago

@smnzlnsk looked into this and found a working workaround! 🎉 (we tried plenty of other things with little success)

On the VM do the following:

sudo iptables -t nat -A PREROUTING -i goProxyBridge -p udp --dport 53 -j DNAT --to-destination 131.159.254.1:53

The --to-destination has to be the IP of a nameserver that the host can actually reach, not the "loopback" one (i.e. 127.0.0.53). The loopback address does not work: the redirected queries get lost in the loopback (like an infinite sink).
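Since the address above is hard-coded, here is a rough sketch of how a reachable nameserver could be looked up on the host instead. It assumes the host runs systemd-resolved, which lists its upstream servers in /run/systemd/resolve/resolv.conf; the function name pick_upstream_dns is made up for illustration and is not part of oakestra-net. It only considers IPv4 entries and skips anything in 127.0.0.0/8.

```shell
#!/bin/bash
# Hypothetical helper, not part of oakestra-net.
# Prints the first IPv4 nameserver in a resolv.conf-style file that is not a
# loopback address (127.0.0.0/8). Returns non-zero if none is found.
# Assumption: on systemd-resolved hosts the upstream servers are listed in
# /run/systemd/resolve/resolv.conf (the default argument below).
pick_upstream_dns() {
    local file="${1:-/run/systemd/resolve/resolv.conf}"
    awk '
        $1 == "nameserver" &&
        $2 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ &&
        $2 !~ /^127\./ { print $2; found = 1; exit }
        END { exit !found }
    ' "$file"
}
```

It could then feed the rule, e.g. dns="$(pick_upstream_dns)" followed by the iptables command with --to-destination "${dns}:53". On hosts without systemd-resolved, passing /etc/resolv.conf as the argument should work the same way.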

He wrote two scripts that set and unset this rule:

#!/bin/bash
# set.sh
sudo iptables -t nat -A PREROUTING -i goProxyBridge -p udp --dport 53 -j DNAT --to-destination 131.159.254.1:53

#!/bin/bash
# unset.sh
sudo iptables -t nat -D PREROUTING -i goProxyBridge -p udp --dport 53 -j DNAT --to-destination 131.159.254.1:53
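One thing to watch with set.sh: running it twice appends the rule twice. Below is a possible idempotent variant, only a sketch. It uses iptables -C to check whether the rule exists before adding or deleting it. It also adds a TCP rule, which the scripts above do not have; that is my addition, on the grounds that DNS falls back to TCP port 53 for large responses. The function names and the BRIDGE / IPTABLES variables are hypothetical and only there to keep the example self-contained.

```shell
#!/bin/bash
# Hypothetical helpers, not part of oakestra-net.
BRIDGE="${BRIDGE:-goProxyBridge}"        # bridge name taken from the scripts above
IPTABLES="${IPTABLES:-sudo iptables}"    # overridable, e.g. for a dry run

# Prints the rule specification for protocol $1 and nameserver IP $2.
dns_rule() {
    echo "PREROUTING -i $BRIDGE -p $1 --dport 53 -j DNAT --to-destination $2:53"
}

# Adds the DNAT rule for UDP and TCP unless it is already present.
set_dns_redirect() {
    local dns="$1" proto
    for proto in udp tcp; do
        # $(dns_rule ...) is intentionally unquoted so it splits into arguments.
        # -C exits 0 if the rule exists; only append when it does not.
        $IPTABLES -t nat -C $(dns_rule "$proto" "$dns") 2>/dev/null ||
            $IPTABLES -t nat -A $(dns_rule "$proto" "$dns")
    done
}

# Deletes the DNAT rule for UDP and TCP if it is present.
unset_dns_redirect() {
    local dns="$1" proto
    for proto in udp tcp; do
        if $IPTABLES -t nat -C $(dns_rule "$proto" "$dns") 2>/dev/null; then
            $IPTABLES -t nat -D $(dns_rule "$proto" "$dns")
        fi
    done
}
```

Usage would be set_dns_redirect 131.159.254.1 and later unset_dns_redirect 131.159.254.1. Note that the unset side removes one copy of the rule per call, so duplicates left over from the original set.sh would need repeated calls.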