rootless-containers / slirp4netns

User-mode networking for unprivileged network namespaces
GNU General Public License v2.0

Outbound IP seems to be broken #267

Closed Diniboy1123 closed 3 years ago

Diniboy1123 commented 3 years ago

Hi,

I have discovered the following issue while using podman, therefore the original issue report with the details can be found here: https://github.com/containers/podman/issues/10463

However to narrow down the issue I have decided to directly try outbound IPs with slirp4netns. I have a wg0 interface with the 10.65.203.10 IP address which I would like to use as slirp4netns' outbound IP. So I have followed the README.md and it seems to get up fine:

$ slirp4netns --configure --mtu=1420 --disable-host-loopback $(cat /tmp/pid) --outbound-addr=10.65.203.10 tap0
WARNING: Support for --outbount-addr is experimental
sent tapfd=5 for tap0
received tapfd=5
Starting slirp
* MTU:             1420
* Network:         10.0.2.0
* Netmask:         255.255.255.0
* Gateway:         10.0.2.2
* DNS:             10.0.2.3
* Recommended IP:  10.0.2.100
* Outbound IPv4:    10.65.203.10

In another terminal I see this:

# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: tap0: <BROADCAST,UP,LOWER_UP> mtu 1420 qdisc fq_codel state UNKNOWN group default qlen 1000
    link/ether 1e:c8:eb:ae:9e:a8 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.100/24 brd 10.0.2.255 scope global tap0
       valid_lft forever preferred_lft forever
    inet6 fe80::1cc8:ebff:feae:9ea8/64 scope link
       valid_lft forever preferred_lft forever

But curl 1.1.1.1 (or anything else) fails within the netns. I cannot even resolve DNS records using dig; everything times out. If I run tcpdump on wg0 in the meantime, I see no trace of slirp4netns (or anything else) even trying to send packets.

Without the --outbound-addr flag everything works fine, though.

I am on

slirp4netns version 1.1.8+dev
commit: 6dc0186e020232ae1a6fcc1f7afbc3ea02fd3876
libslirp: 4.4.0
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.0

Any clue what I could do?

Thank you for the amazing software though!

Diniboy1123 commented 3 years ago

I have cloned the repo and tried to run the outbound-addr test, which didn't fail. I did some experimenting, and slirp4netns works if I run it as a user whose default route is set to the outbound address of my choice:

# ip rule list
32765:  from all uidrange 981-981 lookup vpn
# ip route list table vpn
default dev wg0 scope link

So basically everything running under user 981 will be routed through the wg0 interface. And if I then run slirp4netns with the outbound IP address of the wg0 interface, it works. So maybe it's a routing issue? And the tests don't fail because I have routes to 127.0.0.1 and to the external IP of the server, which is what the script tries to set?
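For reference, the setup above can be sketched as a small root script. This is only a sketch of what the listings imply: it assumes uid 981, a table named vpn already declared in /etc/iproute2/rt_tables, and the wg0 interface from the output above.

```shell
#!/bin/sh
# Sketch of the per-UID policy-routing workaround described above.
# Assumptions: uid 981, a "vpn" entry in /etc/iproute2/rt_tables,
# and a wg0 interface that is already up. Run the functions as root.

setup_uid_routing() {
    # Table "vpn": send everything out through the WireGuard interface.
    ip route add default dev wg0 scope link table vpn
    # Processes running as uid 981 consult table "vpn" for route lookups.
    ip rule add uidrange 981-981 lookup vpn
}

teardown_uid_routing() {
    ip rule del uidrange 981-981 lookup vpn
    ip route del default dev wg0 scope link table vpn
}
```

Anything spawned as that uid (including slirp4netns) then resolves its default route through wg0, which matches the behaviour described here.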

5eraph commented 3 years ago

Hi @Diniboy1123, I will have to check and reproduce. But so far it seems strange that you have to add a default route for the user. Outbound addr works by binding outbound sockets to the specified interface/IP.

I wonder, did you try to run without the custom ip route and with the outbound address set to an IP instead of an interface name?

Diniboy1123 commented 3 years ago

I wonder, did you try to run without the custom ip route and with the outbound address set to an IP instead of an interface name?

Yes, originally I didn't have any custom ip routes. The IP I provided really belongs to the interface of my choice. And I tried specifying both the interface name and the IP as well. Neither worked.

Also, I just noticed there is a typo here: WARNING: Support for --outbount-addr is experimental

5eraph commented 3 years ago

Is wg0 a WireGuard interface? Or a physical one with a custom naming policy?

Diniboy1123 commented 3 years ago

It's a wireguard interface.

5eraph commented 3 years ago

Hmm, I would not be surprised if WireGuard routing plays a role. (I still haven't had time to test... I plan to do that later today.)

Diniboy1123 commented 3 years ago

Could be. I am using wg-quick to bring up the interface with Table = off, so technically it shouldn't insert any default routes or anything, but I'm not sure.

5eraph commented 3 years ago

Hi. It took a little bit longer, but I managed to test outbound addr on a VPS with 2 public IPv4 addresses. All seems to work as expected.

I do not have experience with WireGuard, so no advice for you right now.

But I think you are right: this is a routing issue. But it is not related to outbound addr; instead, it is related to your setup.

To explain: libslirp creates a tap interface within the container, and all traffic is routed virtually within the slirp4netns process. All outbound connections are open sockets (in the case of TCP; not sure about UDP). The outbound addr is applied by binding the created sockets to the specified IP instead of to the default one. No additional routes or iptables rules are configured to achieve this.

And I am not sure whether there is something we could do about this, because how would we determine which route to use? This is out of scope of the container.
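One way to see this from the host (a diagnostic sketch; 1.1.1.1 and 10.65.203.10 are just the destination and the wg address from the report above): `ip route get` prints the route the kernel would pick, and adding `from <addr>` shows that a bound source address does not change the selection unless an `ip rule ... from <addr>` policy rule matches it.

```shell
# Route the kernel would pick for the destination (typically the default route):
ip route get 1.1.1.1 || true
# Same lookup as seen by a socket bound to the wg address; without a matching
# "ip rule ... from 10.65.203.10" this still resolves to the default route,
# so the bound packets never reach the wg interface.
ip route get 1.1.1.1 from 10.65.203.10 || true
```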

Anyway, did you try curl --interface ... from the same user you run the container under and with your wg interface?

Diniboy1123 commented 3 years ago

Anyway, did you try curl --interface ... from the same user you run the container under and with your wg interface?

Yes, and it works just fine that way. Ping too, if I specify the interface to use. In the original podman issue I even copy-pasted the output: https://github.com/containers/podman/issues/10463

5eraph commented 3 years ago

Do you run with a custom uid range (#272)? If so, please try to run with the default.

Can you prepare a script which would set up your scenario so I would be able to reproduce it? I do not think I will be able to tell what's going on without getting hands-on.

Diniboy1123 commented 3 years ago

Alright, so just to make sure, I decided to test on a fresh Fedora installation. I decided to use Cloudflare WARP as my WireGuard endpoint, since it's free and therefore easier to reproduce my environment: https://github.com/ViRb3/wgcf

[root@fedora ~]# wg-quick up /home/user/wgcf-profile.conf
[#] ip link add wgcf-profile type wireguard
[#] wg setconf wgcf-profile /dev/fd/63
[#] ip -4 address add 172.16.0.2/32 dev wgcf-profile
[#] ip -6 address add fd01:5ca1:ab1e:8d6b:61a:e82c:e54e:b2c0/128 dev wgcf-profile
[#] ip link set mtu 1280 up dev wgcf-profile

Then I did what was suggested in the readme of the project:

[user@fedora ~]$ slirp4netns --configure --mtu=65520 --disable-host-loopback $(cat /tmp/pid) --outbound-addr=wgcf-profile tap0
WARNING: Support for --outbount-addr is experimental
sent tapfd=5 for tap0
received tapfd=5
Starting slirp
* MTU:             65520
* Network:         10.0.2.0
* Netmask:         255.255.255.0
* Gateway:         10.0.2.2
* DNS:             10.0.2.3
* Recommended IP:  10.0.2.100
* Outbound IPv4:    172.16.0.2
[root@fedora ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
3: tap0: <BROADCAST,UP,LOWER_UP> mtu 65520 qdisc fq_codel state UNKNOWN group default qlen 1000
    link/ether fe:4e:8b:24:89:5a brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.100/24 brd 10.0.2.255 scope global tap0
       valid_lft forever preferred_lft forever
    inet6 fe80::fc4e:8bff:fe24:895a/64 scope link
       valid_lft forever preferred_lft forever
[root@fedora ~]# curl --interface tap0 1.1.1.1
^C
[root@fedora ~]# curl 1.1.1.1
^C
[root@fedora ~]# ip route list
default via 10.0.2.2 dev tap0
10.0.2.0/24 dev tap0 proto kernel scope link src 10.0.2.100
[root@fedora ~]#

It all seemingly fails. However, curl works from the host:

[user@fedora ~]$ curl --interface wgcf-profile -I 1.1.1.1
HTTP/1.1 301 Moved Permanently
Date: Thu, 05 Aug 2021 11:14:20 GMT
Content-Type: text/html
Content-Length: 167
Connection: keep-alive
Location: https://1.1.1.1/
Server: cloudflare
CF-RAY: 679f99edbd2e4dd0-FRA

[user@fedora ~]$ ping -I wgcf-profile 1.1.1.1
PING 1.1.1.1 (1.1.1.1) from 172.16.0.2 wgcf-profile: 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=64 time=1.02 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=64 time=1.14 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=64 time=1.08 ms
^C
--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 1.022/1.080/1.139/0.047 ms

5eraph commented 3 years ago

I should be able to test on the weekend. I will let you know how it went.

Diniboy1123 commented 3 years ago

Appreciate your time and efforts.

ValHeimer commented 3 years ago

Do you run with custom uid range? (#272) if so please try to run with default.

I just added a comment to my issue https://github.com/rootless-containers/slirp4netns/issues/272#issuecomment-894352811 It was a mistake in my sysctl config, sorry...

5eraph commented 3 years ago

Hi @Diniboy1123, I just tested the setup on Ubuntu. I do not believe this is slirp4netns- or outbound_addr-related. The thing is that wg redirects traffic using a default route and packet marking. In my case, if I set the allowed IPs to 0.0.0.0/0, everything passed through wg and nothing through the default interfaces, even though I was bound with outbound_addr to the native interface. If I set the allowed IPs to something else, I ended up with packets being routed through the default interfaces but not passed through wg. I did not manage to find a way to pass traffic exclusively through the wg interface. (I never used wg before, tbh.)

Anyway, I am not sure what you are trying to achieve, but I am not sure whether this is the right way. I would expect you to spawn the container/netns, set up the wg interface inside it, and launch anything you want within that ns.

So instead of running a container that joins the wg interface, you would launch wg inside that container. Although I can see that the overhead of such an operation probably won't be negligible, so in a multi-container scenario it may be worth running one central wg. But unfortunately, for now I am out of ideas on how to help you further.
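For completeness, the "wg inside the netns" approach usually follows WireGuard's namespace-integration pattern: create the interface in the host namespace, then move it into the target namespace. This is a hypothetical sketch only; the netns name, interface name, config path, and address below are made up, and it requires root to set up (which is why it does not fit rootless containers directly).

```shell
#!/bin/sh
# Sketch (run as root): create the WireGuard interface in the host namespace,
# then move it into a named network namespace so that only traffic from that
# namespace uses the tunnel. All names below are illustrative.

wg_into_netns() {
    ip link add wg1 type wireguard
    wg setconf wg1 /etc/wireguard/wg1.conf   # hypothetical config path
    ip link set wg1 netns container          # move it into netns "container"
    ip -n container address add 172.16.0.2/32 dev wg1
    ip -n container link set wg1 up
    ip -n container route add default dev wg1
}
```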

Diniboy1123 commented 3 years ago

I appreciate that you checked it out. As I mentioned earlier, I set the Table = off parameter in the conf. That makes it so that no default routes are added.

If you scroll back to my previous comment, you can see the commands executed by wg-quick when you set Table = off:

[root@fedora ~]# wg-quick up /home/user/wgcf-profile.conf
[#] ip link add wgcf-profile type wireguard
[#] wg setconf wgcf-profile /dev/fd/63
[#] ip -4 address add 172.16.0.2/32 dev wgcf-profile
[#] ip -6 address add fd01:5ca1:ab1e:8d6b:61a:e82c:e54e:b2c0/128 dev wgcf-profile
[#] ip link set mtu 1280 up dev wgcf-profile

Basically it brings up a new interface, applies the wg configuration to it, then sets the IP addresses and the MTU. Nothing else should be set: no routing rules or anything. My ip route show doesn't contain any routes added by the wg interface once I bring it up this way, nor do I see any iptables rules that would be used to mark packets.

Then I can just curl anything with the --interface option, or ping, or do whatever I want; most software that supports an outbound addr setting works with it out of the box. Only slirp4netns doesn't.

My use case is pretty simple. I own a dedicated server hosted on a network that has really bad peering to my home ISP, for example. I decided to pay for a VPN and set it up on the host so I can route all the container traffic through it. Then I can bind all my rootless containers to this interface so they will be exposed through the VPN and therefore have better peering. Sadly, slirp4netns doesn't seem to work. I would prefer not to set wg up within a container as you suggested, as that wouldn't work with a rootless container; you need it to be privileged at the very least. So why not bring up the wg0 interface on the host as root and then give it to each and every container individually?

Right now I could work around this so that I have wg0 on the host and the containers use wg0 as their default route: I created a new routing table, created a new user called vpn-containers, and used the uidrange for that user on the table. Within the table I specified wg0 as the default route. So basically, if I spawn a process owned by vpn-containers, it will automatically go through wg0, as this is its default route, while for the other users the default route remains the same as before. It works just like I imagined, though this solution requires a routing table and an extra user, so I cannot put it into my traditional "containers" user's pods.

5eraph commented 3 years ago

Ah right, my bad. I tested with Table = off, but my traffic is still forced through the default interface, even though the connection is bound to the IP assigned to the wg interface. So now I can replicate it... but I'm not sure what to do with this yet 😕

5eraph commented 3 years ago

I will check the packets coming from slirp. Maybe there are some additional clues.

5eraph commented 3 years ago

Ok, this is interesting. Packets coming from slirp have the correct IP in the header, but they won't get to the wg interface; rather, they are dropped onto the native/primary interface. If I test with outbound-addr bound to the native interface (multiple IPs on the primary interface), it works as expected. Same if I set the outbound to lo: packets appear on the lo interface. So now "just" figure out how to get these packets, bound to the wg addr, off the primary interface 🤔

Actually, no. Anything targeting local IPs goes through lo. So apparently outbound-addr works only if the IP addresses are on the route...

Diniboy1123 commented 3 years ago

Nice catch. Do you have any idea how we could resolve this?

5eraph commented 3 years ago

Yeah, this is definitely routing. The kernel forwards packets according to the route. Even though we have the wg IP, we are still trying to reach something outside, so we fall into the default route. I think the solution would be on the wg side: fwmark packets coming from the wg subnet and route the marked packets through wg. Heh, I have never tried rule-based routing, so I'm not sure yet how to solve it... but it should be fairly similar to what wg does by default:

[#] ip -4 route add 0.0.0.0/0 dev rq table 51820
[#] ip -4 rule add not fwmark 51820 table 51820 # <-- I think you need this rule to mark only packets coming from `wg` subnet range
[#] ip -4 rule add table main suppress_prefixlength

5eraph commented 3 years ago

Ok, found it. My wg interface has IP 10.11.0.157/24, so I added these rules:

ip rule add from 10.11.0.0/24 table 20
ip -4 route add default via 10.11.0.157 dev rq table 20

This way, if I set outbound-addr to the wg address, it goes through it as expected.

5eraph commented 3 years ago

So this should do the trick in wg.conf:

[Interface]
...
Address = 10.11.0.157/24
Table=off
PostUp=ip -4 route add default via 10.11.0.157 dev rq table 20; ip rule add from 10.11.0.0/24 table 20
PostDown=ip -4 route del default via 10.11.0.157 dev rq table 20; ip rule del from 10.11.0.0/24 table 20
...

NOTE: rq is the name of the wg interface.

And now it should be safe to close this issue 🙂

Diniboy1123 commented 3 years ago

Thank you so much for your time and efforts. Sounds great! I will see if I can get it to work; if not, I will just reopen the issue.

Diniboy1123 commented 3 years ago

You were right! The routes were wrong. A simple ip route add 172.16.0.2/32 dev wgcf-profile did the trick. Thank you so much!