peacey / split-vpn

A split tunnel VPN script for Unifi OS routers (UDM, UXG, UDR) with policy based routing.
GNU General Public License v3.0

Enable split script to use Magic SD-WAN #194

Open sidprax opened 1 year ago

sidprax commented 1 year ago

Is there a way to do this with the existing options (WireGuard)? I cannot get internet traffic to go through the Magic S2S tunnel (the subnets on both sides can talk to each other), but I'd like to access the internet from site 1 through the WAN on site 2.

sidprax commented 1 year ago

I tried a few different ways (the outputs below are from the nexthop on Site B). 10.0.1.0/24 is Site A, 10.0.10.0/24 is Site B. Magic S2S uses 192.168.X.0 as the gateway. wgsts1000 is the interface name for the WireGuard Magic S2S tunnel.

It seems like something needs to be done, perhaps on the remote Site B firewall, but I'm hitting a wall here. @peacey Any help will be much appreciated!

Pinging a remote client on Site B works both with and without masquerade:

tcpdump -ni any host 10.0.10.64

tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
23:43:12.678197 wgsts1000 In IP 10.0.1.75 > 10.0.10.64: ICMP echo request, id 1, seq 1814, length 40
23:43:12.678243 br0 Out IP 10.0.1.75 > 10.0.10.64: ICMP echo request, id 1, seq 1814, length 40
23:43:12.678271 rai4 Out IP 10.0.1.75 > 10.0.10.64: ICMP echo request, id 1, seq 1814, length 40
23:43:12.758358 rai4 P IP 10.0.10.64 > 10.0.1.75: ICMP echo reply, id 1, seq 1814, length 40
23:43:12.758358 br0 In IP 10.0.10.64 > 10.0.1.75: ICMP echo reply, id 1, seq 1814, length 40
23:43:12.758443 wgsts1000 Out IP 10.0.10.64 > 10.0.1.75: ICMP echo reply, id 1, seq 1814, length 40

Without masquerade, a ping to an internet IP arrives with a 192.168.X.X source address and seems to be dropped (pinged 4.2.2.2 from Site A):

tcpdump -ni any host 4.2.2.2

tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
23:02:51.683089 wgsts1000 In IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1392, length 40
23:02:51.683175 wgsts1000 Out IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1392, length 40
23:02:56.566520 wgsts1000 In IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1393, length 40
23:02:56.566587 wgsts1000 Out IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1393, length 40

With masquerade it doesn't work either; the packets seem to be reflected back and forth multiple times (pinged 4.2.2.2 from Site A):

tcpdump -ni any host 4.2.2.2

tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
23:51:41.363282 wgsts1000 In IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
23:51:41.363333 wgsts1000 Out IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
23:51:41.640179 wgsts1000 In IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
23:51:41.640199 wgsts1000 Out IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
23:51:41.913949 wgsts1000 In IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
23:51:41.913970 wgsts1000 Out IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
(several more identical rows)

peacey commented 1 year ago

Hi @sidprax,

It's difficult for me to debug this because I don't have two UDMs to try the Magic site-to-site on. From the results you show, it's odd that when pinging an external IP (4.2.2.2), with or without masquerade, the requests on Site B are being routed back out through the WireGuard tunnel instead of out the WAN interface.

23:02:51.683089 wgsts1000 In IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1392, length 40
23:02:51.683175 wgsts1000 Out IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1392, length 40

It says In IP then Out IP out of the same wgsts1000 tunnel... that shouldn't be the case. It should say Out IP out of the WAN tunnel for it to work. So I'm suspecting some weird rules that Unifi has for this tunnel that forces all traffic out of it to go back through it, perhaps...? Or maybe you're using an incorrect gateway.

First of all, you said you are using 192.168.X.0 as the gateway in the VPN script? .0 isn't normally a usable IP though; in a /24 it's the network address and isn't assigned to any host. Did you mean 192.168.X.1 or something like that? And how did you figure out this gateway?

Also, how are you adding a wireguard magic S2S? When I go to S2S options in Unifi, I only see OpenVPN or IPSec options.
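
For anyone reading these captures, the pattern described above (a packet arriving "In" on an interface and then leaving "Out" on the same one) can be spotted mechanically. Here is a minimal sketch in POSIX shell and awk, assuming the text format that tcpdump -ni any prints in this thread. find_hairpins is a hypothetical helper name, not part of split-vpn.

```sh
#!/bin/sh
# find_hairpins (hypothetical helper): read `tcpdump -ni any` text on stdin
# and print "<interface> <count>" for every interface on which an identical
# packet was first seen "In" and later seen "Out".
find_hairpins() {
    awk '
        $3 == "In" || $3 == "Out" {
            # Packet signature: everything after the direction column.
            sig = ""
            for (i = 4; i <= NF; i++) sig = sig " " $i
            key = $2 SUBSEP sig
            if ($3 == "In") {
                seen_in[key] = 1
            } else if (key in seen_in) {
                hairpin[$2]++
            }
        }
        END {
            for (ifc in hairpin) print ifc, hairpin[ifc]
        }
    '
}

# Example (run as root on the router, not executed here):
#   tcpdump -ni any -c 20 host 4.2.2.2 2>/dev/null | find_hairpins
```

A healthy forward path, such as the ping to 10.0.10.64 earlier in the thread, produces no output because the request enters on wgsts1000 and leaves on a different interface.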

sidprax commented 1 year ago

Thanks for replying! See below for some outputs from Site A which may be helpful. Let me know if you want to see some outputs from Site B instead.

I see 192.168.X.0 here: netstat -r

Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.0.0.2        0.0.0.0         255.255.255.255 UH        0 0          0 br5.mac
10.0.1.0        0.0.0.0         255.255.255.0   U         0 0          0 br0
10.0.10.0       192.168.0.0     255.255.255.0   UG        0 0          0 wgsts1000
10.0.10.2       192.168.0.0     255.255.255.255 UGH       0 0          0 wgsts1000
XXXXXXXXX       0.0.0.0         255.255.255.0   U         0 0          0 eth8
192.168.0.0     0.0.0.0         255.255.255.255 UH        0 0          0 wgsts1000

and here: ip route show 10.0.10.1/24

10.0.10.0/24 via 192.168.0.0 dev wgsts1000 proto ospf metric 20 onlink

WireGuard has 0.0.0.0/0 in its allowed IPs, but when using your script I disabled blackhole. Output of wg:

interface: wgsts1000
  public key: XXXXXX
  private key: (hidden)
  listening port: 20001

peer: XXXXXXX
  endpoint: XXXXXXXX:22570
  allowed ips: 0.0.0.0/0, 192.168.0.0/32
  latest handshake: 38 seconds ago
  latest receive: 3 seconds ago
  transfer: 1.85 MiB received, 1.47 MiB sent
  persistent keepalive: every 10 seconds
  forced handshake: every 5 seconds

The Magic S2S is a new option from firmware 3.1.X when you own multiple unifi OS devices. You can choose the option in unifi dashboard.
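
In WireGuard, a peer's allowed ips list decides which destinations may be sent to that peer and which source addresses are accepted from it, so 0.0.0.0/0 means the tunnel itself can carry any IPv4 traffic. A minimal sketch for checking this from wg output follows. allows_default_route is a hypothetical helper name, and the sketch assumes the one-field-per-line layout that wg normally prints.

```sh
#!/bin/sh
# allows_default_route (hypothetical helper): read `wg` output on stdin and
# succeed (exit status 0) if any peer lists 0.0.0.0/0 in its allowed ips.
allows_default_route() {
    awk -F': ' '
        $1 ~ /allowed ips$/ {
            n = split($2, ips, /, */)
            for (i = 1; i <= n; i++)
                if (ips[i] == "0.0.0.0/0") found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example (run as root on the router, not executed here):
#   wg show wgsts1000 | allows_default_route && echo "peer accepts any IPv4 destination"
```

If this check passes, as it would for the output above, WireGuard's own peer selection is probably not what keeps internet-bound traffic from working, which points back at routing or firewall rules on the remote side.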

sidprax commented 1 year ago

Unifi changes the X in 192.168.X.0 when it reconnects, so don't mind that changing from 5 to 0 in last reply.
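
Whether a .0 address is usable depends on the prefix length. The netstat output above lists 192.168.0.0 with a 255.255.255.255 mask, and wg shows it as 192.168.0.0/32, so it is treated as a single host rather than as the network address of a /24. A minimal sketch of that distinction in POSIX shell follows; ip_to_int and classify_addr are hypothetical helper names.

```sh
#!/bin/sh
# ip_to_int (hypothetical helper): dotted-quad IPv4 -> 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# classify_addr ADDR PREFIXLEN (hypothetical helper): print "network",
# "broadcast" or "host" for ADDR within its own prefix.
classify_addr() {
    addr=$(ip_to_int "$1")
    len=$2
    # /31 and /32 have no reserved network or broadcast address.
    if [ "$len" -ge 31 ]; then
        echo host
        return
    fi
    hostmask=$(( (1 << (32 - $len)) - 1 ))
    hostbits=$(( $addr & $hostmask ))
    if [ "$hostbits" -eq 0 ]; then
        echo network
    elif [ "$hostbits" -eq "$hostmask" ]; then
        echo broadcast
    else
        echo host
    fi
}

classify_addr 192.168.0.0 24   # prints "network"
classify_addr 192.168.0.0 32   # prints "host"
```

This does not explain the routing problem, but it suggests that the .0 gateway is not necessarily a misconfiguration.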

jeffdoo commented 1 year ago

I too am having issues while attempting to set up a new remote UDM Pro for my in-laws.

I do not know if this makes a difference, but using the old S2S IPSec method I would see the following:

ip route show 192.168.10.0/24
192.168.10.0/24 dev vti64 proto static scope link metric 30

But when running the same command when using the new Magic method I see the following:

ip route show 192.168.10.0/24
192.168.10.0/24 via 192.168.1.1 dev wgsts1001 proto ospf metric 20 onlink

If I configure vpn.conf to use 192.168.10.1 (Site A) as I am used to with the old S2S IPSec VPN:

/etc/split-vpn/vpn/updown.sh wgsts1001 up huntersville
[Mon Jul 31 08:41:02 EDT 2023] split-vpn: wgsts1001 up: Loading configuration from /mnt/data/split-vpn/nexthop/huntersville/vpn.conf.
Error: Nexthop has invalid gateway.

If I use 192.168.1.1 everything starts fine but nothing is routed to the internet.
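
The two route lines above differ in one important way: the IPSec route is a link route with no gateway, while the Magic route names a next hop after "via". A minimal sketch for reading that next hop off the route output instead of guessing it follows. route_nexthop is a hypothetical helper name, and whether split-vpn accepts the resulting address as a nexthop gateway is not something this thread confirms.

```sh
#!/bin/sh
# route_nexthop (hypothetical helper): read `ip route show ...` output on
# stdin and print "<gateway> <device>" for the first route that has a "via"
# next hop. Link routes without "via" print nothing.
route_nexthop() {
    awk '{
        gw = ""; dev = ""
        for (i = 1; i < NF; i++) {
            if ($i == "via") gw = $(i + 1)
            if ($i == "dev") dev = $(i + 1)
        }
        if (gw != "" && dev != "") { print gw, dev; exit }
    }'
}

# Example (run on the router, not executed here):
#   ip route show 192.168.10.0/24 | route_nexthop
```

For the Magic route above this would print 192.168.1.1 wgsts1001, and for the IPSec route on vti64 it would print nothing.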

sidprax commented 1 year ago

I think the WireGuard implementation is actually great, because I'm pretty sure there's some wizardry happening in the back end for CG-NAT. I never got OpenVPN or IPSec S2S to work for me in the past. The WireGuard implementation works pretty well for connecting to clients on Site B, but I think some rules (either by Unifi's design or by omission) are blocking internet-bound traffic.

We need a networking wizard to help here 😃 @peacey whenever you have some time!

sidprax commented 1 year ago

@peacey @jeffdoo Any chance you were able to look at this?

jeffdoo commented 1 year ago

@sidprax I did not have time to further investigate and went back to the old IPSec S2S solution. Hopefully this can be resolved because the Magic method makes connecting UDM Pros incredibly easy.

sidprax commented 1 year ago

This is weird, I'm not sure why my ip route shows a via .0 (network address) while yours @jeffdoo shows via .1 😑

ip route show 10.0.10.0/24
10.0.10.0/24 via 192.168.5.0 dev wgsts1000 proto ospf metric 20 onlink

angusdavis2 commented 1 year ago

Magic Sites feels like a much better site-to-site VPN implementation than the alternatives, as it handles the scenarios that plague traditional site-to-site VPN setups in Unifi, such as failover to a secondary WAN connection, dynamic IP addresses, FQDN support, etc. But how to route traffic over the Magic Site VPN remains a mystery to me. If we could get split-vpn to work with Magic Sites, it would be awesome!

Running ip route show, I see similar output to @sidprax (my remote network reached via the VPN is 192.168.2.0/24):

ip route show 192.168.2.0/24
192.168.2.0/24 via 192.168.1.0 dev wgsts1000 proto ospf metric 20 onlink

The 192.168.1.0 is not a usable address, as @peacey noted, but it's what appears here. If you run a plain ip route show, you will see the two entries related to the WireGuard site-to-site VPN (wgsts1000):

24.171.201.1 dev ppp0 proto kernel scope link src 70.45.6.247   # My Primary Internet 
100.64.0.0/10 dev eth7 proto kernel scope link src 100.100.44.63  # My Starlink Secondary
192.168.0.0/24 dev br0 proto kernel scope link src 192.168.0.1 
192.168.1.0 dev wgsts1000 proto kernel scope link
192.168.2.0/24 via 192.168.1.0 dev wgsts1000 proto ospf metric 20 onlink

I have noticed that even though the address shown is 192.168.1.0, going to 192.168.1.1 brings up the UDM Pro (even though my local network is 192.168.0.1).

In vpn.conf, I have experimented getting split-vpn to work with the following settings, to no avail:

By "reaching the FORCED destinations does not work", I mean the following. Take whatismyip.com, which is forced using IP sets; its IP address is 172.67.189.152. I ping it from a host on my local network (192.168.0.84) while running tcpdump on the local UDM Pro.

Behavior when split-vpn is DOWN / turned off (normal, expected behavior, going out over the WAN interface):

#  tcpdump -ni any host 172.67.189.152
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
11:45:27.387920 switch0 In  IP14 (invalid)
11:45:27.387920 switch0.1 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 59900, seq 0, length 64
11:45:27.387920 br0   In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 59900, seq 0, length 64
11:45:27.387980 ppp0  Out IP 70.45.6.247 > 172.67.189.152: ICMP echo request, id 59900, seq 0, length 64
11:45:27.416340 ppp0  In  IP 172.67.189.152 > 70.45.6.247: ICMP echo reply, id 59900, seq 0, length 64
11:45:27.416382 br0   Out IP 172.67.189.152 > 192.168.0.84: ICMP echo reply, id 59900, seq 0, length 64
11:45:27.416387 switch0.1 Out IP 172.67.189.152 > 192.168.0.84: ICMP echo reply, id 59900, seq 0, length 64

Behavior when split-vpn is UP / turned on, with VPN_ENDPOINT_IPV4=192.168.1.1 -- appears to be in a loop:

#  tcpdump -ni any host 172.67.189.152
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
11:43:17.246175 switch0 In  IP14 (invalid)
11:43:17.246175 switch0.1 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.246175 br0   In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.246220 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.319732 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.319759 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.393499 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.393524 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.467263 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.467290 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.541941 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.541962 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.615379 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.615400 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.689211 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.689237 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.764531 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.764555 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.839995 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.840019 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.915014 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.915040 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.989137 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.989169 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
(continues like this ad infinitum)

Sorry I am not much of a networking whiz, but would love to get split-vpn working with the magic site stuff as it is so far superior to any other UDM solution for site-to-site VPN. Feedback / suggestions welcome.

jacobmr commented 9 months ago

Looks like this thread is stale. I'm curious if anyone has made any progress here. @peacey - given that the new low-cost express devices support Site Magic, I bet that those of us here who would love to have magic site would sponsor the purchase of one for you so that you can test/implement (if possible) support for magic site ... anyone else game for this?

rigwig commented 8 months ago

I would also be down to chip in if this would help dev work on this.

I've tried all the combinations previously mentioned, also to no avail.

10.0.1.0/24 via 192.168.1.1 dev wgsts1000 proto ospf metric 20 onlink
10.0.1.2 via 192.168.1.1 dev wgsts1000 proto ospf metric 20 onlink
10.1.0.0/24 dev br0 proto kernel scope link src 10.1.0.1
10.1.1.0/24 dev br21 proto kernel scope link src 10.1.1.1
10.1.2.0/24 dev br22 proto kernel scope link src 10.1.2.1

I do have a routable gateway address, 192.168.1.1, and I am able to bring the tunnel up without error, but no dice on the actual connection.