peacey / split-vpn

A split tunnel VPN script for Unifi OS routers (UDM, UXG, UDR) with policy based routing.
GNU General Public License v3.0

bug: packets not routed when trying to access remote vpn ip address from vpn's network #31

Closed midzelis closed 3 years ago

midzelis commented 3 years ago

Note, IPs are made up

Situation: I have a VPN server running on a public IP (142.250.64.238). A UDM VPN client sets up a VLAN (42) with subnet 10.42.42.0/24 to transparently proxy all traffic via the VPN server at 142.250.64.238.

Problem: Cannot access other services (http/ssh) running on 142.250.64.238

Diagnosis attempted: Ran tcpdump on the VPN server's tun interface. Pinged 1.1.1.1 - saw packets. Pinged the server's public address (142.250.64.238) - no packets received. Ran tcpdump on the UDM VPN client's tun interface. Pinged 1.1.1.1 - saw packets. Pinged the server's public address (142.250.64.238) - no packets received.

My suspicion is that there is a missing route, or maybe packets aren't being marked correctly, and non-VPN packets destined for the VPN server itself (but not the VPN service) are not being routed properly.

Edit: Found my pings were being sent to interface switch0 (on the UDM client) but there were no replies.

peacey commented 3 years ago

Hi @midzelis,

Why are you pinging the server's public IP instead of the server's VPN IP? Does the server have a VPN IP that you can use instead to access those services? It doesn't really make sense to use the public IP over the VPN tunnel. You should be able to access the services through the VPN IP of the remote client or server, or pass the remote subnet as an OpenVPN server option to add that route through the VPN if the subnet is different from the VPN subnet.

If you look at the route table on the UDM for the VPN (i.e. run: ip route show table 101), you will see that one of the routes is your VPN server's public IP routed through the WAN interface. This is to ensure that all encrypted traffic to the VPN travels through the WAN, instead of going through the VPN tunnel which would make no sense (since it would be looping). So that's why pinging the VPN's public IP is resulting in the packets travelling through the WAN and not the VPN tunnel.

peacey commented 3 years ago

Sorry, I think I misunderstood what you were trying to do.

So you want to be able to access services on the VPN's public IP from a forced client, and you do want this to travel through the WAN and not the VPN. And this is currently not working.

Is this correct? I will debug on my end and see what's going on.

midzelis commented 3 years ago

So, the remote server has a bunch of services: OpenVPN, HTTP, SSH and others. They all share one IP address (the public IP address). All of the services (http/ssh) are set up via port-forwards (on the public IP) to various other internal hosts.

What I'm trying to do is access the http/ssh services that are hosted on the same ip as the vpn server, from the vpn tunnel.

i.e. from the UDM client, with OpenVPN interface tun_VPN (that links to the VPN server) and eth4 which is non-VPN'd.

via VPN tunnel

# ping my-server.com -I tun_VPN
PING ping my-server.com (xxx.xxx.xxx.xxx): 56 data bytes
^C
--- my-server.com ping statistics ---
6 packets transmitted, 0 packets received, 100% packet loss

vs non-VPN

# ping my-server.com -I eth4
PING my-server.com (xxx.xxx.xxx.xxx): 56 data bytes
64 bytes from xxx.xxx.xxx.xxx: seq=0 ttl=48 time=97.183 ms
64 bytes from xxx.xxx.xxx.xxx: seq=1 ttl=48 time=85.818 ms
^C
--- my-server.com ping statistics ---
3 packets transmitted, 2 packets received, 33% packet loss
round-trip min/avg/max = 85.818/91.500/97.183 ms

I think this MAY be considered a form of hairpinning.

What I really want is for packets destined for the VPN server's IP to go through the VPN tunnel to the VPN public IP (hairpin?). I would also be OK with packets destined for the VPN server's IP going through the WAN instead (like EXEMPT_DESTINATIONS_IPV4, but using a domain instead of the IP: since the server is on a dynamic IP, the script could nslookup the server and use its IP).

Right now, the packets are dropped. (but only the packets to the server's ip)

peacey commented 3 years ago

So the default options of this script add a route to the custom table that routes the VPN's public IP through your WAN, so this should already be working fine unless you set GATEWAY_TABLE option to disabled. I just tried this on my end from a VPN forced client and I was able to access the public IP of the VPN server and services on it.

Can you please show me the output of ip route show table 101 after running the VPN? There should be a route like VPN_PUBLIC_IP via GATEWAY_IP dev WAN_DEVICE.
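As a quick way to check for that route, the table output can be filtered for a host route to the server. This is only a sketch, not part of split-vpn: `host_route_dev` is a hypothetical helper, and 203.0.113.10 is a placeholder standing in for the VPN server's public IP.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of split-vpn).
# Reads `ip route show table N` output on stdin and prints the device used by
# a host route to the address in $1; prints nothing if no such route exists.
host_route_dev() {
    awk -v ip="$1" '
        $1 == ip || $1 == ip "/32" {
            for (i = 2; i < NF; i++)
                if ($i == "dev") print $(i + 1)
        }'
}

# On the router (table 101 as in this thread; placeholder IP):
#   ip route show table 101 | host_route_dev 203.0.113.10
```

If this prints the WAN device, the route is in place; if it prints nothing, the server's public IP would fall through to the tunnel routes or the blackhole default.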

If you instead wanted to do hairpin NAT so that traffic to the VPN IP is sent through the tunnel for forced clients instead of the default WAN, we would need to add a DNAT rule like this (add this after you run the VPN script):

iptables -t nat -A PREROUTING -m mark --mark 0x9 -d VPN_SERVER_PUBLIC_IP -j DNAT --to VPN_SERVER_VPN_IP

This DNAT rule worked for me to route the VPN public IP through the VPN for VPN-forced clients. However, I'm not sure if this will cause any unintended issues as I haven't tested it that thoroughly with all configurations.
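Because the rule is added by hand, it also has to be removed by hand before testing the plain WAN route. The sketch below only prints the matching add, check and delete commands so they can be reviewed first; `hairpin_cmd` is a hypothetical helper, the IPs are placeholders, and `--to-destination` is the long form of the `--to` option used above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the iptables command that adds (-A), checks (-C)
# or deletes (-D) the hairpin DNAT rule. Nothing is executed here.
hairpin_cmd() {
    local action=$1 public_ip=$2 tunnel_ip=$3 mark=${4:-0x9}
    echo "iptables -t nat $action PREROUTING -m mark --mark $mark" \
         "-d $public_ip -j DNAT --to-destination $tunnel_ip"
}

# Placeholder addresses: server public IP, then server tunnel IP.
hairpin_cmd -A 203.0.113.10 10.99.99.5   # add the rule
hairpin_cmd -C 203.0.113.10 10.99.99.5   # check whether it is present
hairpin_cmd -D 203.0.113.10 10.99.99.5   # remove it again
```

The delete command must repeat the rule specification exactly as it was added, which is why all three are generated from the same arguments.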

So which way would you prefer, routing it through the WAN (which should work by default as long as GATEWAY_TABLE is not disabled), or routing it through the VPN tunnel? If you prefer the VPN tunnel, I can add an option to the script to add a hairpin rule for the VPN.

Please test both options and tell me if either works for you. Thanks!

midzelis commented 3 years ago

# ip route show table 101
0.0.0.0/1 via 10.99.99.5 dev tun_VPN
blackhole default
24.x.x.243 via 192.168.42.1 dev eth4
128.0.0.0/1 via 10.99.99.5 dev tun_VPN

your iptables command to add the hairpin nat did not work.

It is a mystery as to why my packets aren't making it to the vpn interface. I mean

-A VPN_PREROUTING -i br42 -j MARK --set-xmark 0x9/0xffffffff

should mark all packets from br42 (VLAN 42) and send them to the table above. And since 24.x.x.243 is inside of 0.0.0.0/1, it should go to tun_VPN but for some reason it isn't. As another test, I tried adding 1 to the server IP address and listened for it using tcpdump, like so:

# tcpdump -n host 24.x.x.244 -i tun_VPN
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tun_VPN, link-type RAW (Raw IP), capture size 262144 bytes
14:11:02.327118 IP 10.99.99.6 > 24.x.x.244: ICMP echo request, id 6144, seq 0, length 64
14:11:02.462277 IP 24.x.x.244 > 10.99.99.6: ICMP echo reply, id 6144, seq 0, length 64
14:11:03.351161 IP 10.99.99.6 > 24.x.x.244: ICMP echo request, id 6144, seq 1, length 64
14:11:03.462164 IP 24.x.x.244 > 10.99.99.6: ICMP echo reply, id 6144, seq 1, length 64

So, even though this mystery is getting to me, there is a workaround: adding the server's IP to EXEMPT_DESTINATIONS_IPV4 does work. It would be nice to resolve domains to IPs when the script runs, just in case my dynamic IP changes.

This yields the following mangle rule table:

-A VPN_PREROUTING -i br42 -j MARK --set-xmark 0x9/0xffffffff
-A VPN_PREROUTING -d 24.x.x.243/32 ! -i tun_VPN -m mark --mark 0x9 -j MARK --set-xmark 0x0/0xffffffff
-A VPN_PREROUTING -d 192.168.58.0/24 ! -i tun_VPN -m mark --mark 0x9 -j MARK --set-xmark 0x0/0xffffffff
-A VPN_PREROUTING -d 10.43.43.0/24 ! -i tun_VPN -m mark --mark 0x9 -j MARK --set-xmark 0x0/0xffffffff
-A VPN_PREROUTING -d 10.42.42.0/24 ! -i tun_VPN -m mark --mark 0x9 -j MARK --set-xmark 0x0/0xffffffff

(As an explanation: I'm excluding 192.168.58.0/24, which is the LAN; 10.43.43.0/24, which is VLAN 43 and a different VPN; and 10.42.42.0/24, which is VLAN 42, this VPN.)

In this case, this will FORCE the connection from br42 to skip table 101 (because the mark is cleared) and go through the normal routing tables out eth4.

For my purposes that's OK. But that mystery is really starting to nag at me!

It still would be nice if the script could resolve domains to ips when the script runs, just in case my dynamic ip changes.

peacey commented 3 years ago

> It still would be nice if the script could resolve domains to ips when the script runs, just in case my dynamic ip changes.

This script can already resolve domains to IPs for most options but it's not documented. iptables resolves domains automatically, so you can actually just use a domain in EXEMPT_DESTINATIONS_IPV4 and it should resolve automatically.
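A minimal config sketch of that idea, assuming the option takes space-separated entries in a single shell variable; `my-server.com` is the placeholder domain used earlier in this thread, and the subnet is one of the exemptions listed above.

```bash
# vpn.conf fragment (assumed format: space-separated entries in one variable).
# my-server.com is the placeholder domain from this thread.
EXEMPT_DESTINATIONS_IPV4="my-server.com 192.168.58.0/24"
```

Note that iptables resolves a hostname once, at the moment the rule is added, so after a dynamic IP change the rules keep the old address until they are re-created (for example by restarting the VPN script).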

> And since 24.x.x.243 is inside of 0.0.0.0/1, it should go to tun_VPN but for some reason it isn't.

It will not go through the VPN tunnel because there is a more direct route that you see in your table, which takes precedence:

24.x.x.243 via 192.168.42.1 dev eth4

So anything going to 24.x.x.243 will be routed through the main WAN. You should be listening to tcpdump on eth4 to see those packets. If you confirm the packets are travelling to 192.168.42.1 with tcpdump, can you also confirm that something from 192.168.42.1 is not preventing them from going to their final destination?
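The "more direct route wins" behaviour is longest-prefix matching. The sketch below reimplements that selection in plain bash purely for illustration (the kernel does this itself); the simplified "PREFIX/LEN DEVICE" input format, the function names and the placeholder address 203.0.113.10 are all assumptions made for the example.

```bash
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Read simplified routes ("PREFIX[/LEN] DEVICE") on stdin and print the one
# that contains the address in $1 and has the longest prefix length.
longest_match() {
    local dst cidr target prefix len mask best_len=-1 best=""
    dst=$(ip_to_int "$1")
    while read -r cidr target; do
        [ -n "$cidr" ] || continue            # skip empty lines
        prefix=${cidr%/*}
        len=${cidr#*/}
        [ "$cidr" = "$prefix" ] && len=32     # a bare address is a /32
        mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
        if [ $(( dst & mask )) -eq $(( $(ip_to_int "$prefix") & mask )) ] \
            && [ "$len" -gt "$best_len" ]; then
            best_len=$len
            best="$cidr $target"
        fi
    done
    echo "$best"
}

# Same shape as table 101 above, with a placeholder server address.
longest_match 203.0.113.10 <<'EOF'
0.0.0.0/1 tun_VPN
128.0.0.0/1 tun_VPN
203.0.113.10 eth4
EOF
```

The host route (a /32) beats the /1 that also contains the address, so the server's IP leaves via eth4, while a neighbouring address such as 203.0.113.11 would match only 128.0.0.0/1 and go to tun_VPN.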

> your iptables command to add the hairpin nat did not work.

Hmm, it worked fine on my end and I checked with tcpdump on the server that packets arrived on the tunnel. You ran this command after VPN tunnel was up, right?

iptables -t nat -A PREROUTING -m mark --mark 0x9 -d 24.x.x.243 -j DNAT --to 10.99.99.5

midzelis commented 3 years ago

> This script can already resolve domains to IPs for most options but it's not documented. iptables resolves domains automatically, so you can actually just use a domain in EXEMPT_DESTINATIONS_IPV4 and it should resolve automatically.

neat

> So anything going to 24.x.x.243 will be routed through the main WAN. You should be listening to tcpdump on eth4 to see those packets. If you confirm the packets are travelling to 192.168.42.1 with tcpdump, can you also confirm that something from 192.168.42.1 is not preventing them from going to their final destination?

Ok, looks like packets are NOT making it to eth4, but they are making it to switch0

> Hmm, it worked fine on my end and I checked with tcpdump on the server that packets arrived on the tunnel. You ran this command after VPN tunnel was up, right?
>
> iptables -t nat -A PREROUTING -m mark --mark 0x9 -d 24.x.x.243 -j DNAT --to 10.99.99.5

I made a typo here. I can verify that this hairpin route works for me as well.

edit: the hairpin rule doesn't quite do what I want - it basically gives me the view I would get via the LAN IP. That is, it's giving me the web server of the remote UDM router, not the WAN port-forwarded web server.

peacey commented 3 years ago

> Ok, looks like packets are NOT making it to eth4, but they are making it to switch0

They should show up in eth4 so that's odd. I'm not sure what switch0 is on the UDM. A tcpdump on eth4 reveals no packets?

Can you run tcpdump -ni eth4 host 24.x.x.243 and icmp on the UDM, then run ping 24.x.x.243 on a VPN-forced client and see if there are any packets? Also please make sure to delete the hairpin rule we added first before you do this.
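To see where the packets stop, the same capture can be repeated on each interface along the path. The sketch below is a dry run by default and only prints the commands; `capture_hops` and the `DRY_RUN` variable are hypothetical names, the interface list is taken from this thread, and 203.0.113.10 is a placeholder for the server's public IP.

```bash
#!/usr/bin/env bash
# Hypothetical helper: for each interface, print (default) or run (DRY_RUN=0)
# a tcpdump capture of up to 5 ICMP packets to or from the given host.
capture_hops() {
    local host=$1
    shift
    local ifc cmd
    for ifc in "$@"; do
        cmd="tcpdump -c 5 -ni $ifc host $host and icmp"
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "$cmd"
        else
            $cmd    # blocks until 5 packets are seen or it is interrupted
        fi
    done
}

# LAN bridge first, then WAN, then the tunnel (names as used in this thread).
capture_hops 203.0.113.10 br42 eth4 tun_VPN
```

Keep a ping running from a VPN-forced client while capturing; the first interface that shows no echo requests is where the packets are being lost.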

> the hairpin rule doesn't quite do what I want - it basically gives me a view of as if it was accessed via the LAN ip. that is, its giving me the webserver of the remote UDM router, not the WAN port-forwarded webserver.

If the VPN server is a router, then the port forwards are only set up for traffic coming from the WAN interface on the router itself. You would have to add port forwards for traffic originating from the tunnel by adding custom iptables rules on your VPN server.

So it seems it makes more sense for your use case to access the services through the WAN rather than the VPN tunnel, since your port forwards on the server aren't set up for the tunnel. I'm not sure why the regular WAN route isn't working for you, but at least you got it working through the EXEMPT options so that's good.

midzelis commented 3 years ago

> > Ok, looks like packets are NOT making it to eth4, but they are making it to switch0
>
> They should show up in eth4 so that's odd. I'm not sure what switch0 is on the UDM. A tcpdump on eth4 reveals no packets?
>
> Can you run tcpdump -ni eth4 host 24.x.x.243 and icmp on the UDM, then run ping 24.x.x.243 on a VPN-forced client and see if there are any packets? Also please make sure to delete the hairpin rule we added first before you do this.

Correct, the eth4 capture yields no packets. Running tcpdump -ni br42 host 24.x.x.243 and icmp does show requests, but no replies.

> So it seems it makes more sense for your use case to access the services through the WAN rather than the VPN tunnel, since your port forwards on the server aren't setup for the tunnel. I'm not sure why the regular WAN route isn't working for you, but at least you got it working through the EXEMPT options so that's good.

Yes, I'm using the domain name in the EXEMPT options, and it seems to be working.

My only guess at this point is that the ip rules are being applied to the routed packet and it's getting stuck in some kind of loop. I have no idea. (edit: I don't think this is the case, since it doesn't seem like the packet is making it to eth4 at all.)

I'm using UDM 1.9.3 (Linux UDM-GT 4.1.37-v1.9.3.3438-50c9676), in case this is some kind of kernel packet bug.

peacey commented 3 years ago

Hi @midzelis,

Since you can just EXEMPT the VPN's public IP as you did, I don't think we have to pursue this further as it's working fine for you with this configuration.

Is there anything else you wanted cleared up or fixed, or can we close this?

midzelis commented 3 years ago

Sounds good - closing!