Amebis / eduVPN

Windows eduVPN Client
GNU General Public License v3.0

Windows default gateway when on VPN #136

Closed: ghost closed this issue 2 years ago

ghost commented 4 years ago

It seems some applications on Windows, e.g. Office 365 freak out when eduVPN (or other VPN) is enabled and is not the "default gateway".

The OpenVPN server pushes the def1 flag which is basically a "split tunnel" configuration that somehow prevents Office 365 from detecting it is "online".

def1 -- Use this flag to override the default gateway by using 0.0.0.0/1 and 128.0.0.0/1 rather than 0.0.0.0/0. This has the benefit of overriding but not wiping out the original default gateway.
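A minimal sketch of why the def1 trick works, using only Python's standard `ipaddress` module: the two /1 routes together cover the whole IPv4 space, and each is more specific than 0.0.0.0/0, so longest-prefix matching prefers them without the original default route being removed. The variable names are illustrative only.

```python
import ipaddress

default = ipaddress.ip_network("0.0.0.0/0")
def1_routes = [
    ipaddress.ip_network("0.0.0.0/1"),
    ipaddress.ip_network("128.0.0.0/1"),
]

# The two halves collapse back into the full IPv4 address space.
covered = list(ipaddress.collapse_addresses(def1_routes))

# Each half lies inside 0.0.0.0/0 and has a longer prefix, so it is
# the more specific match for any destination address.
more_specific = all(
    r.subnet_of(default) and r.prefixlen > default.prefixlen
    for r in def1_routes
)

print(covered)
print(more_specific)
```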

This is a bit ugly. But it seemed to work until now.

Is there a way to make eduVPN (or OpenVPN?) "properly" set the default gateway, presumably using the 0.0.0.0/0 route?

Do we still need the def1 flag for "split tunnel default gateway"? Or is the default 0.0.0.0/0 route possible nowadays?

"not wiping out the original default gateway", is that still an issue on Windows 7, 8, 10? Why was this ever a problem? The default gateway can't be restored any more after disconnecting?

efef commented 4 years ago

Old reference from MS which seems related to this issue: https://docs.microsoft.com/en-us/archive/blogs/the_microsoft_excel_support_team_blog/office-2013-reports-no-internet-connectivity-with-vpn-connection

ghost commented 4 years ago

This is what you see in Windows by the way...

[image: default_gw]

rozmansi commented 4 years ago

It is possible to manipulate the routing table on Windows programmatically, but not with the existing eduVPN Windows Client: apart from the stock OpenVPN Interactive Service, it has no backend that runs elevated.

Ideally, this should be solved within OpenVPN.

ghost commented 4 years ago

For the Linux client we probably also need more "profile info" from the server... So I am thinking of introducing an API call, e.g. /profile_info:

{
    "profile_info": {
        "ok": true,
        "data": {
            "internet": {
                "default_gateway": true
            },
            "institute": {
                "default_gateway": false
            }
        }
    }
}
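
A client could read such a response roughly as follows. This is only a sketch against the JSON shape proposed above, which may differ from what the server ends up exposing; the function name `uses_default_gateway` and the `RESPONSE` constant are hypothetical.

```python
import json

# Example payload copied from the proposal above (not a real server response).
RESPONSE = """
{
    "profile_info": {
        "ok": true,
        "data": {
            "internet": {"default_gateway": true},
            "institute": {"default_gateway": false}
        }
    }
}
"""


def uses_default_gateway(response_text, profile_id):
    """Return True if the given profile should act as the default gateway."""
    payload = json.loads(response_text)["profile_info"]
    if not payload.get("ok"):
        raise ValueError("server did not report ok")
    return bool(payload["data"][profile_id]["default_gateway"])


print(uses_default_gateway(RESPONSE, "internet"))
print(uses_default_gateway(RESPONSE, "institute"))
```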

rozmansi commented 4 years ago

https://syapps.zendesk.com/hc/en-us/articles/360022506431-Office-365-No-Internet-Connection-NordVPN-and-OpenVPN

ghost commented 4 years ago

https://github.com/eduvpn/vpn-user-portal/commit/2745d83da1bea3ead4ffd0d39ba56ad6a1efb6d8

Whether or not the default gateway is supposed to be used is now exposed in the API. This may or may not help the eduVPN application to properly configure the network...

efef commented 4 years ago

More details about how NCSI works (on Win10/Win8-7) and how to disable active probing in the registry: https://support.microsoft.com/en-us/help/4494446/an-internet-explorer-or-edge-window-opens-when-your-computer-connects

It seems you can't simply disable active/passive probing: "Microsoft Outlook may not be able to connect to a mail server, or Windows may not be able to download updates even if the computer is connected to the internet."

Some more interesting documentation about configuring your own NCSI server in the registry can be found here; example 2 there uses 127.0.0.1 as the NCSI server: https://github.com/DNSCrypt/dnscrypt-proxy/wiki/Windows-NCSI

efef commented 4 years ago

It seems that the server-side fix to push 0.0.0.0/0 (https://github.com/eduvpn/vpn-server-node/issues/44) is working nicely as a workaround.

A university reports:

We've had no complaints from users since adding this workaround in early April. Out of 1400 unique employee users, most are on Windows (82%), but macOS (9%), iOS (5%) and Android (2%) also connect using the eduVPN client.

ghost commented 4 years ago

I guess a real fix has to be investigated. I don't understand why Windows would "wipe" the default gateway and not simply add another one with a lower metric, like on Linux. When the interface disappears so does the VPN default gateway. Automagically!

Who/what manipulates the Windows routing table in OpenVPN? That's the openvpn.exe file run by the daemon? It runs some "ip" commands I guess?

How does Wireguard (+wintun) do this? In the same way? Is there a better way? There must be...

Ideally we can remove the def1 parameter when pushing default gateway to the clients and make it work with all clients... when simply indicating the VPN should be the default gateway.

def1 was already a workaround (for Windows?) and now we're talking about creating a workaround for a Windows workaround... :shrug:

efef commented 4 years ago

Let me be clear: it would be nice to have a better 'fix', but for now adding the 0.0.0.0/0 route is the short-term solution.

ghost commented 4 years ago

Let's spend some time on actually fixing it instead of blessing a hack without understanding all its consequences (IPv6?).

ghost commented 4 years ago

[image: Screenshot from 2020-09-01 19-36-10]

wintun driver. Apparently here it says it has Internet connection without the 0.0.0.0/0 route push...

ghost commented 4 years ago

The metric for the (LAN) interface that has the default gateway is 0. How to get lower than this with a VPN that also is default gateway? Microsoft documentation says the metric should be 5, but it is actually 0 on both virtual machines and physical machines (netsh interface ipv4 route show).

Otherwise we could simply have OpenVPN add a 0.0.0.0/0 route to the VPN IP with a lower metric than the existing default gateway (and add a X/32 route to the existing gateway to not drop all packets on the floor).

This should properly solve online detection without hacks.

netsh
interface
ipv4

add route 116.203.195.80/32 "Ethernet" 192.168.122.1
add route 0.0.0.0/0 "OpenVPN Wintun" 10.132.193.1 metric=4
add dnsservers "OpenVPN Wintun" 9.9.9.9

116.203.195.80 = vpn.tuxed.net
192.168.122.1 = LAN gateway
10.132.193.1 = VPN gateway IP
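
The reason for the extra /32 route can be sketched with a toy longest-prefix lookup: traffic to the VPN server itself keeps using the LAN gateway, while everything else follows the new default route into the tunnel. This models only prefix length, not metrics, and is not how Windows implements its routing table; `ROUTES` and `lookup` are illustrative names.

```python
import ipaddress

# (prefix, interface, next hop), taken from the netsh commands above.
ROUTES = [
    ("0.0.0.0/0", "OpenVPN Wintun", "10.132.193.1"),
    ("116.203.195.80/32", "Ethernet", "192.168.122.1"),
]


def lookup(destination):
    """Return (interface, next hop) of the most specific matching route."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), iface, gateway)
        for prefix, iface, gateway in ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    # Longest prefix wins.
    _, iface, gateway = max(matches, key=lambda m: m[0].prefixlen)
    return iface, gateway


print(lookup("116.203.195.80"))  # encrypted packets to the VPN server
print(lookup("9.9.9.9"))         # everything else
```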

But how to get the metric lower than the existing one?!

What does work: turn off automatic metric on LAN interface in Windows, and set it to something high, e.g. 42. Then adding the 0.0.0.0/0 route with a lower metric actually works! So the only obstruction here is to be able to somehow (programmatically) get a lower metric than 0. I don't know how to do this...

ghost commented 4 years ago

Furthermore I noticed that if the server does NOT send the def1 redirect-gateway flag it will simply be added by the Windows client on the fly.

gvde commented 4 years ago

> The metric for the (LAN) interface that has the default gateway is 0. How to get lower than this with a VPN that also is default gateway? Microsoft documentation says the metric should be 5, but it is actually 0 on both virtual machines and physical machines (netsh interface ipv4 route show).

I have just checked in my Windows 10 VM and there the metric for the default gateway is 25 on the default Ethernet card with automatic metric...

ghost commented 4 years ago

> I have just checked in my Windows 10 VM and there the metric for the default gateway is 25 on the default Ethernet card with automatic metric...

Interesting! The documentation regarding this is here: https://docs.microsoft.com/en-us/troubleshoot/windows-server/networking/automatic-metric-for-ipv4-routes so I find it hard to explain seeing 0 anywhere (on 2 different VMs, on 1 physical machine).

rozmansi commented 4 years ago

IIRC, the metric of a route is the interface metric + some number determined by NLA according to traffic speed. But that doesn't explain why you are seeing 0.

I have 5 on the interface-specific route and 25 on the default route on my working computer.

ghost commented 4 years ago

Maybe I am looking wrong? What are you using to look up the metric?

[image]

gvde commented 4 years ago

> > I have just checked in my Windows 10 VM and there the metric for the default gateway is 25 on the default Ethernet card with automatic metric...

> Interesting! The documentation regarding this is here: https://docs.microsoft.com/en-us/troubleshoot/windows-server/networking/automatic-metric-for-ipv4-routes so I find it hard to explain seeing 0 anywhere (on 2 different VMs, on 1 physical machine).

Interesting documentation. I don't really understand when exactly which table applies (except for the WiFi table) and why my interface seems to be "other interfaces types" considering my 1 Gb/s ethernet connection on the VM and the 25 in the routing table...

gvde commented 4 years ago

> Maybe I am looking wrong? What are you using to look up the metric?

I use the command prompt (cmd.exe) and run "route print"

===========================================================================
Interface List
 12...00 1c 42 96 ec 46 ......Intel(R) PRO/1000 MT Network Connection
 15...00 ff f1 9a c6 2e ......TAP-Windows Adapter V9
  1...........................Software Loopback Interface 1
===========================================================================

IPv4 Route Table
===========================================================================
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
          0.0.0.0          0.0.0.0     192.168.22.1    192.168.22.42     25
        127.0.0.0        255.0.0.0         On-link         127.0.0.1    331
        127.0.0.1  255.255.255.255         On-link         127.0.0.1    331
  127.255.255.255  255.255.255.255         On-link         127.0.0.1    331
     192.168.22.0    255.255.255.0         On-link     192.168.22.42    281
    192.168.22.42  255.255.255.255         On-link     192.168.22.42    281
   192.168.22.255  255.255.255.255         On-link     192.168.22.42    281
        224.0.0.0        240.0.0.0         On-link         127.0.0.1    331
        224.0.0.0        240.0.0.0         On-link     192.168.22.42    281
  255.255.255.255  255.255.255.255         On-link         127.0.0.1    331
  255.255.255.255  255.255.255.255         On-link     192.168.22.42    281
===========================================================================

ghost commented 4 years ago

Ah, that looks different! Now the 5 makes sense according to the documentation... (the virtio interface is 100gbit).

[image]

rozmansi commented 4 years ago

> netsh
> interface
> ipv4
>
> add route 116.203.195.80/32 "Ethernet" 192.168.122.1
> add route 0.0.0.0/0 "OpenVPN Wintun" 10.132.193.1 metric=4
> add dnsservers "OpenVPN Wintun" 9.9.9.9

Yes, something like that should work. Mind that you need to monitor IP changes (e.g. when the user switches their laptop from LAN to WiFi):

disable dead gw detection("OpenVPN Wintun")
netsh interface ipv4
    add route 0.0.0.0/0 "OpenVPN Wintun" 10.132.193.1 metric=4
    add dnsservers "OpenVPN Wintun" 9.9.9.9

setup IP monitoring(update routes, initial trigger)

update routes()
{
    netsh interface ipv4 add route 116.203.195.80/32 "<the new default gw interface>" 192.168.122.1
    set if metric(the new default gw interface, get if metric("OpenVPN Wintun") + 1)
}
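
The `update routes()` step above could be sketched like this. The sketch only assembles the netsh argument lists and does not run them; it has not been tested on Windows. The function name `route_update_commands` and its parameters are hypothetical, and expressing "set if metric" as `netsh interface ipv4 set interface ... metric=N` is an assumption about how that pseudocode step would be carried out.

```python
def route_update_commands(vpn_server_ip, lan_interface, lan_gateway,
                          vpn_interface_metric):
    """Build the netsh invocations to run after the default LAN interface changes."""
    return [
        # Keep traffic to the VPN server itself on the physical link.
        ["netsh", "interface", "ipv4", "add", "route",
         f"{vpn_server_ip}/32", lan_interface, lan_gateway],
        # Make the physical interface less preferred than the VPN interface.
        ["netsh", "interface", "ipv4", "set", "interface",
         lan_interface, f"metric={vpn_interface_metric + 1}"],
    ]


for argv in route_update_commands("116.203.195.80", "Ethernet",
                                  "192.168.122.1", 1):
    print(" ".join(argv))
```

Running these would require an elevated process, for example via `subprocess.run(argv, check=True)`, which is the part the existing client lacks according to the earlier comment.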

gvde commented 4 years ago

> Ah, that looks different! Now the 5 makes sense according to the documentation... (the virtio interface is 100gbit).

I think "netsh interface ipv4 show routes" shows the 'base metric' of the route. Windows then adds the metric according to the speed of the link, e.g. 281 is 25 + 256. That's why you see 0 in the netsh table.

ghost commented 4 years ago

Ah I see! So whatever metric you specify in netsh add route for example, it will always be on top of the metric of the interface itself. So when you specify metric 1 for e.g. the TAP driver you'd get 25+1 = 26. For wintun you get 5+1 (=6) because wintun is a 100gbit interface. So in some situations this will work if you use wintun, and your other interfaces are all < 2gbit, I guess?

ghost commented 4 years ago

Now the interesting question is: how to specify an absolute metric < 5 that is not added on top of the interface's "default" metric? :)

rozmansi commented 4 years ago

Set the interface metric to 1 and the route metric to 1. That should end up with an effective route metric of 1+1=2, which is less than the 0+5=5 of the auto-metric-configured LAN interface and its default route?
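
The arithmetic can be checked with a small sketch, assuming (as worked out in this thread) that the effective metric is the interface metric plus the route metric and that the lowest value wins between routes with the same prefix. The names are illustrative, and the numbers are the ones quoted in the comments above.

```python
def effective_metric(interface_metric, route_metric):
    """Effective metric as described in this thread: interface + route."""
    return interface_metric + route_metric


# Two competing 0.0.0.0/0 routes.
candidates = {
    "OpenVPN Wintun": effective_metric(interface_metric=1, route_metric=1),
    "Ethernet": effective_metric(interface_metric=5, route_metric=0),
}

# The route with the lowest effective metric is preferred.
preferred = min(candidates, key=candidates.get)

print(candidates)
print(preferred)
```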

ghost commented 4 years ago

That sounds like a plan! In my experiments I changed the metric of the LAN adapter to 42 which also worked. But for doing this "for real" it makes more sense to change the metric of the TAP/wintun adapter...

ghost commented 3 years ago

Very old blog post that shows how to manually fix it by setting the default gateway, not super relevant here as we already know that, but to indicate when this problem started (at least): https://www.macwheeler.com/windows-10-office-365-cannot-connect-over-openvpn-fixed/
