openwrt / packages

Community maintained packages for OpenWrt. Documentation for submitting pull requests is in CONTRIBUTING.md
GNU General Public License v2.0

MWAN3 doesn't support correctly WAN based on OpenVPN tun #3486

Closed etomm closed 6 years ago

etomm commented 8 years ago

MWAN3 is not able to recognize whether an OpenVPN-based “wan interface” is online. This is because network_get_gateway in lib/functions/network.sh uses ubus to collect the gateway information, and for OpenVPN interfaces ubus returns a completely empty result. I modified the network_get_gateway function to use "ip addr", grep and awk, and it began to work.

This is not the case with CC 15.05, which had a much simpler hotplug script for mwan3.

All the tests were done on LEDE trunk on a Linksys EA8500. Before that, on OpenWrt CC 15.05 on an Archer C7, everything worked correctly.

ghost commented 7 years ago

Hey etomm,

Any chance you could create a patch based on your local copy of network.sh, showing your changes, please? I'm seeing the same issue and would like to get things working.

etomm commented 7 years ago

I have very little time to look into the modifications, but I can try to put something together. A patch would be difficult, since I made the changes live on a running LEDE system.

I still don't build my own LEDE and rely on prebuilt firmware.

ghost commented 7 years ago

I definitely don't want to cause you work, is there any chance you could paste in network.sh? I'm happy to diff out the, uh, differences :)

nidstigator commented 7 years ago

Tried to change network.sh network_get_gateway to this:

network_get_gateway() {
    local $tmp
    tmp="$(/sbin/ip route | grep '^default' | awk "/$2/ {print \$3; found=1} END{exit !found}")
    eval "$__tmp" && \
        return 0
}

however, it's not working.

Is network_get_gateway supposed to get the gateway, or to check whether the gateway belongs to the interface?

@etomm Any help with this is appreciated!

etomm commented 7 years ago

You are right, I completely forgot about the issue... too many things to think about. So I opened network.sh and modified network_get_gateway like this:

# determine IPv4 gateway of given logical interface
# 1: destination variable
# 2: interface
# 3: consider inactive gateway if "true" (optional)
network_get_gateway() {
    __network_ifstatus "$1" "$2" ".route[@.target='0.0.0.0' && !@.table].nexthop" "" 1 && \
        return 0

    [ "$3" = 1 -o "$3" = "true" ] && \
        __network_ifstatus "$1" "$2" ".inactive.route[@.target='0.0.0.0' && !@.table].nexthop" "" 1

    network_get_physdev ___dummydev "$2"

    [ -z "$___dummydev" ] && return 1
    ___gateway=`ip addr show $___dummydev | grep peer | awk -F'peer |/' '{printf $2}'`
    unset ___dummydev   

    [ -z "$___gateway" ] && return 1
    eval "$1=$___gateway"
    unset ___gateway

    return 0
}

I added my lines to the ones that were already there... right now I don't remember exactly which ones they were. I think it is everything from where the ___dummydev variable first appears down to the unset ___gateway.

Hope this can help both of you.

nidstigator commented 7 years ago

@etomm Hey man, thanks for sharing your changes. Just replaced my network_get_gateway() function on my setup (WRT1900ACS LEDE r1967), still not working. Tun0 wan always shows up offline no matter what settings I change....

etomm commented 7 years ago

I'm sorry it didn't work for you. Back then, I remember debugging the script by adding various log messages until I found that this was the failing function. Did you try doing the same?

nidstigator commented 7 years ago

Thanks @etomm. I have tried. The problem is the same you were having - empty gateway info. However this fix and others I've tried didn't work.

etomm commented 7 years ago

Maybe some configuration files would help, such as the network, OpenVPN and mwan3 ones. For example, it sounds strange to me that you are doing it on tun0. Is that the device name or the interface name?

dziny commented 7 years ago

Hi etomm, would it be possible to post your modified /lib/functions/network.sh somewhere? Say using dropbox link? This way we can try whether your version works for us. The files should not contain any personal info anyway so it should be safe to be posted.

etomm commented 7 years ago

Here it is http://pastebin.com/3py8Pesc but I still suppose a misconfiguration.

nidstigator commented 7 years ago

@etomm I have uninstalled all relevant packages along with all configs. I wasted 2 days on this. Just a heads up everyone watching this thread, a new package called vpnbypass was merged into openwrt/packages yesterday. Should have same functionality (but easier to configure via its luci app) but confirmed working on latest LEDE and Openwrt.

nidstigator commented 7 years ago

I believe the package is dead. Try OpenVPN-Policy-Based-Routing. Here: https://forum.lede-project.org/t/openvpn-policy-based-routing-web-ui-testers-needed/1422/156

hnyman commented 7 years ago

> I believe the package is dead.

Not quite, but the previous maintainer went MIA.

@feckert has been doing mwan3 stuff lately.

jscinoz commented 7 years ago

Aside from the issue with network_get_gateway, network_get_ipaddr does not return the correct information for OpenVPN interfaces, because the ipv4-address (and ipv6-address, for that matter) block is empty in the output of ubus call network.interface dump. This results in an invalid iptables call on line 168 of mwan3.sh, as $src_ip is blank.

etomm commented 7 years ago

@nidstigator mwan3 is not tied just to OpenVPN. It is useful for whichever redirection policy and I use it together with ipsets.

Maybe we should move the bug to the OpenVPN package?

feckert commented 7 years ago

@nidstigator i am not sure if this is a problem of mwan3. What is the setup and configuration?

nidstigator commented 7 years ago

@hnyman @jscinoz I apologise if I was misleading, I didn't mean to. I assumed the package is dead due to lack of updates and communication regarding this issue and other issues.

@feckert I gave up a long time ago on using mwan3 for my split tunnelling setup, and have been using OpenVPN-Policy-Based-Routing. mwan3 doesn't recognise that tun interfaces are up/down. I am happy with my current setup and don't need to use mwan3 anymore. But thanks for asking.

@etomm I agree, that was an oversight on my part. Mwan3 isn't the same as OPBR.

etomm commented 7 years ago

@feckert I'm almost sure it is a problem with OpenVPN, which doesn't fill in the ubus information correctly. I don't know whether it should be solved on the OpenVPN side or the MWAN3 side.

With my modified scripts I bypassed the problem, but I don't think it is the right solution. Maybe the maintainer of OpenVPN should look at this.

jscinoz commented 7 years ago

Related: https://github.com/openwrt/openwrt/issues/220

jscinoz commented 7 years ago

After a bit of investigation, I've come to the same conclusion as @etomm - this issue is due to the network.interface information for OpenVPN interfaces (well, tun interfaces in general) not being populated correctly.

There's an outstanding question here as to the expected behaviour, however: should netifd be populating this information automatically whenever an IP address change occurs on an interface (regardless of type)?

This can be done at a user-level with a custom script for OpenVPN's --up option, but it would be nicer to have this properly integrated. I imagine the simplest way to do this is to amend the OpenVPN init script to include the appropriate --up and --down options in the generated OpenVPN config file.

We would also need to figure out how to do this while still supporting user-provided --up/--down scripts - perhaps simply passing through the configured path to the user's custom scripts and invoking these at the end of the netifd-integration up/down script.
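The chaining idea above could look roughly like the sketch below. It is only an illustration under stated assumptions: integration_step and run_up_chain are hypothetical names, and the placeholder step does not talk to netifd at all. It only shows the ordering, with the integration step first and the user's own script last, receiving the same arguments.

```shell
#!/bin/sh
# Sketch only: integration_step and run_up_chain are hypothetical names,
# not part of OpenVPN, netifd or mwan3.

integration_step() {
    # Placeholder: a real script would report the address and gateway
    # of the tunnel device to netifd here.
    echo "integration: dev=$1"
}

run_up_chain() {
    user_script="$1"
    shift
    # Integration step first; stop if it fails.
    integration_step "$@" || return 1
    # Then the user's own script, if one was configured, with the same arguments.
    if [ -n "$user_script" ] && [ -x "$user_script" ]; then
        "$user_script" "$@"
    fi
}

# Example call without a user script configured.
run_up_chain "" tun0 1500
```

With a user script configured, its path would be passed as the first argument instead of the empty string.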

noname77 commented 7 years ago

i just want to report that replacing /lib/functions/network.sh with @etomm 's version from http://pastebin.com/3py8Pesc worked for me on Software versions:

OpenWrt - Lede Reboot SNAPSHOT r4651-a6f6f8d
LuCI - git-17.209.51293-4c9ae3f

mwan3 - 2.5.3-5
mwan3-luci - git-17.209.51293-4c9ae3f-1

relevant output of "cat /etc/config/network":

config interface 'vpn_jp'
    option ifname 'tun-jp'
    option proto 'dhcp'
    option metric '300'

not sure how much of the following is necessary, but here it is anyway:

i have added this to the /etc/openvpn/vpn_jp.conf

route-nopull
pull
script-security 2
up /root/scripts/vpn-jp-up.sh

cat /root/scripts/vpn-jp-up.sh:

#!/bin/ash

IFACE=tun-jp
METRIC=$(uci get network.vpn_jp.metric)

sleep 5

GW=$(ifconfig $IFACE | awk '/P-t-P:/ {print $3}' | cut -d ":" -f2)

/sbin/ip route add default via $GW dev $IFACE metric $METRIC

mwan3 restart now shows

Bad argument `mark'
Try `iptables -h' or 'iptables --help' for more information.
uci: Entry not found
uci: Entry not found

Checking ip rules: (before replacement it failed)

All required interface IP rules found:

1003:   from all iif tun-jp lookup main 
2003:   from all fwmark 0x300/0xff00 lookup 3

Checking routing tables still fails:

Missing required interface routing table 3

but routing through vpn works fine (yay!)

also make sure to place your VPN rule above the default rule.

THANKS :)

trungpham commented 7 years ago

Do you guys know if this issue is fixed in LEDE trunk?

Also, is this a related issue? https://github.com/openwrt/packages/issues/3110

rhunwicks commented 7 years ago

It is not fixed in LEDE trunk as far as I am aware. There is a PR at https://github.com/lede-project/source/pull/1103 which integrates OpenVPN into the netifd subsystem. That is probably the correct approach. But I tried using those netifd files with 17.01.3 and they didn't work, so I hacked /lib/functions/network.sh as described above. @etomm 's version didn't work as-is, but based on it I have a working version for 17.01.3 that involves changing network_get_gateway and network_get_ipaddr. It doesn't aim to be comprehensive - merely sufficient to get mwan3 working with openvpn:

--- /lib/functions/network.sh.orig  2017-10-17 22:12:18.000000000 +0000
+++ /lib/functions/network.sh   2017-10-17 22:24:11.000000000 +0000
@@ -23,6 +23,18 @@
 # 2: interface
 network_get_ipaddr() {
    __network_ifstatus "$1" "$2" "['ipv4-address'][0].address";
+
+        network_get_physdev ___dummydev "$2"
+
+        [ -z "$___dummydev" ] && return 1
+        ___ipaddr=`ip addr show $___dummydev | grep "inet " | awk -F'inet | peer|/' '{printf $2}'`
+        unset ___dummydev
+
+        [ -z "$___ipaddr" ] && return 1
+        eval "$1=$___ipaddr"
+        unset ___ipaddr
+
+        return 0
 }

 # determine first IPv6 address of given logical interface
@@ -191,6 +203,18 @@

    [ "$3" = 1 -o "$3" = "true" ] && \
        __network_ifstatus "$1" "$2" ".inactive.route[@.target='0.0.0.0' && !@.table].nexthop" "" 1
+
+        network_get_physdev ___dummydev "$2"
+
+        [ -z "$___dummydev" ] && return 1
+        ___gateway=`ip addr show $___dummydev | grep peer | awk -F'peer |/' '{printf $2}'`
+        unset ___dummydev
+
+        [ -z "$___gateway" ] && return 1
+        eval "$1=$___gateway"
+        unset ___gateway
+
+        return 0
 }

 # determine IPv6 gateway of given logical interface

etomm commented 7 years ago

Yes, mine is quite old by now, and unfortunately I didn't think of making a patch back then... I was doing it overnight!

Good job!


mfullerca commented 7 years ago

I just upgraded to LEDE 17.01.4 from OpenWRT Chaos Calmer 15.05.1. Previously I used mwan3 to selectively route traffic over OpenVPN and now it no longer works. I debugged it to the point of determining that it doesn't even start mwan3track for the tun interface, then found this bug.

I installed the above patch quite easily, but it doesn't help. "ubus call network.interface dump" has no IP info for the tun interface, which I assume is part of the problem. Any ideas? 17.01.3 and 17.01.4 network.sh seem the same, so I'm at a loss.

dziny commented 7 years ago

Have a look here: https://forum.lede-project.org/t/openvpn-policy-based-routing-web-ui-testers-needed/1422 It removes the need to use mwan3 for OpenVPN routing.


rhunwicks commented 7 years ago

@mfullerca the patch doesn't fix ubus call network.interface dump so you shouldn't expect any change in that command. It patches /lib/functions/network.sh which is what mwan3 actually calls. To see if it is working try:

VPN=myvpnname
source /lib/functions/network.sh 
network_get_gateway gw $VPN
echo $gw
network_get_ipaddr ip $VPN
echo $ip

noname77 commented 7 years ago

@mfullerca if you have already applied @rhunwicks's patch you might try calling /usr/sbin/mwan3 ifup $NET_IFACE for your tun interface. i think this starts the mwan3track for the interface and then monitors the connection fine.

i have that executed in my vpn up script

mfullerca commented 7 years ago

@rhunwicks thanks! Interesting: $ip is set but $gw is not. netstat -re shows that the corresponding tun interface has correct gateway entries, and the default has the metric mwan3 expects.

rhunwicks commented 7 years ago

@mfullerca can you post your whole /lib/functions/network.sh to a pastebin somewhere and link it here?

mfullerca commented 7 years ago

@rhunwicks https://pastebin.com/M3GLj9Db

Btw I just rebooted to verify that there was no weirdness as I'd been changing many things before I made my last post and still $ip is set but $gw is not.

I should also add I found a work-around, albeit a poor one: if I change the tun interface from "Unmanaged" to "Static" and assign it the values it is already getting when the VPN connection is established, then it works.

mfullerca commented 7 years ago

@rhunwicks also here's the results of running your debugging script under "sh -x": https://pastebin.com/PWnMwgmJ

rhunwicks commented 7 years ago

@mfullerca what's the result of ip addr show tun-do?

mfullerca commented 7 years ago

@rhunwicks (btw sorry to take so long, had to find a good time to revert the work-around):

15: tun-do: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 100
    link/none 
    inet 192.168.7.6/24 brd 192.168.7.255 scope global tun-do
       valid_lft forever preferred_lft forever

yangfl commented 7 years ago

Hi, I'm trying to (maybe partially) fix it https://github.com/openwrt/packages/pull/5033 . Please take a look at it.

rhunwicks commented 7 years ago

@mfullerca that ip addr show seems to show a local link with a broadcast address, rather than a gateway. Are you sure that the OpenVPN config is adding a default gateway route?

mfullerca commented 7 years ago

@rhunwicks I'm not sure offhand what you're looking for (sorry, I was a sysadmin 1987-2001 so my networking skills are a bit dated). There is a default route for that interface:

# netstat -re | grep tun-do
default         192.168.7.1     0.0.0.0         UG    3      0        0 tun-do
192.168.7.0     *               255.255.255.0   U     0      0        0 tun-do

That route comes from /etc/config/openvpn via the client-side option:

option route '0.0.0.0 0.0.0.0 192.168.7.1 3'

AFAIK there's only two other options with OpenVPN:

  1. Don't do anything, in which case there's no default route and mwan3 won't work because it requires a default route with an appropriate metric.
  2. Let the server push a default route, but OpenVPN does two /1 CIDR routes to effectively create a default route of highest priority because the routes are more specific than /0, in which case mwan3 also won't work.
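The difference between these cases can be seen in the routing table itself. The sketch below is a hedged illustration, not part of mwan3: classify_catch_all is a hypothetical helper, the sample tables are made up for the example, and it assumes the pushed routes are the usual 0.0.0.0/1 and 128.0.0.0/1 pair. It reads `ip route` style output and reports whether a device has a real default route, only the two /1 halves, or neither.

```shell
#!/bin/sh
# Hypothetical helper: reads `ip route` output on stdin and reports what kind
# of catch-all route the device named in $1 has.
classify_catch_all() {
    awk -v dev="$1" '
        {
            # Only look at routes that go out through the requested device.
            for (i = 1; i < NF; i++) {
                if ($i == "dev" && $(i + 1) == dev) {
                    if ($1 == "default") has_default = 1
                    if ($1 == "0.0.0.0/1" || $1 == "128.0.0.0/1") halves++
                }
            }
        }
        END {
            if (has_default) print "default"
            else if (halves == 2) print "split-default"
            else print "none"
        }
    '
}

# Sample routing tables (illustrative, not captured from a real router).
explicit=$(printf '%s\n' \
    'default via 192.168.7.1 dev tun-do metric 3' \
    '192.168.7.0/24 dev tun-do scope link' | classify_catch_all tun-do)

pushed=$(printf '%s\n' \
    '0.0.0.0/1 via 10.8.3.85 dev tun1' \
    '128.0.0.0/1 via 10.8.3.85 dev tun1' | classify_catch_all tun1)

echo "explicit route option: $explicit"
echo "server-pushed routes:  $pushed"
```

On a router, the input would come from `ip route` instead of the sample strings.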

rhunwicks commented 7 years ago

@mfullerca I'm not a sysadmin at all, so I probably can't be any more help, other than to say that my .ovpn file looks like:

route-nopull
route 0.0.0.0 0.0.0.0 vpn_gateway 80

where I think that vpn_gateway acts as a keyword that inserts the correct gateway address.

and my ip addr show includes the gateway:

~$ ip addr show tun1
41: tun1: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 100
    link/none
    inet 10.8.3.86 peer 10.8.3.85/32 scope global tun1
       valid_lft forever preferred_lft forever

The workaround I added to /lib/functions/network.sh is matching the gateway from the peer to the / in the line starting inet - but your interface doesn't show the gateway.
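To make the difference concrete, here is a small sketch that runs the same field splitting over the two inet lines quoted in this thread. The helper names are hypothetical, and awk's own pattern filter stands in for the separate grep used in the patch; it also stops at the first match, which the patch does not. It is an illustration of the parsing only, not a replacement for the patch.

```shell
#!/bin/sh
# Hypothetical helpers: each reads `ip addr show <dev>` output on stdin.

# Local address: text between "inet " and the following " peer" or "/".
ipaddr_from_ip_addr() {
    awk -F'inet | peer|/' '/inet / {printf "%s", $2; exit}'
}

# Gateway: text between "peer " and the following "/"; empty without a peer.
gateway_from_ip_addr() {
    awk -F'peer |/' '/peer/ {printf "%s", $2; exit}'
}

# The two inet lines quoted in this thread.
p2p_line='    inet 10.8.3.86 peer 10.8.3.85/32 scope global tun1'
subnet_line='    inet 192.168.7.6/24 brd 192.168.7.255 scope global tun-do'

p2p_ip=$(printf '%s\n' "$p2p_line" | ipaddr_from_ip_addr)
p2p_gw=$(printf '%s\n' "$p2p_line" | gateway_from_ip_addr)
subnet_ip=$(printf '%s\n' "$subnet_line" | ipaddr_from_ip_addr)
subnet_gw=$(printf '%s\n' "$subnet_line" | gateway_from_ip_addr)

echo "tun1:   ip=$p2p_ip gw=${p2p_gw:-<none>}"
echo "tun-do: ip=$subnet_ip gw=${subnet_gw:-<none>}"
```

For the tun-do line the address is found but the gateway stays empty, which matches the earlier report that $ip is set but $gw is not.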

yangfl commented 7 years ago

I've seen one interesting line in https://github.com/openwrt/packages/blob/d971514af85739418c8f7314a143016492dc280e/net/mwan3/files/lib/mwan3/mwan3.sh#L240 . I suspect it is unnecessary, since IIRC it can simply be -A mwan3_iface_out_$1 -o $interface -j MARK --set-xmark $(mwan3_id2mask id MMX_MASK)/$MMX_MASK. If that is true, then mwan3 can drop one dependency on ubus.

@feckert

mfullerca commented 7 years ago

@rhunwicks I didn't know about the vpn_gateway keyword, but otherwise believe my config is the same as yours except that I hardcoded the gateway IP. I changed to the vpn_gateway keyword and nothing changed in output or behaviour (except my config is now robust to changes in addressing at the server end).

I diffed your output and mine and noticed that basically yours says:

inet <naked addr of interface> peer </32 addr of gateway>

whereas mine says:

inet </24 addr of subnet> brd <broadcast addr of subnet>

Otherwise they are pretty much the same. Sadly documentation on the meaning of the output of various commands is lacking, but I hypothesized a difference in OpenVPN network topology. I own the server on the other end, so I changed --topology from subnet (the default in modern OpenVPN configs) to p2p. Now my output matches yours and mwan3 recognizes the interface. Thanks!

dibdot commented 6 years ago

seems to be fixed - let's close this epic ticket.