openwrt / packages

Community maintained packages for OpenWrt. Documentation for submitting pull requests is in CONTRIBUTING.md
GNU General Public License v2.0
3.94k stars 3.46k forks source link

mwan3: should not use "ping -I address" #13655

Closed luizluca closed 3 years ago

luizluca commented 3 years ago

Maintainer: @feckert Environment: all of them (no binaries)

Description:

11564 breaks mwan3track in 19.07 for IPv6. Source IP does not specify which interface will be used to leave the router. It just sets the source address. It might be enough to select the outgoing if there is some external mechanism that selects the route based on source address, which will indirectly select the outgoing interface.

For example, it will work with "ip route from":

default from 2804:xxx:xxxx:1000::833 via fe80::yyy:yyy:yyy:5c9b dev eth0.2 proto static metric 512 pref medium
default from 2804:xxx:xxxx:2000::833 via fe80::yyy:yyy:yyy:4312 dev eth0.3 proto static metric 512 pref medium

If I "ping -I 2804:xxx:xxxx:1000::833", it will select (because of "route from") the first "default route" and use eth0.2, while "ping -I 2804:xxx:xxxx:2000::833" will indirectly use eth0.3. However, we cannot rely on that feature as the default route might not have a "from" by default (when ipv6 is static) or even the user disables (network.interface.wan6.sourcefilter='0', which affects only DHCPv6). Using "-I device" does work for me, as well it might break on some scenarios.

There is no way to add the "from xxx" using uci network. It is added by udhcpd. Besides that, only directly calling "ip route". I can simulate the same behavior with a set of rules:

config rule6
        option src '<wan6 ip address>/128' # it requires the /128 here
        option lookup '<mwan3 routing table used for wan6>'
        option priority 3100
config rule6
        option src '<wan6 delegated prefix>'
        option lookup '<mwan3 routing table used for wan6>'
        option priority 3101

config rule6
        option src '<wanb6 ip address>/128'
        option lookup '<mwan3 routing table used for wanb6>'
        option priority 3100
config rule6
        option src '<wanb6 delegated prefix>'
        option lookup '<mwan3 routing table used for wanb6>'
        option priority 3101

I just think it is too weak to use mwan3 table number here (which depends on the order interfaces are declared). I could also redo the route copy (as mwan3 already does), but it is too much of rework.

Wouldn't it be better to use "ping -I device" for IPv6 by default and allow the user to use "ping -I address" when needed (with an advice that it is outside mwan3 responsibility to match source-address to an specific device)? It could be simply a different check method (like nping).

Or, mwan3 should provide its own set of ip rules, similar to what I did, to (optionally) force outgoing traffic to use an specific routing table when source address is specified (and no mwan3 rules matched).

feckert commented 3 years ago

@luizluca :+1: thanks for the excellent explanation. Unfortunately I do not know what the best solution is here. I agree with you, I have never seen it like this before. In the master this should work after merging #13169 no matter what you specify. @aaronjg has wrapped the ping checks in IPv4 and IPv6 with a LD_PRELOAD library. So the ping does find the right interface. Therefore this would only be an interim solution for the owrt-19.07 branch.

The way I see it, you could also start there and look for a from. Would this not be the easiest solution to achieve this in owrt-19.07?

aaronjg commented 3 years ago

I have seen this issue before, and for me using ping -I <dev> also fixed the issue. The issue is resolved in #13169 since it uses LD_PRELOAD to setsockopt SO_BINDTODEVICE and bind to bind to ip address.

I would be in favor of adding an additional flag to the config to specify if ping should be by device or by ip address. I don't think we want a separate ping option (like nping), since then users would have to change it when moving to 20.x and mwan3 >= 2.10.0

feckert commented 3 years ago

I would be in favor of adding an additional flag to the config to specify if ping should be by device or by ip address. I don't think we want a separate ping option (like nping), since then users would have to change it when moving to 20.x and mwan3 >= 2.10.0

I think we should detect this automatically. If we have a from then we could use the address and if this is not set then we should use the device. I think adding a config option to use the address or the device on IPv6 is ( as I said before ) an interim solution and is not necessary on mwan3 2.10.0 version. Would this work?

luizluca commented 3 years ago

@luizluca thanks for the excellent explanation. Unfortunately I do not know what the best solution is here. I agree with you, I have never seen it like this before. In the master this should work after merging #13169 no matter what you specify. @aaronjg has wrapped the ping checks in IPv4 and IPv6 with a LD_PRELOAD library. So the ping does find the right interface. Therefore this would only be an interim solution for the owrt-19.07 branch.

The way I see it, you could also start there and look for a from. Would this not be the easiest solution to achieve this in owrt-19.07?

I did saw #13169. However, it looks like a too complex approach to fix the problem. I'm worried that it will increase too much the cost to maintain mwan3 pkgs.

The reason for all that is a way to send pkgs that would use exclusively one or other ISP connection. Isn't that exactly what the different routing tables (1, 2, 3...) created by mwan3 does? So, we need a way to select the routing table from userland. The routing table is only selectable (AFAIK) using ip-rule and it is very limited to what you can use in a filter. I guess only uid (if that uid is user id) is a controllable "variable", as long mwan3 is running as root. However, we could use the same field that mwan3 is already using: fwmark and indirectly set it using iptables mangle rules.

I did a quick search for possible candidates of "variables" that could be used to match an outgoing pkg:

  1. cgroup: if openwrt uses containers...
  2. owner: we could create users or group specific for each isp. It would be safer to use supplementary groups as root could use them without being a member and if mwan3track does use a different user, it could have those groups. newgrp is already available in openwrt (shadow-newgrp)
  3. any other idea?

Another alternative would be to use different network namespaces for each mwan3track. But I guess openwrt kernel also does not compile kernel with it.

I would use owner/group selection. So, mwan3track would:

1) create extra groups (on-demand) for each interface (mwan3int1, mwan3int2, mwan3int3...)

mwan3track:
if groups does not exist; then
   create group mwan3int$INTERFACE_ID
fi
newgrp mwan3int$INTERFACE_ID

(do normal ping, curl, etc without bothering about outgoing interface)

And we need extra iptables for matching:

iptables/ip6table -I PREROUTING -m owner --gid-owner mwan3int$INTERFACE_ID -j CONNMARK  ...(setting to a mark that will both skip mwan3 rules and match existing ip rule that selects table $INTERFACE_ID.

I don't like too much to add new groups but it is less intrusive that a wrapper around socket functions. And it is much easier to maintain. No need to bind to specific interface, set src address...

luizluca commented 3 years ago

I would be in favor of adding an additional flag to the config to specify if ping should be by device or by ip address. I don't think we want a separate ping option (like nping), since then users would have to change it when moving to 20.x and mwan3 >= 2.10.0

I think we should detect this automatically. If we have a from then we could use the address and if this is not set then we should use the device. I think adding a config option to use the address or the device on IPv6 is ( as I said before ) an interim solution and is not necessary on mwan3 2.10.0 version. Would this work?

The "from" is just one of the possible ways to do source-based routing. I'm using ip-rules to do exactly what "from" does. There is no way to autodetect all variants. You could only do some heuristic autodetection. It might need something that the user should select inside mwan3 config.

Anyway, I prefer the alternative to avoid selecting the interface using command arguments and rely on grp/iptables/rules to route the packet. It would also allow new ways to test the link quality as you'll not be constrained to use only commands that allows you to select outgoing device/source address.

aaronjg commented 3 years ago

@luizluca - Thanks for the suggestions. I agree that adding C code to mwan3 increases the complexity a bit, however I considered several options, and I think it is the cleanest. I tested the code with all of the mwan3 tracking methods, as well as some other programs including iperf3 and uclient-fetch, and it works in all cases. There should be very few changes needed to the shared library in the future.

This library also allows testing link quality with commands that do not support command line arguments to bind to the interface, and it can be accessed via mwan3 wrap <iface> <cmd> <cmd args>

I think it is still possible to do it via owner but there are a few considerations:

None of these issues are insurmountable, but there is a fair amount of complexity here. After considering the pitfalls of the owner approach and some discussion with @feckert, it seemed that the LD_PRELOAD approach was a better solution.

As far as the best solution for 19.07, it will be hard to detect automatically, as you mentioned. If we need to do automatic detection, I would suggest we use IP address when it is a tunneling interface, which seemed to be the issue that #11564 fixed, and otherwise ping by specifying the device.

luizluca commented 3 years ago

@luizluca - Thanks for the suggestions. I agree that adding C code to mwan3 increases the complexity a bit, however I considered several options, and I think it is the cleanest. I tested the code with all of the mwan3 tracking methods, as well as some other programs including iperf3 and uclient-fetch, and it works in all cases. There should be very few changes needed to the shared library in the future.

I do believe it works flawlessly. That is not my concern. I just think that, maybe, wrapping socket calls to insert new flags is too invasive. mwan3 is coming from a script-only package to something with a code injector. Imagine how hard would it be to someone outside mwan3 to debug by its own a possible bug introduced by this wrapper.

This library also allows testing link quality with commands that do not support command line arguments to bind to the interface, and it can be accessed via mwan3 wrap <iface> <cmd> <cmd args>

Sure, that is a nice feature. I do believe it is better to not rely on special arguments from some special programs to test a link. I just didn't like the word "wrap" as it tells how but not what it will do with and . How about mwan3 (via?|use?|route?) <iface> <cmd> <cmd args>?

I think it is still possible to do it via owner but there are a few considerations:

  • It would require adding/deleting users whenever interfaces are added/deleted from the mwan3 config, and removed when the mwan3 package is removed

Just be careful that root is needed for tests like ping. And it's tricky to setuid ping when it is a busybox applet. mwan3 must run as root. I don't think a user for each interface would be the best option anyway. I would go with supplementary groups. If mwan3 could run as a user, it just need to be a member of those supplementary groups. For root, it is not needed. mwan3 could add two (or four matching the example config) supplementary groups by default (mwan3int1, mwan3int2, mwan3int3...) and add more on demand. I don't think it should remove them ever. I never saw a linux package that adds a user/group remove it when it is uninstalled.

  • It adds a dependency on the iptables-owner extension

That is a good thing. It is a code that is not maintained by mwan3 (not even by openwrt). Today it would require kmod-ipt-extra, iptables-mod-extra (about 4k and 8k). However it also includes addrtype, pkttype and quota. owner is less than 20% of those packages after installed. Doing a proportional guess from 12k, I would use about 2.5k. We just need to split those extra pkgs.

  • Procd cannot start the mwan3 processes as the desired user, since they still need to access the mwan3 config, which needs root permission to read. So they would need to run the track command as another user, which I believe would add a dependency on su or newgrp (which are not in base openwrt)

I just want to change gid like this:

# id 
uid=0(root) gid=0(root) grupos=0(root)
# sg adm id
uid=0(root) gid=4(adm) grupos=4(adm),0(root)

If mwan3track is a procd instance (that is part of #13169), procd might have all we need to setgid with procd_set_param group "$group". The only drawback is that the user could not run it by hand anymore (mwan wrap ..) without another tool like newgrp/sg installed. But it could be a weak dependency, just like fw3 tells that without ipset installed you cannot use it. "mwan wrap" will look for sg and bail out when it is missing.

Is it a requirement to use only openwrt base? shadow-newgrp is in packages repo as well as mwan3. I only know newgrp (actually sg, but it is simply a missing symlink). This one is quite large (compressed 9k) and it also requires login.defs from shadow-common (5k). There is an issue with newgrp/sg is that it calls /bin/bash unconditionally. It is hardcoded from $SHELL while it was being configured, but nothing too complex to fix. Setting SHELL during configure and adding an extra symlink is an easy fix.

If I would spend time developing C code, it would be a way to set gid, like a newgrp/sg as an busybox applet. In the end, it is a "setgid(gid)" and "execve()". I'm not suggesting to do it but it could be used by other packages/projects.

  • There are some issues with fwmark on PPTP interfaces (#13195), and so this may complicate the workaround for those interfaces.

This is the main point that I can't answer. I never used mwan3 with tunnel interfaces and I might be missing something very important. If the different route table works for forwarding packets, why wouldn't it work for mwan3track tests? What it needs is just a different logic that selects that interface directly, skipping mwan3 rules. Would it work with tunnel interfaces?

My point is that mwan3 is currently (and still with the wrapper) using two different logic to do "send packet through interface X". One is selecting a different route table, used by mwan3 rules, where default route will tell what interface will be used. The other one is binding to a device or settings the source address (and hoping it will be used to select the correct output interface). It would be better to have a single codepath that would work (or break) the same way.

  • Owner rules are only valid in the OUTPUT table, not the PREROUTING table, so it can't be added to the main mwan3_hook chain, which adds a small bit of complexity to the mwan3 setup/teardown

I wouldn't count on using mwan3_hook. mangle OUTPUT is already in use by mwan3. We don't need mwan3 rules but simply the same result as if it was selected to use interface X. I guess locally generated traffic does not goes through PREROUTING but only OUTPUT. It would even make things easier.

  • Care must also be taken to make sure the correct ipv6 address is selected for the appropriate interface (perhaps not an issue because there would also likely be ipv6 masquerading to take care of this?)

Well, wrapper or not, mwan3 for IPv6 do need to eventually rewrite the source address. I offer my ISP delegated prefixes to my clients. When mwan3 selects an interface that does not match the delegated prefix, I do a NETMAP to change the source prefix to the outgoing interface prefix. I'm currently offering both delegated prefixes and also an ULA prefix. Someday, when clients begin to talk mptcp, I could skip those mptcp flows from mwan3 rules.

None of these issues are insurmountable, but there is a fair amount of complexity here. After considering the pitfalls of the owner approach and some discussion with @feckert, it seemed that the LD_PRELOAD approach was a better solution.

I'm trying hard to see where is the complexity. In summary:

Most of #13169 is still needed. Just the need for a wrapper that goes with gid.

As far as the best solution for 19.07, it will be hard to detect automatically, as you mentioned. If we need to do automatic detection, I would suggest we use IP address when it is a tunneling interface, which seemed to be the issue that #11564 fixed, and otherwise ping by specifying the device.

For 19.07, I would add an extra config that will force it to use dev instead of address. It will be used only by marginal cases and it might be removed when a real fix is done.

ptpt52 commented 3 years ago

ping -I ip not really work as expected, it is incorrect.

ping -I ip cannot make sure packet out via the device with the ip, it will re-route by mangle hook netfilter rules, which is set by mwan3

aaronjg commented 3 years ago

I believe that this https://github.com/openwrt/openwrt/pull/3587 will fix https://bugs.openwrt.org/index.php?do=details&task_id=2897, and if it does, once it is merged in, we can go back to using -I <device> in the mwan3 2.8.x without adding a config option.

jamesmacwhite commented 3 years ago

@aaronjg I believe that patch made it into 19.07.5 just as a ping on this, if the ping hacks can be removed from the 2.8 branch in 19.07

aaronjg commented 3 years ago

@jamesmacwhite I never reproduced the issue, so reluctant to make a P/R for the fix.

@brianjmurrell - you reported the issue with -I on 6in4 tunnels initially I believe. If you are running 19.07.5, could you check to see if you still have issues with ping -I on tunnels?

If the underlying issue is resolved, we can go back to using -I in mwan3 2.8.x with this branch: https://github.com/aaronjg/openwrt-packages/tree/19.07-ping-device

jamesmacwhite commented 3 years ago

I believe is it fixed, because using the mwan3 diagnostics (which never performed the hack in mwan3track) and always uses ping -I does work under 19.07.5 with no issues now.

image

Before it would just timeout. This is with just busybox ping installed, not iputils-ping.

I have updated the documentation around the OpenWrt version in the docs: https://openwrt.org/docs/guide-user/network/wan/multiwan/mwan3#openwrt_version. With the kernel patches merged into 19.07.5, I would say the logical recommendation would be to make 19.0.7.5 and above the recommended version currently. While it is possible not everyone will hit the kernel issue if they don't use tunnel protocols on older builds, there is a high chance they might.

feckert commented 3 years ago

@jamesmacwhite If this change is added we could close this issue on openwrt-19.07 https://github.com/aaronjg/openwrt-packages/commit/d6426f576195d51b1d5aa03816d5d88f33b12b59

jamesmacwhite commented 3 years ago

@feckert Confirmed. This should be resolved in the 19.07 branch from release 19.07.5 onwards. The docs have been updated to recommend OpenWrt version 19.07.5 and above for mwan3 usage.

aaronjg commented 3 years ago

@jamesmacwhite Thanks for testing

@feckert Thanks for merging this. I didn't want to stage the P/R without reproducing the problem and testing