WireGuard / wireguard-vyatta-ubnt

WireGuard for Ubiquiti Devices
https://www.wireguard.com/
GNU General Public License v3.0

Connecting to wireguard on edgerouter messes up outgoing UDP packets #23

Open virusmoere opened 4 years ago

virusmoere commented 4 years ago

Please reopen Lochnair/vyatta-wireguard#98 on this repo. The issue still exists and is reproducible on my three ER-X routers.

Edit: adding info from the old issue: if you use a MediaTek device with hwnat enabled, your UDP packets might get lost. Currently the only solution is to disable hwnat.
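For reference, a minimal sketch of disabling hwnat from the EdgeOS CLI. It assumes a MediaTek-based model such as the ER-X, where the offload setting lives under `system offload hwnat`; check it against your own firmware before relying on it.

```shell
# EdgeOS CLI over SSH; assumes a MediaTek model (e.g. ER-X) with the hwnat offload option.
configure
set system offload hwnat disable
commit
save
exit
```

Re-enabling is the same sequence with `enable` in place of `disable`.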

FossoresLP commented 4 years ago

@virusmoere You basically just did. We cannot easily copy the issue over AFAIK, so your reference here is the best we can do.

I'm not sure I'm qualified to fix this issue and I cannot test anything since I only have Octeon hardware. This will most likely need to be solved by someone with more experience with EdgeOS and a Mediatek device. PRs are always welcome.

Amoeba00 commented 4 years ago

It's my understanding that the UDP re-order problem only affects the 2.x firmware and was fixed in v1.10.x (latest being 1.10.11). If so, then it's not really a wireguard problem, is it? Though maybe a mention in the docs might be useful given that wireguard uses UDP.

virusmoere commented 4 years ago

Hi,

FYI, according to the release notes of my current EdgeOS version (2.0.9 Beta 3), this is only related to the Cavium platform and not to MediaTek:

[Offloading] - On Cavium-based routers (ER, ER-Pro, ER-Lite, ER-PoE, ER-4, ER-6P, ER-12, ER-Infinity) small percentage of UDP packets are randomly reordered. This issue was fixed in v1.10.0 firmware but it reappeared since v2.0.0 because of new Ethernet driver. We are in contact with SoC vendor to fix this issue.

Amoeba00 commented 4 years ago

> FYI, according to the release notes of my current EdgeOS version (2.0.9 Beta 3), this is only related to the Cavium platform and not to MediaTek:

Whoops, I missed that detail, didn't I? So: with hwnat disabled on 2.x firmware, WireGuard works fine but the ER-X has performance problems everywhere else? And with hwnat enabled on 2.x firmware, WireGuard has packet problems but ER-X performance is great?

Same with 1.x firmware?

virusmoere commented 4 years ago

I have a spare ERX ready in a few days for testing with the v1 branch.

Lochnair commented 4 years ago

While it's true that hwnat won't work with WireGuard it should not break your connections like this.

In the early days of vyatta-wireguard we had issues with Cavium's offload where WireGuard flows were erroneously offloaded, causing the connection to fail, but I can't remember the details.

This was worked around by nuking the offload info field in the skb, causing the offload engine to ignore it. I assume we have to do something similar for hwnat, but I'm baffled by the fact that this hasn't been reported before AFAIK.

I tried looking into a way to tell hwnat to ignore the WireGuard flows, but I couldn't really figure out how hwnat actually works.

There's probably some clues in here somewhere: https://github.com/Lochnair/kernel_e50/tree/v2.0.8-hotfix.1/master/net/nat/hw_nat

@Amoeba00, @virusmoere: After Ubiquiti ported the HWNAT engine to the v2 branch, you'll likely see the same behavior on both, but it'd be good to confirm.

virusmoere commented 4 years ago

Hi, this issue is reproducible on both the 1.x and 2.x branches with hwnat enabled, as you guessed.

eccgecko commented 4 years ago

I can also report this issue is present on my ER-X, running the latest fw release from UBNT, 2.0.8-hotfix.1, with the latest WireGuard 1.0.20200611 deb package.

With hwnat disabled, the wg0 interface works great and the ER-X routes all my internet traffic out of it just fine, although CPU has much more overhead.

As soon as I enable hwnat, I start seeing problems, but only in certain scenarios, not all. For example, with hwnat disabled, I can use OpenVPN as a client on a local machine. That OpenVPN connection gets routed out through the wg interface first, then on to the OpenVPN server. The OpenVPN server shows the endpoint IP of the server that the ER-X’s wg interface is connected to as the OpenVPN client’s IP, not my ISP IP (which is what I want). As soon as I enable hwnat, this breaks. I can still make the initial outgoing connection and bring up the OpenVPN tunnel, but packets get dropped, so OpenVPN through the wg interface is unusable with hwnat enabled.

Also noticed Apple FaceTime is broken when hwnat is enabled with wg interface. Lots of disconnects and moments of me hearing them but them not hearing me. Again, disabling hwnat fixes it instantly, but again, at the cost of CPU.

Not sure whose remit this lies under, UBNT’s or WireGuard’s. Have reported in the UBNT forums already, and my attention was brought to this Issue, so thought I’d chime in here too.

ahua92 commented 4 years ago

I had issues with 2.0.8-hotfix.1 constantly crashing out of nowhere requiring either power cycling or a hard reset to get going again. I also had issues with streaming services on 1.10.10 and 2.0.8.

It seems like the problem is resolved with using 1.10.10, disabling HW offloading, and turning on Smart QoS, but UI Community is suggesting that my problem is more likely due to my low upload speed (5-6 Mbps).

ralban commented 4 years ago

Hey folks- Can you suggest test cases to reliably reproduce this symptom? I'm running an ER-X-SFP (MediaTek) on v2.0.9 with WireGuard 1.0.20201112 (E50-v2) and hwnat enabled for IPv4. My use case is "road warrior": connecting to the ER-X as a server from my mobile devices to gain access to my internal LAN resources and to reach the Internet through my home ISP. I am not (yet) using a WireGuard interface to tunnel all traffic to another host or VPN service. I'm trying to determine whether this symptom is a show-stopper and whether I need to keep my Raspberry Pi WireGuard peer online.

eng3 commented 4 years ago

> Hey folks- Can you suggest test cases to reliably reproduce this symptom? I'm running an ER-X-SFP (MediaTek) on v2.0.9 with WireGuard 1.0.20201112 (E50-v2) and hwnat enabled for IPv4. My use case is "road warrior": connecting to the ER-X as a server from my mobile devices to gain access to my internal LAN resources and to reach the Internet through my home ISP. I am not (yet) using a WireGuard interface to tunnel all traffic to another host or VPN service. I'm trying to determine whether this symptom is a show-stopper and whether I need to keep my Raspberry Pi WireGuard peer online.

Just try transferring a file, running iperf, or anything similar with and without hwnat offload enabled. With offload enabled, you'll get gigabit speeds on your internal network, but the VPN tunnel will be extremely slow. With offload disabled, WireGuard will work fine, but your router will be limited to something like 25-50% of gigabit speed. If nothing on your network as a whole requires more than 25-50% of gigabit speed, then it should not be a showstopper. If you require faster speeds and you need your WireGuard interface to transfer more than tiny amounts of data, then I'd say it's a showstopper.
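One way to make that comparison repeatable is to run the same UDP iperf3 test with hwnat enabled and again with it disabled. The sketch below only prints the command by default so it is safe to paste. The peer address, bitrate and duration are placeholder values, not figures from this thread, and it assumes an iperf3 server is already running on a host reachable through the tunnel.

```shell
#!/bin/sh
# Sketch: build one UDP iperf3 command to reuse for both hwnat states.
# PEER, RATE and SECS are placeholders (assumptions), not values from the thread.
PEER="${PEER:-10.0.0.1}"   # hypothetical iperf3 server reached through the WireGuard tunnel
RATE="${RATE:-50M}"        # target UDP bitrate
SECS="${SECS:-10}"         # test duration in seconds

# Print the iperf3 client command line for a UDP test.
udp_test_cmd() {
    printf 'iperf3 -c %s -u -b %s -t %s\n' "$1" "$2" "$3"
}

# Dry run by default: only show the command. Set RUN=1 to actually execute it.
cmd=$(udp_test_cmd "$PEER" "$RATE" "$SECS")
echo "$cmd"
if [ "${RUN:-0}" = "1" ]; then
    $cmd
fi
```

Run it once in each hwnat state and compare the jitter and lost/total datagram figures in iperf3's UDP summary.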

danielschonfeld commented 2 years ago

I just ran into this problem with an ER-X while trying to use NoMachine, which relies on streamed UDP packets. I wouldn't have thought it was the WireGuard tunnel (I am indeed using WireGuard), but with hwnat enabled, I stop seeing the UDP packets in a tcpdump of the switch0 interface (NOT the wgX interface, as one would expect). Maybe this helps with finding the root cause?
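For anyone wanting to repeat that observation, a rough sketch of the two captures, run on the router in separate SSH sessions. The interface names follow this thread (wg0 and switch0 on an ER-X), and 192.0.2.10 is a placeholder for the remote UDP host, not an address from the report.

```shell
# On the router, one command per SSH session; 192.0.2.10 is a placeholder host (assumption).
# Capture the UDP flow as seen on the WireGuard interface:
sudo tcpdump -ni wg0 udp and host 192.0.2.10
# Capture the same flow as seen on the LAN switch interface:
sudo tcpdump -ni switch0 udp and host 192.0.2.10
```

Toggling hwnat while both captures run should show on which interface the packets stop appearing.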

j6b72 commented 1 year ago

I can confirm that this is still the case on the latest firmware v2.0.9-hotfix.6, with hwnat enabled. Disabling hwnat successfully works around this issue.

myde2001 commented 1 year ago

Does this bug only happen on MediaTek-based routers? I'm thinking of switching from the EdgeRouter X to the EdgeRouter 4.

fransking commented 1 year ago

Given that features such as flow accounting are incompatible with offload, does anyone know if simply enabling flow accounting on the WireGuard interfaces bypasses this issue?

set system flow-accounting interface wg0

I just tried it, and YouTube worked with hwnat showing as enabled, but I'm not sure whether this disabled hwnat for all packets rather than just those egressing wg0.

fransking commented 1 year ago

Coming back to this a few months later and leaving hwnat enabled and setting flow-accounting on the wireguard interfaces leaves me with ~800-900 Mb/s up and down on normal traffic and 100-140 Mb/s through wireguard. So it appears that the offload is still working for non-wireguard traffic.

myde2001 commented 1 year ago

> Coming back to this a few months later and leaving hwnat enabled and setting flow-accounting on the wireguard interfaces leaves me with ~800-900 Mb/s up and down on normal traffic and 100-140 Mb/s through wireguard. So it appears that the offload is still working for non-wireguard traffic.

And does it work without any problems, with slower speed as the only downside?

fransking commented 1 year ago

UDP traffic through the WireGuard interfaces is working normally (no issues with FaceTime, QUIC on YouTube, etc.), and the non-WireGuard traffic appears to be offloaded, based on the 800-900 Mb/s upload/download speeds I was expecting. If offload were not working, I would expect something in the 400-500 Mb/s range.

So I don't see a downside for my use case given that the wireguard traffic speed is going to be CPU limited anyway.

myde2001 commented 1 year ago

> UDP traffic through the WireGuard interfaces is working normally (no issues with FaceTime, QUIC on YouTube, etc.), and the non-WireGuard traffic appears to be offloaded, based on the 800-900 Mb/s upload/download speeds I was expecting. If offload were not working, I would expect something in the 400-500 Mb/s range.
>
> So I don't see a downside for my use case given that the wireguard traffic speed is going to be CPU limited anyway.

That's a great achievement! How do you then disable the offload on the wg0 interface? Thanks for the help.

fransking commented 1 year ago

Log in to the ER-X console via SSH, then run the following, pressing Enter after each line:

    configure
    set system flow-accounting interface wg0
    commit
    save

My theory is that this works because flow accounting is not compatible with offloading anyway, so this marks packets egressing / ingressing the wg0 interface to bypass the offload engine. hwnat can then be left enabled and applies to packets egressing / ingressing eth0 / switch0 etc.
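A hedged way to check the resulting state on your own router, assuming the interface is named wg0 as above: confirm that the flow-accounting entry is in the config and that the offload module still reports as loaded. Both are EdgeOS operational-mode commands; their output varies by model and firmware, so none is shown here.

```shell
# EdgeOS operational mode (not configure mode), over SSH; assumes the interface is wg0.
show configuration commands | grep flow-accounting
show ubnt offload
```

To undo the workaround, run `delete system flow-accounting interface wg0` in configure mode, followed by `commit` and `save`.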

danielschonfeld commented 1 year ago

> Log in to the ER-X console via SSH, then run the following, pressing Enter after each line:
>
>     configure
>     set system flow-accounting interface wg0
>     commit
>     save
>
> My theory is that this works because flow accounting is not compatible with offloading anyway, so this marks packets egressing / ingressing the wg0 interface to bypass the offload engine. hwnat can then be left enabled and applies to packets egressing / ingressing eth0 / switch0 etc.

Works great here!