dangowrt / owrt-ubi-installer

OpenWrt firmware installer for the Linksys E8450 aka. Belkin RT3200
GNU General Public License v2.0

WiFi 6 (AX) support #7

Closed andyrichardson closed 3 years ago

andyrichardson commented 3 years ago

Hey there - thanks for taking the time to put this together!

Edit: Initial post was misleading, see this comment for issue summary.

Initial post

I just followed the tutorial and managed to flash 0.4 on my RT3200. After a few hours of using it, however, I've noticed that things are very unstable - notably:

- Network speed maxes out at 20 Mbit/s (wired and WiFi)
- LuCI becomes unresponsive after ~30 minutes - requires reset (30s unplugged, reboot alone won't do it)
- DHCP sometimes just stops working
- SSH sometimes doesn't accept connections/times out

So yeah - I'm wondering whether others are finding this to be unusable also or whether I'm doing something wrong.

dangowrt commented 3 years ago

Sounds like the symptoms of running out of RAM. Apart from

> (30s unplugged, reboot alone won't do it)

which is extremely odd.

Please log in to the router using SSH and use top to show the resource usage. If the issue can be reproduced reliably, you should see whatever is eating up all the RAM, and then we know more. Also, the kernel and system log may provide some information, i.e. either extract them from LuCI or use dmesg and logread via SSH.
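
For anyone following along, here is a minimal sketch of how the memory trend could be recorded over SSH before the device starts thrashing. The helper names `mem_available_kb` and `log_mem`, the 10 second interval and the `/tmp/mem.log` path are illustrative assumptions, not part of OpenWrt.

```sh
#!/bin/sh
# Print MemAvailable (in kB) from a meminfo-style file.
# The optional path argument only exists so the parser can be tried on a saved copy.
mem_available_kb() {
    awk '$1 == "MemAvailable:" { print $2 }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: append "<unix time> <kB available>" to a log file
# every INTERVAL seconds until interrupted.
log_mem() {
    interval=${1:-10}
    logfile=${2:-/tmp/mem.log}
    while :; do
        echo "$(date +%s) $(mem_available_kb)" >> "$logfile"
        sleep "$interval"
    done
}
```

Running `log_mem 10 /tmp/mem.log &` in the background and checking `top -b -n 1`, `dmesg` and `logread` from time to time should show whether available memory really shrinks. Note that `/tmp` lives in RAM on OpenWrt, so the log does not survive a power cycle.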

andyrichardson commented 3 years ago

Hey @dangowrt thanks for the quick reply. I've done some further investigation and I have a better reproduction.

I set up a WiFi access point and tried connecting to it using my Quest 2 (because it has WiFi 6). Judging by the 800 Mbps link speed and what I was seeing in LuCI ("AC"), the access point was only using 802.11ac rather than 802.11ax.

I looked at my config in /etc/config/wireless and sure enough:

config wifi-device 'radio1'
        option type 'mac80211'
        option hwmode '11a'
        option path '1a143000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0'
        option channel 'auto'
        option cell_density '0'
        option htmode 'VHT80'
        option txpower '20'

To my understanding, VHT80 implies 802.11ac with an 80 MHz channel width. So I did some investigation and came across this. I changed the htmode to HE80, reboot the router, spin up the Quest 2 and... boom 1.2gbps link speed 🚀

I continue to set up my router (a good 10ish minutes), make some network changes, and notice that the UI is crashing. I try SSH - no dice. I try to connect to the WiFi on my phone - it connects fine but fails at DHCP handshaking.

At this point, I'm convinced it's a recent change I made that caused the device to hang up. I reboot but no changes - I still can't connect. Only when I unplugged the router for a good few seconds was I able to connect back. But the same issue occurs again after some time.

So yeah - looks like the issue is that AX support is broken? I actually grabbed this router because it is listed on the site as having AX support, so I gathered the defaulting to VHT80 was due to LuCI not yet supporting AX rather than the firmware itself.

Anywhoo, I know WiFi 6 is still pretty exotic, so if there's anything I can do to help work on or test this to get it working, let me know!

TL;DR

Manually enabling WiFi 6 causes the following issues:

I'm guessing this is a memory leak 🤔

dangowrt commented 3 years ago

Yes, that definitely sounds like a memory leak. If you manage to login via SSH before the device starts thrashing and see which process (kernel or userland) is eating all that RAM, that'd be great.

Support for 11ax HE modes in OpenWrt is still incomplete. Apart from changes to LuCI, we will need support in hostapd/wpa_supplicant which is waiting for merge: https://patchwork.ozlabs.org/project/openwrt/list/?series=229531

Support for it in OpenWrt iwinfo radio abstraction layer has been added, but not merged to openwrt.git yet: https://git.openwrt.org/?p=project/iwinfo.git;a=commit;h=50b64a63e3945693646d1b2eb4e59e14e35cedc3

I reckon work on LuCI will only begin once those two are merged (but that should happen very soon).

Requiring a "cold" reboot as you describe could be related to MT7915 being stuck and PCIe reset not happening properly on reboot:

https://github.com/openwrt/mt76/issues/316

As those problems are supposedly bugs in OpenWrt (and not in the installer), I guess the best place to discuss them is https://forum.openwrt.org or our mailing list. This project here is just the installer.

andyrichardson commented 3 years ago

@dangowrt you are a hero - I'm pretty fresh to this stuff so all these resources + info are really appreciated 🔥

Will get back to you with some CPU/memory deets soon.

andyrichardson commented 3 years ago

Quick update. I'm trying to reproduce the issue and I'm getting abnormalities, but it isn't quite clear what's going on.

Memory and CPU usage looks fine for now - the first weird thing that has started happening is that WAN is no longer working ("Error: Network device is not present" in LuCI). Other than that, everything else seems fine (and I'm again seeing a 1.2Gbps on a WiFi 6 client).

Based on the links you've provided, it sounds like there are still a few moving parts that need to be put together. I realise that hostapd will be running with the default ieee80211ax=0 for the time being, so I'm not totally sure what setting the htmode is doing to get past this limitation.
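
A small sketch of how the ieee80211ax setting mentioned above could be checked on a running router. It assumes the generated hostapd config for the 5 GHz radio lives at `/var/run/hostapd-phy1.conf`, which may differ per build; the helper name `ax_status` is made up for illustration.

```sh
# Report whether a hostapd config file switches 802.11ax on.
# The file path is passed in because its location is an assumption.
ax_status() {
    if grep -q '^ieee80211ax=1' "$1"; then
        echo "11ax enabled"
    else
        echo "11ax disabled or unset"
    fi
}
```

Example: `ax_status /var/run/hostapd-phy1.conf`.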

I'll open a topic on the forum if you think that's the next best place to start :+1: in the meantime, I'm going to see if I can build some of these packages from source with the patches mentioned - I'm guessing the chance of bricking is low given it's all on the OS layer/unrelated to the kernel.

Edit - my bad, it is kernel related.

dangowrt commented 3 years ago

Sounds like you might have stumbled upon a problem with NAT-offloading related stuff in DSA. Edit: and there is probably already a fix for it, posted today: http://lists.infradead.org/pipermail/linux-mediatek/2021-April/023736.html

I've imported the fix to openwrt: https://github.com/openwrt/openwrt/commit/7f703716ae0e4cb4810eff37605b7594cef1edb8

Regarding 11ax/HE modes, you may just build OpenWrt with the linked patches and updated iwinfo sources on top and give that a shot. I'm sure @blogic and @blocktrron will appreciate any valuable feedback on the corresponding series. You don't risk much when trying to flash your own builds at this stage, as even in case of flashing completely broken firmware you still got the initramfs-recovery image. And even if you completely break UBI, you still got fall-back to TFTP.

andyrichardson commented 3 years ago

Latest master should be fine as a base, right?

Still working on building it - I'm guessing it's normal for it to take a few hours to cross-compile.

As long as I have a recovery to boot - I'm happy going HAM

dangowrt commented 3 years ago

Yes, regarding the flow-offloading bug which made wan interface blow up after a while, that's fixed for any commit after openwrt/openwrt@7f70371. Building from source for the first time takes ages because toolchain needs to be built as well. On my quite dated machine this takes ~2h.

blogic commented 3 years ago

we hammered the e8450 under a full load stress inside a lanforge chamber with 200 stations attached for the last 48hr and it is peaking out on performance and stability. this is running a 21.02 tree with the latest mt76 driver from felix's staging tree. not seen any glitches at all.

andyrichardson commented 3 years ago

> we hammered the e8450 under a full load stress inside a chamber with 200 stations attached for the last 48hr and it is peaking out on performance and stability. this is running a 21.02 tree with the latest mt76 driver from felix's staging tree. not seen any glitches at all.

Damn that's awesome to hear it's been battle tested!

> On my quite dated machine this takes ~2h.

I'm still working on getting those AX patches built. I left a master branch build going overnight and it's still going (macbook pro 2018). Maybe it went to sleep 🤷

Edit: It just finished after 12 hours - it must have gone to sleep.

andyrichardson commented 3 years ago

Sorry to be a pain - do either of you folks have any tips for getting the iwinfo patch applied? There doesn't look to be any source files in package/network/utils/iwinfo so quilt can't apply the patch.

Edit - nevermind - just putting the patch in package/network/utils/iwinfo/patches did the job
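
The fix described above can be sketched as a small helper. The function name `add_pkg_patch` and the example patch filename are hypothetical; the only thing taken from the thread is that the patch goes into the package's `patches` directory inside the OpenWrt tree.

```sh
# Copy a patch file into <tree>/<package dir>/patches so the OpenWrt
# build system applies it when the package sources are prepared.
add_pkg_patch() {
    tree=$1
    pkg=$2
    patch=$3
    [ -f "$patch" ] || { echo "no such patch: $patch" >&2; return 1; }
    mkdir -p "$tree/$pkg/patches" &&
    cp "$patch" "$tree/$pkg/patches/"
}
```

For example, `add_pkg_patch ~/openwrt package/network/utils/iwinfo 100-he-modes.patch` (hypothetical filename), followed by `make package/network/utils/iwinfo/clean package/network/utils/iwinfo/compile V=s` from the tree root to rebuild just that package.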

andyrichardson commented 3 years ago

An update on how the build went.

Ignore this

The good news is I managed to build the firmware from source and get it flashed. The bad news is it doesn't look to be working (for now).

> The built image [is here](https://github.com/dangowrt/linksys-e8450-openwrt-installer/files/6327607/Archive.zip) if anyone wants to try it.

The link speed according to Android (Oculus Quest 2) is 1200 Mbit/s transmit and 1200 Mbit/s receive, which should be WiFi 6/AX.

I didn't quite trust it so I spun up iperf and here are the results:

```
[ ID] Interval       Transfer     Bandwidth
[  6] local 192.168.1.1 port 5001 connected with 192.168.1.104 port 49864
[  6]  0.0-10.0 sec   629 MBytes   528 Mbits/sec
[  7] local 192.168.1.1 port 5001 connected with 192.168.1.104 port 49866
[  7]  0.0-10.2 sec   641 MBytes   527 Mbits/sec
[  8] local 192.168.1.1 port 5001 connected with 192.168.1.104 port 49868
[  8]  0.0-10.2 sec   673 MBytes   553 Mbits/sec
[  9] local 192.168.1.1 port 5001 connected with 192.168.1.104 port 49870
[  9]  0.0-10.1 sec   641 MBytes   532 Mbits/sec
[ 10] local 192.168.1.1 port 5001 connected with 192.168.1.104 port 49874
[ 10]  0.0-10.0 sec   611 MBytes   511 Mbits/sec
[ 11] local 192.168.1.1 port 5001 connected with 192.168.1.104 port 49876
[ 11]  0.0-10.2 sec   507 MBytes   416 Mbits/sec
```

For comparison, my macbook, which doesn't support AX, was getting faster speeds:

```
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   842 MBytes   704 Mbits/sec
[  5] local 192.168.1.1 port 5001 connected with 192.168.1.104 port 49808
[  5]  0.0-10.0 sec   868 MBytes   727 Mbits/sec
```

I'm going to hazard a guess and say that when trying to use AX, something is falling back to 802.11n speeds (theoretical max of ~600Mbit looks like a match). I'm stumped as to why. The link speed reported on Android leads me to think that the radio is operating in AX mode - but that wouldn't explain the slow transfer rate.

Almost forgot - no weird NAT issues this time around. I guess that flow-offloading fix worked :+1:
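
To compare runs like the ones quoted above without eyeballing them, the per-run figures can be averaged. This is only a sketch: the helper name `avg_mbits` is made up, and it assumes iperf's classic text output where each result line ends in `Mbits/sec`.

```sh
# Read iperf text output on stdin and print the mean of all
# per-run bandwidth figures, in Mbits/sec, to one decimal place.
avg_mbits() {
    awk '$NF == "Mbits/sec" { sum += $(NF - 1); n++ }
         END { if (n) printf "%.1f\n", sum / n }'
}
```

Saving the output to a file and running `avg_mbits < quest2.txt` (hypothetical filename) on the six Quest 2 runs gives roughly 511 Mbits/sec, against roughly 716 Mbits/sec for the two MacBook runs.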

With some tweaking of the iperf params, I've managed to hit 819 Mbit/s down over WiFi, which I think is around what we should expect for AX - right?

There was something funky going on with iperf and CPU saturation - might just be iperf - will come back with updates after the weekend once I've got a better idea of how it's handling real-world usage

yoshi3jp commented 3 years ago

I am just guessing here because I am new to the OpenWRT environment but here I go...

For AX to work on E8450, we need

  1. A working MediaTek MT7915 driver.
  2. Turn it into a kernel module.
  3. Load it into the kernel on boot.
  4. Set it up as "radio1", with the "htmode" which is yet to be supported by luci.

From the results of

root@OpenWrt:~# find /lib/modules/$(uname -r) -type f -name "*.ko"

and

root@OpenWrt:~# ls /lib/firmware/mediatek/

I am assuming that we are currently stuck at No.3 of the list above, as far as this installer goes. (Awesome work @dangowrt, cheers)

Now, I am very interested in what @andyrichardson is talking about here.

> I changed the htmode to HE80, reboot the router, spin up the Quest 2 and... boom 1.2gbps link speed 🚀

I am currently stuck at approximately 200Mbps with the following settings.

{"channel": "auto", "hwmode": "11g", "path": "platform/18000000.wmac", "cell_density": 0, "htmode": "VHT20"}

Will different settings improve the speed (AC or faster), or do I need something else?

dangowrt commented 3 years ago

@yoshi3jp The installer only converts the flash layout to use UBI and loads a reduced initramfs build of OpenWrt into the flash to serve as recovery OS. From there you still need to go on and install the actual OpenWrt sysupgrade image, as shown in the installation video, or follow https://github.com/dangowrt/linksys-e8450-openwrt-installer#steps, including the steps after step 8.

The mt76 driver is already loaded and should support HE rates on MT7915E. Support for 802.11ax in hostapd, iwinfo and LuCI is pending, so until it gets merged I will not include it in the installer image.

yoshi3jp commented 3 years ago

Thank you @dangowrt ! As you suggested, I took the following steps and the 5GHz antenna came up and running.

  1. Install the current snapshot version of OpenWRT (SNAPSHOT r16539-28623cab32)
  2. Install kmod-mt7915e

Once I saw radio1 up in luci,

  1. Configure SSID and Encryption
  2. Hit the Save & Apply button
  3. ssh into the router
  4. # uci set wireless.radio1.htmode='HE80'
  5. # /etc/init.d/network restart

This procedure has provided me with a 700Mbps+ internet connection. The macOS also reports 802.11ac connection. Thank you guys again!
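
The SSH part of the steps above could be wrapped up roughly like this. It is a sketch, not an official procedure: the helper name `set_htmode` and the `UCI` override variable are made up, and the list of accepted mode names is an assumption about what the installed build understands. Unlike the steps above, it also runs `uci commit` so the change is written to /etc/config/wireless.

```sh
# Set the htmode of a radio and commit it to /etc/config/wireless.
# UCI can be overridden (e.g. UCI=echo) to preview the commands
# without touching the configuration.
set_htmode() {
    radio=$1
    mode=$2
    case $mode in
        HT20|HT40|VHT20|VHT40|VHT80|VHT160|HE20|HE40|HE80|HE160) ;;
        *) echo "unsupported htmode: $mode" >&2; return 1 ;;
    esac
    ${UCI:-uci} set "wireless.$radio.htmode=$mode" &&
    ${UCI:-uci} commit wireless
}
```

On the router this would be used as `set_htmode radio1 HE80 && /etc/init.d/network restart`.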

dangowrt commented 3 years ago

@yoshi3jp I did not suggest to install kmod-mt7915e manually as it already comes with the image for that device. It should not be needed to do anything like that manually. If so, I'd consider that a bug in OpenWrt.
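
One way to verify this on a given image is to look for the driver in the kernel's module list. A minimal sketch, assuming the module is named `mt7915e` as the package name suggests; the helper name `module_loaded` is illustrative.

```sh
# Succeed if the named kernel module appears in a /proc/modules-style
# listing. The optional second argument allows checking a saved copy.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}
```

Example: `module_loaded mt7915e && echo "driver loaded" || echo "driver missing"`.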

andyrichardson commented 3 years ago

A quick thing to note - even with the hostapd patch, it looks like spatial streaming isn't currently supported so theoretically, AC might be faster for now.

[Image: Screenshot 2021-04-17 at 17 59 35]

blogic commented 3 years ago

use "iw dev wlanX station dump". the iwinfo tool most likely does not support mcs/nss for HE rates yet. I just checked and my android is connected using spatial streams on a HE only MCS.

andyrichardson commented 3 years ago

Thanks @blogic

I'm seeing this on my end

tx bitrate:     1200.9 MBit/s 80MHz HE-MCS 11 HE-NSS 2 HE-GI 0 HE-DCM 0
rx bitrate:     1200.9 MBit/s 80MHz HE-MCS 11 HE-NSS 2 HE-GI 0 HE-DCM 0

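
To pull just the HE MCS index and spatial stream count out of output like the above, something along these lines should work. It is a sketch that assumes the bitrate lines keep the field layout shown here; the helper name `he_rates` and the interface name in the example are made up.

```sh
# Read "iw dev <iface> station dump" output on stdin and print one
# summary line per tx/rx bitrate line: direction, rate, HE-MCS, HE-NSS.
# Non-HE rates are reported with "?" for both values.
he_rates() {
    awk '$2 == "bitrate:" {
        mcs = nss = "?"
        for (i = 3; i < NF; i++) {
            if ($i == "HE-MCS") mcs = $(i + 1)
            if ($i == "HE-NSS") nss = $(i + 1)
        }
        printf "%s %s %s mcs=%s nss=%s\n", $1, $3, $4, mcs, nss
    }'
}
```

Example: `iw dev wlan1 station dump | he_rates`, where `wlan1` stands in for whatever the 5 GHz interface is called.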
andyrichardson commented 3 years ago

I'm going to close this for now because, as @dangowrt mentioned, this isn't really related to this repo.

I'll see if I can find a good place on the forum to keep this conversation going and post it here 👍

Thanks everyone for your contributions!

dangowrt commented 3 years ago

@andyrichardson Thanks for testing and reporting! Looking forward to have you around in the forums :)

andyrichardson commented 3 years ago

One of the warmest welcomes I've had to an Open Source project - keep up the good work 🚀

https://forum.openwrt.org/t/belkin-rt3200-linksys-e8450-wifi-ax-discussion/94302

blocktrron commented 3 years ago

FTR I have pending patches for rpcd as well as LuCI, however the current state of iwinfo does not work, as the message parsing has to be refactored to use split messages since the latest mac80211, which is why I didn't push the update to the OpenWrt repo yet.

I hope i can get this done in the next 1 - 2 weeks.

damo901 commented 1 year ago

I'm wondering if someone could help me. I've spent 2 days trying to fix my router after a failed upgrade. I used the web GUI to upgrade my router to the latest firmware, which worked, but the WiFi wouldn't, so I pressed and held the reset button and now I'm stuck in an endless recovery mode. I've tried various files from OpenWrt and nothing