Closed andyrichardson closed 3 years ago
Sounds like the symptoms of running out of RAM. Apart from "30s unplugged, reboot alone won't do it", which is extremely odd.
Please log in to the router using SSH and use `top` to show the resource usage. If the issue can be reproduced reliably, you should see whatever is eating up all the RAM, and then we'll know more.
Also, the kernel and system logs may provide some information, i.e. either extract them from LuCI or use `dmesg` and `logread` via SSH.
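A minimal sketch of how those three tools could be captured in one go once you are logged in over SSH. The function name `collect_diag` and the default output directory `/tmp/diag` are made up for illustration; only `top`, `dmesg` and `logread` come from the advice above.

```shell
# Save one diagnostic snapshot into a directory (default: /tmp/diag).
# Every tool is treated as optional so the function still completes
# on a system where one of them is missing or fails.
collect_diag() {
    out="${1:-/tmp/diag}"
    mkdir -p "$out" || return 1
    # top: -b = batch (non-interactive) mode, -n 1 = a single iteration
    top -b -n 1 > "$out/top.txt" 2>&1 || true
    # Kernel ring buffer
    dmesg > "$out/dmesg.txt" 2>&1 || true
    # System log; logread is OpenWrt-specific, so check for it first
    if command -v logread >/dev/null 2>&1; then
        logread > "$out/logread.txt" 2>&1 || true
    fi
    # Overall memory counters straight from the kernel
    cat /proc/meminfo > "$out/meminfo.txt" 2>/dev/null || true
    echo "$out"
}
```

Running `collect_diag /tmp/diag-before` right after boot and `collect_diag /tmp/diag-after` once things start to slow down gives two sets of files to compare.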
Hey @dangowrt thanks for the quick reply. I've done some further investigation and I have a better reproduction.
I set up a WiFi access point and tried connecting to it using my Quest 2 (because it has WiFi 6). Judging by the 800 Mbps link speed and what I was seeing in LuCI ("AC"), the access point was only using 802.11ac rather than 802.11ax.
I looked at my config in `/etc/config/wireless` and sure enough:
config wifi-device 'radio1'
	option type 'mac80211'
	option hwmode '11a'
	option path '1a143000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0'
	option channel 'auto'
	option cell_density '0'
	option htmode 'VHT80'
	option txpower '20'
To my understanding, `VHT80` implies 802.11ac with an 80 MHz channel width. So I did some investigation and came across this. I changed the htmode to `HE80`, reboot the router, spin up the Quest 2 and... boom 1.2gbps link speed 🚀
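For anyone wanting to check which mode each radio is configured for before editing anything, here is a small sketch that reads the wireless config file directly. The helper name `list_htmode` is hypothetical; on the router itself `uci get wireless.radio1.htmode` should give the same answer for a single radio.

```shell
# Print "<radio> <htmode>" for every wifi-device section in a UCI wireless
# config file (default: /etc/config/wireless).
list_htmode() {
    awk '
        # A new "config" line starts a section; remember its name only
        # when it describes a radio (wifi-device).
        $1 == "config" {
            dev = ""
            if ($2 == "wifi-device") {
                dev = $3
                gsub(/[^A-Za-z0-9_-]/, "", dev)   # strip the quotes
            }
        }
        # Report the htmode option of the current radio section.
        $1 == "option" && $2 == "htmode" && dev != "" {
            mode = $3
            gsub(/[^A-Za-z0-9_-]/, "", mode)
            print dev, mode
        }
    ' "${1:-/etc/config/wireless}"
}
```

With the config shown above, `list_htmode` would print `radio1 VHT80`.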
I continue to set up my router (good 10ish minutes), make some network changes, and notice that the UI is crashing. I try SSH - no dice. I try connecting to the WiFi on my phone - it connects fine but fails at DHCP handshaking.
At this point, I'm convinced it's a recent change I made that caused the device to hang. I reboot but nothing changes - I still can't connect. Only when I unplugged the router for a good few seconds was I able to connect again. But the same issue occurs again after some time.
So yeah - looks like the issue is that AX support is broken? I actually grabbed this router because it is listed on the site as having AX support, so I gathered the default of `VHT80` was due to LuCI not yet supporting AX rather than the firmware itself.
Anywhoo, I know WiFi 6 is still pretty exotic, so if there's anything I can do to help work on/test this to get it working, let me know!
Manually enabling WiFi 6 causes the following issues:
I'm guessing this is a memory leak 🤔
Yes, that definitely sounds like a memory leak. If you manage to log in via SSH before the device starts thrashing and see which process (kernel or userland) is eating all that RAM, that'd be great.
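One possible way to spot a userland culprit is to rank processes by resident memory straight from `/proc`. This is only a sketch: the function name `top_rss` and its arguments are invented here. Kernel threads report no `VmRSS`, so a leak inside the kernel would not show up in this list and would have to be looked for in `/proc/meminfo` instead.

```shell
# List the N processes with the largest resident set size (VmRSS, in kB).
# $1 = how many entries to show (default 5)
# $2 = proc root (default /proc; overridable so the function can be tested)
top_rss() {
    n="${1:-5}"
    root="${2:-/proc}"
    for f in "$root"/[0-9]*/status; do
        # Each status file names the process and, for userland processes,
        # reports its resident memory; processes may vanish mid-scan.
        awk '/^Name:/ { name = $2 } /^VmRSS:/ { print $2, name }' "$f" 2>/dev/null
    done | sort -rn | head -n "$n"
}
```

Calling `top_rss 5` every minute or so and appending the output to a file under `/tmp` should show whether one process keeps growing.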
Support for 11ax HE modes in OpenWrt is still incomplete. Apart from changes to LuCI, we will need support in hostapd/wpa_supplicant which is waiting for merge: https://patchwork.ozlabs.org/project/openwrt/list/?series=229531
Support for it in the OpenWrt iwinfo radio abstraction layer has been added, but not merged to openwrt.git yet:
https://git.openwrt.org/?p=project/iwinfo.git;a=commit;h=50b64a63e3945693646d1b2eb4e59e14e35cedc3
I reckon work on LuCI will only begin once those two are merged (but that should happen very soon).
Requiring a "cold" reboot as you describe could be related to MT7915 being stuck and PCIe reset not happening properly on reboot:
https://github.com/openwrt/mt76/issues/316
As those problems are supposedly bugs in OpenWrt (and not in the installer), I guess the best place to discuss them is https://forum.openwrt.org or our mailing list. This project here is just the installer.
@dangowrt you are a hero - I'm pretty fresh to this stuff so all these resources + info are really appreciated 🔥
Will get back to you with some CPU/memory deets soon.
Quick update. I'm trying to reproduce the issue and I'm getting abnormalities, but it isn't quite clear what's going on.
Memory and CPU usage look fine for now - the first weird thing that has started happening is that WAN is no longer working ("Error: Network device is not present" in LuCI). Other than that, everything else seems fine (and I'm again seeing 1.2Gbps on a WiFi 6 client).
Based on the links you've provided, it sounds like there are still a few moving parts that need to be put together. I realise that hostapd will be running with the default `ieee80211ax=0` for the time being, so I'm not totally sure what setting the `htmode` is doing to get past this limitation.
I'll open a topic on the forum if you think that's the next best place to start :+1: in the meantime, I'm going to see if I can build some of these packages from source with the patches mentioned - I'm guessing the chance of bricking is low given it's all on the OS layer/unrelated to the kernel.
Edit - my bad, it is kernel related.
Sounds like you might have stumbled upon a problem with NAT-offloading related stuff in DSA. Edit: and there is probably already a fix for it, posted today: http://lists.infradead.org/pipermail/linux-mediatek/2021-April/023736.html
I've imported the fix to openwrt: https://github.com/openwrt/openwrt/commit/7f703716ae0e4cb4810eff37605b7594cef1edb8
Regarding 11ax/HE modes, you may just build OpenWrt with the linked patches and updated iwinfo sources on top and give that a shot. I'm sure @blogic and @blocktrron will appreciate any valuable feedback on the corresponding series. You don't risk much when trying to flash your own builds at this stage, as even in case of flashing completely broken firmware you still got the initramfs-recovery image. And even if you completely break UBI, you still got fall-back to TFTP.
Latest master should be fine as a base, right?
Still working on building it - I'm guessing it's normal for it to take a few hours to cross-compile.
As long as I have a recovery to boot - I'm happy going HAM
Yes, regarding the flow-offloading bug which made the `wan` interface blow up after a while, that's fixed for any commit after openwrt/openwrt@7f70371.
Building from source for the first time takes ages because the toolchain needs to be built as well. On my quite dated machine this takes ~2h.
we hammered the e8450 under a full load stress inside a lanforge chamber with 200 stations attached for the last 48hr and it is peaking out on performance and stability. this is running a 21.02 tree with the latest mt76 driver from felix's staging tree. not seen any glitches at all.
> we hammered the e8450 under a full load stress inside a chamber with 200 stations attached for the last 48hr and it is peaking out on performance and stability. this is running a 21.02 tree with the latest mt76 driver from felix's staging tree. not seen any glitches at all.
Damn that's awesome to hear it's been battle tested!
> On my quite dated machine this takes ~2h.
I'm still working on getting those AX patches built. I left a master branch build going overnight and it's still going (macbook pro 2018). Maybe it went to sleep 🤷
Edit: It just finished after 12 hours - it must have gone to sleep.
Sorry to be a pain - do either of you folks have any tips for getting the iwinfo patch applied? There don't appear to be any source files in `package/network/utils/iwinfo`, so quilt can't apply the patch.
Edit - never mind - just putting the patch in `package/network/utils/iwinfo/patches` did the job.
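Roughly what that amounts to, as a sketch: files in a package's `patches` directory get applied to the package source when it is prepared for building. The helper name `add_pkg_patch` and its arguments are made up for illustration.

```shell
# Copy a patch into a package's patches/ directory inside an OpenWrt tree.
# $1 = path to the OpenWrt source tree
# $2 = package directory relative to the tree
# $3 = patch file to add
add_pkg_patch() {
    tree="$1"
    pkg="$2"
    patch="$3"
    dest="$tree/$pkg/patches"
    # Create patches/ if the package does not have one yet, then copy.
    mkdir -p "$dest" && cp "$patch" "$dest/"
}
```

For this case that would be something like `add_pkg_patch ~/openwrt package/network/utils/iwinfo my-iwinfo.patch` (both the tree location and the patch file name are placeholders), followed by `make package/network/utils/iwinfo/clean` and `make package/network/utils/iwinfo/compile V=s` to rebuild just that package.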
An update on how the build went.
With some tweaking of the iperf params, I've managed to hit 819 Mbit/s down over WiFi, which I think is around what we should expect for AX - right?
There was something funky going on with iperf and CPU saturation - might just be iperf - will come back with updates after the weekend once I've got a better idea of how it's handling real-world usage
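As a rough sanity check on that figure, here is the nominal 802.11ax PHY rate, assuming an 80 MHz channel (980 data subcarriers), two spatial streams, MCS 11 (1024-QAM, 10 bits per subcarrier, coding rate 5/6) and a 0.8 µs guard interval. Those link parameters are assumptions for the sake of the calculation, not something measured in this iperf run:

```latex
R = \frac{N_{SD} \cdot N_{BPSCS} \cdot R_c \cdot N_{SS}}{T_{DFT} + T_{GI}}
  = \frac{980 \cdot 10 \cdot \tfrac{5}{6} \cdot 2}{12.8\,\mu\text{s} + 0.8\,\mu\text{s}}
  \approx 1201\ \text{Mbit/s}
```

TCP throughput always lands well below the PHY rate because of MAC and protocol overhead, so under those assumptions 819 Mbit/s would be roughly 68% of the nominal link rate.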
I am just guessing here because I am new to the OpenWRT environment but here I go...
For AX to work on E8450, we need
From the results of
root@OpenWrt:~# find /lib/modules/$(uname -r) -type f -name "*.ko"
and
root@OpenWrt:~# ls /lib/firmware/mediatek/
I am assuming that we are currently stuck at No.3 of the list above, as far as this installer goes.
(Awesome work @dangowrt , cheers)
Now, I am very interested in what @andyrichardson is talking about here.
> I changed the htmode to `HE80`, reboot the router, spin up the Quest 2 and... boom 1.2gbps link speed 🚀
I am currently stuck at approximately 200Mbps with the following settings.
{"channel": "auto", "hwmode": "11g", "path": "platform/18000000.wmac", "cell_density": 0, "htmode": "VHT20"}
Will the different settings improve the speed(AC or faster), or do I need something else?
@yoshi3jp The installer only converts the flash layout to use UBI and loads a reduced initramfs build of OpenWrt into the flash to serve as recovery OS. From there you still need to go on and install the actual OpenWrt sysupgrade image, as shown in the installation video, or follow https://github.com/dangowrt/linksys-e8450-openwrt-installer#steps also after step 8. The mt76 driver is already loaded and should support HE rates on MT7915E. Support for 802.11ax in hostapd, iwinfo and LuCI is pending, so until those changes get merged I will not include them in the installer image.
Thank you @dangowrt ! As you suggested, I took the following steps and the 5GHz antenna came up and running.
- Installed `kmod-mt7915e`
- Once I saw radio1 up in LuCI, clicked the Save & Apply button
- Ran the following over SSH:
# uci set wireless.radio1.htmode='HE80'
# /etc/init.d/network restart
This procedure has provided me with a 700Mbps+ internet connection. The macOS also reports 802.11ac connection. Thank you guys again!
@yoshi3jp I did not suggest installing `kmod-mt7915e` manually, as it already comes with the image for that device. It should not be necessary to do anything like that manually. If it is, I'd consider that a bug in OpenWrt.
A quick thing to note - even with the hostapd patch, it looks like spatial streaming isn't currently supported so theoretically, AC might be faster for now.
Use "iw dev wlanX station dump"; the iwinfo tool most likely does not support MCS/NSS for HE rates yet. I just checked and my Android is connected using spatial streams on an HE-only MCS.
Thanks @blogic
I'm seeing this on my end
tx bitrate: 1200.9 MBit/s 80MHz HE-MCS 11 HE-NSS 2 HE-GI 0 HE-DCM 0
rx bitrate: 1200.9 MBit/s 80MHz HE-MCS 11 HE-NSS 2 HE-GI 0 HE-DCM 0
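For anyone who wants to pull just those fields out of a longer station dump, a small sketch follows. The function name `he_rates` is made up; it only assumes the `tx bitrate:` / `rx bitrate:` line format shown above.

```shell
# Read `iw dev <iface> station dump` output on stdin and print, for each
# bitrate line: direction, rate in MBit/s, HE MCS index and stream count.
# Lines without HE fields (e.g. VHT links) are printed with "-" placeholders.
he_rates() {
    awk '
        /(tx|rx) bitrate:/ {
            mcs = "-"; nss = "-"
            # Each HE-* key is followed by its value in the next field.
            for (i = 1; i < NF; i++) {
                if ($i == "HE-MCS") mcs = $(i + 1)
                if ($i == "HE-NSS") nss = $(i + 1)
            }
            print $1, $3, "MCS=" mcs, "NSS=" nss
        }
    '
}
```

Usage would be along the lines of `iw dev wlan1 station dump | he_rates`, where `wlan1` is a guess at the interface name. For the two lines above it prints `tx 1200.9 MCS=11 NSS=2` and `rx 1200.9 MCS=11 NSS=2`.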
I'm going to close this for now because, as @dangowrt mentioned, this isn't really related to this repo.
I'll see if I can find a good place on the forum to keep this conversation going and post it here 👍
Thanks everyone for your contributions!
@andyrichardson Thanks for testing and reporting! Looking forward to having you around in the forums :)
One of the warmest welcomes I've had to an Open Source project - keep up the good work 🚀
https://forum.openwrt.org/t/belkin-rt3200-linksys-e8450-wifi-ax-discussion/94302
FTR I have pending patches for rpcd as well as LuCI. However, the current state of iwinfo does not work, as the message parsing has to be refactored to use split messages since the latest mac80211, which is why I haven't pushed the update to the OpenWrt repo yet.
I hope I can get this done in the next 1 - 2 weeks.
I'm wondering if someone could help me. I've spent 2 days trying to fix my router after a failed upgrade. I used the web GUI to upgrade my router to the latest firmware, which worked, but the WiFi wouldn't, so I pressed and held the reset button and now I'm stuck in an endless recovery mode. I've tried various files from OpenWrt and nothing has worked.
Hey there - thanks for taking the time to put this together!
Edit: Initial post was misleading, see this comment for issue summary.
Initial post
I just followed the tutorial and managed to flash 0.4 on my RT3200. After a few hours of using it, however, I've noticed that things are very unstable - notably:
- Network speed maxes out at 20mbit/s (wired and WiFi)
- LuCI becomes unresponsive after ~30 minutes - requires reset (30s unplugged, reboot alone won't do it)
- DHCP sometimes just stops working
- SSH sometimes doesn't accept connections/times out

So yeah - I'm wondering whether others are finding this to be unusable also or whether I'm doing something wrong.