Open danielkza opened 6 years ago
Hi,
What were you running before that was stable?
Have you tried changing the rekeying interval to 86400?
Have you tried to disable sleep mode on the clients?
I haven't seen your specific error message, but I have not had a really stable experience with this router (on trunk mwlwifi and trunk OpenWrt) compared to the TP-Link 4320 that was its predecessor.
Cheers,
Damien
Hello @dmascord, thank you for helping out.
What were you running before that was stable?
LEDE 17.01.2.
Have you tried changing the rekeying interval to 86400?
Not yet, but given I observe very high latency spikes in periods much shorter than the default rekey interval (600 seconds), I would guess it's not the culprit. The kernel warnings also don't seem related to rekeying (I've seen them with intervals of minutes to hours, without any discernible pattern).
Have you tried to disable sleep mode on the clients?
There are 80+ clients at peak times, including laptops and smartphones of many different manufacturers, so I don't have control over all of them. Those exact same clients were working much better before the firmware upgrade.
At first I suspected the issue was related to enabling 802.11r, but after disabling it I did not notice any improvement. I also tried disabling AMSDU as suggested in #207, but it did not improve things.
Does the problem only happen with a high number of clients?
Are you able to test with the latest trunk OpenWrt + latest trunk mwlwifi?
Give OpenWrt 18.06.1 a try, as it includes a more recent version of mwlwifi.
I have a 1900ACS and started getting this message after flashing with the 2018-08-10 driver recently merged to openwrt trunk:
`kern.err kernel: [48650.605939] ieee80211 phy1: cmd 0x9128=SetSpectrumMgmt timed out
kern.err kernel: [48650.611993] ieee80211 phy1: return code: 0x9101
kern.err kernel: [48650.616560] ieee80211 phy1: timeout: 0x1128
daemon.notice hostapd: nl80211: nl80211_recv_beacons->nl_recvmsgs failed: -5
daemon.notice hostapd: nl80211: nl80211_recv_beacons->nl_recvmsgs failed: -5`
@b-h-l-c I suggest mentioning a specific driver version. "latest driver" is hard to track down once time passes.
@b-h-l-c try 18.06.1, which comes with the 2018-06-15 driver. Test whether that works; then install eduperez's binaries from https://github.com/eduperez/mwlwifi_LEDE/releases on top (currently 2018-08-10) and see whether the problem returns with OpenWrt's stable version and the same mwlwifi version that gave you problems.
My parents have a WRT1900ACv2 : [ 0.000000] OF: fdt: Machine model: Linksys WRT1900ACv2
That was running a ~2015 vintage compile and I had set up a script that checked dmesg every 5 minutes for a wireless timeout and re-started the wireless. Since upgrading to 18.06.1 with driver 10.3.8.0-20180615 a couple of weeks ago I haven't seen so much as a blip (which is so very nice after 3 years of annoying "I've re-started the wireless" log e-mails).
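For anyone wanting a similar stopgap, here is a minimal sketch of such a check. It is not the script mentioned above, only an illustration: the function names are made up, the regular expression is based on the timeout lines quoted in this thread, and the restart step assumes OpenWrt's `wifi` command is what you want to run. It would have to be scheduled separately (for example from cron) and would need Python on the router.

```python
import re
import subprocess

# Matches the driver message quoted in this thread, e.g.
# "ieee80211 phy0: cmd 0x9128=SetSpectrumMgmt timed out"
CMD_TIMEOUT_RE = re.compile(r"ieee80211 phy\d+: cmd 0x[0-9a-fA-F]+=\w+ timed out")
# Kernel timestamp such as "[48650.605939]" or "[ 2082.685205]"
STAMP_RE = re.compile(r"\[\s*(\d+\.\d+)\]")


def newest_timeout(dmesg_text):
    """Return the kernel timestamp of the last timeout line, or None."""
    newest = None
    for line in dmesg_text.splitlines():
        if CMD_TIMEOUT_RE.search(line):
            stamp = STAMP_RE.search(line)
            if stamp:
                newest = float(stamp.group(1))
    return newest


def check_once(last_seen, read_dmesg, restart_wifi):
    """Restart wireless if a timeout newer than last_seen has appeared.

    Returns the timestamp to remember for the next run.
    """
    newest = newest_timeout(read_dmesg())
    if newest is not None and (last_seen is None or newest > last_seen):
        restart_wifi()
        return newest
    return last_seen


def read_dmesg():
    """Read the kernel log via the dmesg command."""
    return subprocess.run(["dmesg"], capture_output=True, text=True).stdout


def restart_wifi():
    """Assumption: OpenWrt's `wifi` command is the desired restart action."""
    subprocess.run(["wifi"], check=False)
```

Passing the reader and the restart action as arguments keeps the decision logic testable off the router; the remembered timestamp avoids restarting repeatedly for the same old log entry.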
Similar story here. Running OpenWrt 18.06.1 r7258-5eb055306f on Linksys WRT1900ACS and getting occasional instability with kernel logs showing:
[916024.689469] ieee80211 phy0: cmd 0x9128=SetSpectrumMgmt timed out
[916024.695608] ieee80211 phy0: return code: 0x9101
[916024.700261] ieee80211 phy0: timeout: 0x1128
I'm seeing this as well on Linksys WRT1900ACS using OpenWrt 18.06.1 r7258-5eb055306f. It doesn't seem to be related to load, as it happened while everyone was asleep and there are only ever 2 or 3 clients.
It begins with this error and then progresses to timeouts in other commands.
[337831.240918] ieee80211 phy0: cmd 0x9125=BAStream timed out
Does anyone know of a way to recover from this without rebooting? I tried to reset the PCI device by doing echo 1 > /sys/module/mwlwifi/drivers/pci\:mwlwifi/0000\:02\:00.0/remove, but that hung and eventually crashed the box.
Please check 10.3.8.0-20181029.
Not sure if this is related; I can post separately if needed. I have been getting a bunch of kernel errors on the latest build, 10.3.8.0-20181029. I have the WRT1900ACv2.
Wed Oct 31 14:36:38 2018 daemon.notice hostapd: wlan1: AP-STA-DISCONNECTED 08:e6:89:94:8f:ce
Wed Oct 31 14:36:38 2018 daemon.info hostapd: wlan1: STA 08:e6:89:94:8f:ce IEEE 802.11: disassociated due to inactivity
Wed Oct 31 14:36:42 2018 kern.err kernel: [148286.607872] ieee80211 phy1: cmd 0x9122=UpdateEncryption timed out
Wed Oct 31 14:36:42 2018 kern.err kernel: [148286.614084] ieee80211 phy1: return code: 0x1122
Wed Oct 31 14:36:42 2018 kern.err kernel: [148286.618740] ieee80211 phy1: timeout: 0x1122
Wed Oct 31 14:36:42 2018 kern.err kernel: [148286.623036] wlan1: failed to remove key (0, 08:e6:89:94:8f:ce) from hardware (-5)
Wed Oct 31 14:36:42 2018 daemon.info hostapd: wlan1: STA 08:e6:89:94:8f:ce IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Wed Oct 31 14:36:42 2018 kern.debug kernel: [148286.631066] ieee80211 phy1: MACREG_REG_INT_CODE: 0x0000
Wed Oct 31 14:36:46 2018 kern.err kernel: [148290.631780] ieee80211 phy1: cmd 0x9111=SetNewStation timed out
Wed Oct 31 14:36:46 2018 kern.err kernel: [148290.637743] ieee80211 phy1: return code: 0x1111
Wed Oct 31 14:36:46 2018 kern.err kernel: [148290.642378] ieee80211 phy1: timeout: 0x1111
Wed Oct 31 14:37:48 2018 daemon.notice hostapd: wlan1: AP-STA-DISCONNECTED 5c:1d:d9:d2:a0:47
Wed Oct 31 14:37:48 2018 daemon.info hostapd: wlan1: STA 5c:1d:d9:d2:a0:47 IEEE 802.11: disassociated due to inactivity
Wed Oct 31 14:37:48 2018 kern.debug kernel: [148352.174564] ieee80211 phy1: MACREG_REG_INT_CODE: 0x0000
Wed Oct 31 14:37:52 2018 kern.err kernel: [148356.174303] ieee80211 phy1: cmd 0x9122=UpdateEncryption timed out
Wed Oct 31 14:37:52 2018 kern.err kernel: [148356.180525] ieee80211 phy1: return code: 0x1122
Wed Oct 31 14:37:52 2018 kern.err kernel: [148356.185179] ieee80211 phy1: timeout: 0x1122
Wed Oct 31 14:37:52 2018 kern.err kernel: [148356.189474] wlan1: failed to remove key (0, 5c:1d:d9:d2:a0:47) from hardware (-5)
Hi, which router and what is the client?
Oh sorry, my bad. WRT1900AC v2.
I should say that when this happened, three different devices lost their sessions: laptops and an Apple mobile device. WiFi “appeared” to still be up, but the clients couldn’t reach or ping the router. Connectivity resumed after a couple of minutes.
Hi, I'm also running into the exact same problem. Was this resolved by installing the 10.3.8.0-20181029 version?
On my WRT1900ACS with the Intel 8260, the November (and December) drivers resulted in an unusable 5 GHz connection. Maximum throughput was about 2 Mbps (with peaks of 5-10 Mbps), but as more apps were launched the connection became almost completely unusable, with not even pings going out. Opening web pages then became impossible, since most would time out with either DNS resolution errors or the request itself timing out.
I have reverted to the driver dated 2018-10-29 and am back to getting 300+ Mbps on 5 GHz. In dmesg I am getting quite a lot of messages regarding Start BA with my MAC address in them, so I am assuming this problem has something to do with block ACKs not completing properly, or with power save not being implemented properly (since there are changes to U-APSD in the recent code changes).
This is the commit I am on for the Makefile that does not have this issue:
PKG_SOURCE_URL:=https://github.com/kaloz/mwlwifi
PKG_SOURCE_PROTO:=git
PKG_SOURCE_DATE:=2018-10-29
PKG_SOURCE_VERSION:=382700ce5744fe80271c57a89c6589e767d91620
PKG_MIRROR_HASH:=7378b7d391aeec7a86b6548723911d21bc780ea84ceea858e910cc65a1d925c6
:~# opkg list | grep mwlwifi
kmod-mwlwifi - 4.14.88+2018-10-29-382700ce-1
mwlwifi-firmware-88w8864 - 2018-10-29-382700ce-1
This is still happening as of the latest heads/openwrt-18.06 (6e16dd1) and the latest mwlwifi driver version at c1345bb on a WRT1900ACS:
root@OpenWrt:~# cat /etc/openwrt_*
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='18.06-SNAPSHOT'
DISTRIB_REVISION='r7661-6e16dd1234'
DISTRIB_TARGET='mvebu/cortexa9'
DISTRIB_ARCH='arm_cortex-a9_vfpv3'
DISTRIB_DESCRIPTION='OpenWrt 18.06-SNAPSHOT r7661-6e16dd1234'
DISTRIB_TAINTS='no-all'
r7661-6e16dd1234
root@OpenWrt:~# uname -a
Linux OpenWrt 4.14.94 #0 SMP Fri Jan 25 22:50:49 2019 armv7l GNU/Linux
root@OpenWrt:~# opkg list-installed | grep mwl
kmod-mwlwifi - 4.14.94+2018-12-10-c1345bb1-1
mwlwifi-firmware-88w8864 - 2018-12-10-c1345bb1-1
root@OpenWrt:~# opkg list-installed | grep kerne
kernel - 4.14.94-1-5086a162caff1cf7b1e225a117242df5
WiFi is unusable on either 2.4 GHz or 5 GHz. Here are the radio configs:
config wifi-device 'radio0'
    option type 'mac80211'
    option channel '36'
    option hwmode '11a'
    option path 'soc/soc:pcie/pci0000:00/0000:00:01.0/0000:01:00.0'
    option htmode 'VHT80'
    option disabled '0'
    option country 'GB'
config wifi-device 'radio1'
    option type 'mac80211'
    option channel '11'
    option hwmode '11n'
    option path 'soc/soc:pcie/pci0000:00/0000:00:02.0/0000:02:00.0'
    option htmode 'HT40-'
    option disabled '0'
    option country 'GB'
I don't see the same errors in the kernel logs, but some devices just keep disconnecting every minute or so (especially my Android phone).
I have the same problem with version 382700c mentioned by @swg0101, above.
I can confirm @swg0101's observations after upgrading to a LEDE snapshot build with driver 10.3.8.0-20181210. After some time, connected stations stop being able to carry any traffic, and the kernel log is spammed with messages of the form Mac80211 start BA <macaddr>. One station on which I see drops at intervals of minutes is a TP-Link WDR4300 running the same LEDE snapshot in WDS mode, but others seem to exhibit the same issue after different periods.
I can confirm this is still an issue as of today on the WRT1900ACS v2 with the latest heads/openwrt-18.06 (6e16dd1) mwlwifi c1345bb.
The 5 GHz card stops responding and the network disappears. The only solution is to reboot the router. This seems to happen more frequently under heavy load, but I am not sure it is related.
Can anyone mention a stable version of mwlwifi/OpenWrt so I can try to track down the issue?
Try removing the ‘option country’ lines (both 2.4 and 5 GHz) completely and reboot.
Could you please explain why this should help? It would be helpful in understanding the root of the problem.
I heard that deleting the country code line could help with some issues. It is not an obvious solution, just my suggestion. :) But who knows? I think it is worth a try. :)
Superstition and computers do not belong together :)
Just an update from me: my issues were actually caused by ND relaying, which would cause Android to disconnect for whatever reason. Unfortunately, I had to take the reprehensible step of disabling IPv6 until I have time to debug it further. So may not be related to the issues the rest of you are seeing.
@tbregolin What does "ND" stand for (in "ND relaying")? Thank you.
Neighbour Discovery.
So, an update from me as well. I have reason to believe the issue brought forward by the OP has to do with commit e5e0700, which shortens the MAX_WAIT_FW_COMPLETE_ITERATIONS PCIe communication wait time and causes timeout issues.
I downgraded the mwlwifi-firmware package to a version preceding that commit, i.e. ec0adbf, on the 18.06.2 OpenWrt stable release, and the WiFi has been working flawlessly on both 2.4 and 5 GHz.
The next step would be to compile mwlwifi-firmware with that commit reverted and test the WiFi stability to prove that this is indeed the cause of the issue.
I can confirm that downgrading the mwlwifi-firmware package to version ec0adbf completely fixes the issue reported by many users in this thread, at least on my unit.
WiFi throughput is excellent, over 500 Mbps (I am on FTTH at up to 1 Gbps), with about 6-7 devices connected.
To be noted, perhaps, is the fact that DD-WRT (which I have tried briefly) uses that exact same firmware version and users report excellent performance and a stable connection.
More to follow once I actually recompile the firmware with MAX_WAIT_FW_COMPLETE_ITERATIONS set back to 10000 instead of 2000.
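To illustrate why a smaller iteration budget could turn a slow-but-successful command into a reported timeout, here is a generic bounded-polling sketch. It is not the driver's code: `wait_for_completion` and `make_firmware` are hypothetical names, the simulated firmware that answers on its 5000th poll is an invented number, and the real per-iteration delay is not stated in this thread. Only the two budgets, 2000 and 10000, come from the comments above.

```python
def wait_for_completion(poll, max_iterations):
    """Poll until poll() returns True or the iteration budget runs out.

    Returns the number of polls used, or None if the budget was exhausted
    (which a caller would report as a timeout).
    """
    for attempt in range(1, max_iterations + 1):
        if poll():
            return attempt
    return None


def make_firmware(ready_after):
    """Hypothetical stand-in for firmware that completes on the ready_after-th poll."""
    state = {"polls": 0}

    def poll():
        state["polls"] += 1
        return state["polls"] >= ready_after

    return poll


# A command that needs 5000 polls (invented figure) fails with a budget
# of 2000 but succeeds with a budget of 10000.
slow_with_small_budget = wait_for_completion(make_firmware(5000), 2000)
slow_with_large_budget = wait_for_completion(make_firmware(5000), 10000)
```

If the firmware occasionally needs longer than the reduced budget allows, the command itself may still be fine and only the wait gives up early, which would be consistent with the downgrade helping. That remains a hypothesis until the revert is tested.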
I can confirm this is still an issue as of today on the WRT1900ACS v2 with the latest custom build from Davidc502.
@ingamedeo Any update?
I am also experiencing timeouts and poor stability on my WRT1900AC V1. My configuration consists of 2 SSIDs on the 2.4 GHz radio. The timeouts and poor stability are apparently triggered when 2 or more guests (Android cellphones) are connected. I have tried OpenWrt v18.06.1, v18.06.2 and the latest DD-WRT. I also downgraded to ec0adbf, to no avail.
@SuoaJ
What about the factory firmware?
Good question. I never tried it. Flashed OpenWrt as soon as I got the router about 1 month ago. Maybe, I should. Are you suggesting that I might get better results with stock firmware?
@cowwoc It was related to the commit I was mentioning before. After reverting that commit and compiling the latest version I do not experience any issue anymore.
@SuoaJ Check the log messages, are they similar to the ones posted by the OP?
@ingamedeo, I hadn't noticed these specific messages before, but now that you have mentioned it, I think I see something similar in my system and kernel logs, for example:
KERNEL LOG
[ 20.528096] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[ 20.534609] br-lan: port 3(wlan0) entered blocking state
[ 20.539940] br-lan: port 3(wlan0) entered forwarding state
[ 20.601909] IPv6: ADDRCONF(NETDEV_UP): wlan0-1: link is not ready
[ 20.697022] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0-1: link becomes ready
[ 2082.685205] ieee80211 phy0: cmd 0x9101=SetApBeacon timed out
[ 2082.691010] ieee80211 phy0: return code: 0x9136
[ 2082.695621] ieee80211 phy0: timeout: 0x1101
[ 2437.871201] ieee80211 phy0: cmd 0x9128=SetSpectrumMgmt timed out
[ 2437.877259] ieee80211 phy0: return code: 0x9101
[ 2437.881804] ieee80211 phy0: timeout: 0x1128
[77104.796473] ieee80211 phy0: cmd 0x9128=SetSpectrumMgmt timed out
[77104.802525] ieee80211 phy0: return code: 0x9101
[77104.807099] ieee80211 phy0: timeout: 0x1128
[126708.626648] ieee80211 phy0: cmd 0x9101=SetApBeacon timed out
[126708.632471] ieee80211 phy0: return code: 0x9136
[126708.637353] ieee80211 phy0: timeout: 0x1101
[157716.797139] ieee80211 phy0: cmd 0x9128=SetSpectrumMgmt timed out
[157716.803308] ieee80211 phy0: return code: 0x9101
[157716.808032] ieee80211 phy0: timeout: 0x1128
[157791.976000] ieee80211 phy0: cmd 0x9125=BAStream timed out
[157791.981540] ieee80211 phy0: return code: 0x9101
[157791.986196] ieee80211 phy0: timeout: 0x1125
[166589.548330] ieee80211 phy0: cmd 0x9101=SetApBeacon timed out
[166589.554142] ieee80211 phy0: return code: 0x9136
[166589.558804] ieee80211 phy0: timeout: 0x1101
[172890.881186] ieee80211 phy0: cmd 0x9101=SetApBeacon timed out
[172890.887436] ieee80211 phy0: return code: 0x9136
[172890.892280] ieee80211 phy0: timeout: 0x1101
[340719.899916] ieee80211 phy0: cmd 0x9128=SetSpectrumMgmt timed out
[340719.906165] ieee80211 phy0: return code: 0x9101
[340719.910827] ieee80211 phy0: timeout: 0x1128
[340919.540463] ieee80211 phy0: cmd 0x9128=SetSpectrumMgmt timed out
[340919.546607] ieee80211 phy0: return code: 0x9101
[340919.551238] ieee80211 phy0: timeout: 0x1128
SYSTEM LOG
Thu Feb 21 20:00:27 2019 kern.err kernel: [340719.899916] ieee80211 phy0: cmd 0x9128=SetSpectrumMgmt timed out
Thu Feb 21 20:00:27 2019 kern.err kernel: [340719.906165] ieee80211 phy0: return code: 0x9101
Thu Feb 21 20:00:27 2019 kern.err kernel: [340719.910827] ieee80211 phy0: timeout: 0x1128
Thu Feb 21 20:00:27 2019 daemon.notice hostapd: nl80211: nl80211_recv_beacons->nl_recvmsgs failed: -5
Thu Feb 21 20:00:27 2019 daemon.notice hostapd: nl80211: nl80211_recv_beacons->nl_recvmsgs failed: -5
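When comparing logs like these across routers, it can help to tally which commands time out and how often. The following is a rough sketch with made-up helper names; the regular expression is based only on the message format quoted in this thread. The second helper encodes a pattern visible in every log posted here (the "timeout:" value equals the "cmd" value with bit 0x8000 cleared, e.g. 0x9128 and 0x1128); whether that bit has a defined meaning in the firmware interface is an assumption, not something this thread establishes.

```python
import re
from collections import Counter

# e.g. "ieee80211 phy0: cmd 0x9128=SetSpectrumMgmt timed out"
CMD_RE = re.compile(r"ieee80211 (phy\d+): cmd (0x[0-9a-fA-F]+)=(\w+) timed out")


def count_timeouts(log_text):
    """Count timed-out commands per (phy, command name)."""
    counts = Counter()
    for match in CMD_RE.finditer(log_text):
        phy, _code, name = match.groups()
        counts[(phy, name)] += 1
    return counts


def timeout_matches_cmd(cmd_code, timeout_code):
    """True if timeout_code is cmd_code with bit 0x8000 cleared.

    This is only the pattern observed in the logs posted in this thread.
    """
    return (cmd_code & ~0x8000) == timeout_code
```

Feeding it the output of dmesg or logread gives a quick per-command count, which makes it easier to say whether two reports show the same failing commands.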
Look in the other thread, try following my advice, and see what happens.
@BrainSlayer, which other thread? Thanks.
Don't bother. Got the thread.
Please share.
Can anyone confirm whether this has been fixed in 18.06.2?
Since the driver has remained unmaintained and unchanged for this chipset for the last 2 years, nothing has changed.
@denibertovic This is still a problem in 2021
I recently upgraded a Linksys WRT1900ACS to the newest OpenWrt release candidate, and I observed significantly degraded stability, with dozens of Wi-Fi clients periodically experiencing high ping spikes and disconnections. Looking at the system logs, I see the following kernel warnings showing up from time to time: