RPi-Distro / firmware-nonfree

184 stars 103 forks source link

CTRL-EVENT-ASSOC-REJECT status_code=16 #38

Open hmeerlo opened 1 year ago

hmeerlo commented 1 year ago

Heads-up: this is a cross post from theraspberrypi/firmware repo because I realized it belonged here.

The internet is filled with this issue. We have been trying to find and fix this problem for weeks now. But there is no solution yet. We have a consumer product using the cm4. It has the cypress/infineon 43455 chip and the problem we're facing appears to come from the firmware on this chip. Some customers have this problem, others do not experience it (so it seems to depend on the wifi environment as well). Every now and then the system log starts spewing these messages:

2023-09-24 17:40:33.1628 wpa_supplicant[299]: nl80211: Ignored event 46 (NL80211_CMD_CONNECT) for foreign interface (ifindex 3 wdev 0x0)
2023-09-24 17:40:33.1631 wpa_supplicant[299]: nl80211: Drv Event 46 (NL80211_CMD_CONNECT) received for wlan0
2023-09-24 17:40:33.1634 wpa_supplicant[299]: nl80211: Connect event (status=16 ignore_next_local_disconnect=0)
2023-09-24 17:40:33.1637 wpa_supplicant[299]: wlan0: Event ASSOC_REJECT (12) received
2023-09-24 17:40:33.1640 wpa_supplicant[299]: wlan0: CTRL-EVENT-ASSOC-REJECT bssid=00:00:00:00:00:00 status_code=16
2023-09-24 17:40:33.1643 wpa_supplicant[299]: wlan0: Radio work 'connect'@0x559d6244f0 done in 0.901671 seconds

It then fails to connect to the AP. We would like to know what this status=16 means. Maybe when we know this we can try to work around it. It is not easily reproducable in a lab environment unfortunately.

I attached a full log of the problem where it sometimes connects but also gets this error quite often. 1ED95DB6-github.txt

System

hmeerlo commented 1 year ago

Is there anybody who can comment on this issue? The wifi experience on the cm4 is really really bad. We have so many customers complaining with all kinds of issues. So the above is the not the only problem.

pelwell commented 1 year ago

The problem with many of these reported WiFi problems is that they lack sufficient information to reproduce them - as you say, it depends on the environment, and clearly our environments don't tickle the problem.

One thing that might be worth a try is disabling the firmware's roaming capability. Does adding brcmfmac.roamoff=1 to cmdline.txt make a difference?

hmeerlo commented 1 year ago

Oh I have already done that. Doesn't make any difference. We have thousands of customers, so they have many diverse wifi environments. I personally also don't have the problem, so it does seem to be environment related. But one of my colleagues has the problem on a daily basis. It manifests in a different way, but it mostly comes down to 4way handshake problems. And I think they almost always originate in the wifi driver. We are already testing the newly released infineon driver but that doesn't improve the situation. This is one of the other failures:

2023-10-13 13:27:06.1723 wpa_supplicant[301]: RTM_NEWLINK: ifi_index=3 ifname=wlan0 wext ifi_family=0 ifi_flags=0x11043 ([UP][RUNNING][LOWER_UP])
2023-10-13 13:27:06.1726 wpa_supplicant[301]: nl80211: Event message available
2023-10-13 13:27:06.1728 wpa_supplicant[301]: nl80211: Ignored event 48 (NL80211_CMD_DISCONNECT) for foreign interface (ifindex 3 wdev 0x0)
2023-10-13 13:27:06.1729 wpa_supplicant[301]: nl80211: Drv Event 48 (NL80211_CMD_DISCONNECT) received for wlan0
2023-10-13 13:27:06.1731 wpa_supplicant[301]: nl80211: Disconnect event
2023-10-13 13:27:06.1732 wpa_supplicant[301]: wlan0: Event DEAUTH (11) received
2023-10-13 13:27:06.1734 wpa_supplicant[301]: wlan0: Deauthentication notification
2023-10-13 13:27:06.1735 wpa_supplicant[301]: wlan0:  * reason 2 (PREV_AUTH_NOT_VALID)
2023-10-13 13:27:06.1736 wpa_supplicant[301]: Deauthentication frame IE(s) - hexdump(len=0): [NULL]
2023-10-13 13:27:06.1738 wpa_supplicant[301]: wlan0: CTRL-EVENT-DISCONNECTED bssid=48:d3:43:92:c9:5f reason=2
2023-10-13 13:27:06.1739 wpa_supplicant[301]: wlan0: Auto connect enabled: try to reconnect (wps=0/0 wpa_state=9)
2023-10-13 13:27:06.1740 wpa_supplicant[301]: wlan0: Setting scan request: 0.100000 sec
2023-10-13 13:27:06.1742 wpa_supplicant[301]: Added BSSID 48:d3:43:92:c9:5f into blacklist

This apparently means that the client (cm4) did not respond to one of the first messages of the handshake.

hmeerlo commented 1 year ago

This sums the situation up pretty well: https://rachelbythebay.com/w/2023/11/06/wpa3/ How can you start selling a brand new product with a driver that is so extremely outdated and bad? Raspberry Pi really needs to make a proper plan for this and get Infineon on board in updating this driver to the latest standards.

pimlie commented 1 year ago

The problem with many of these reported WiFi problems is that they lack sufficient information to reproduce them - as you say, it depends on the environment, and clearly our environments don't tickle the problem.

@pelwell Using a RPi 0W2 with the latest 64bit image (tried both desktop & lite) I can consistently reproduce this error trying to connect to 3 different Mikrotik devices running two different software versions (routeros latest v7 using wifiwave2 & latest v6). The 0W2 connects fine to a hotspot on my phone, but I just cannot connect to any of those 3 Mikrotik ap's. The only thing the mikrotik devices report is that unicast key exchange failed. Does that help to try to narrow it down? Let me know if you need more info.


After much more searching I came accross this SO, and indeed turning off fast roaming & disabling SAE for the brcmfmac module made the 0W2 finally connect succesfully

modprobe brcmfmac roamoff=1 feature_disable=0x82000
parsnatura commented 8 months ago

@pimlie Thx, it helped with my RPi 0W2, too. Btw also using Mikrotik RouterOS v7.

luftdieb commented 7 months ago

Hi together, I had same error code 18 with my Dell Latitude 6520 Laptop (Linux Mint 21.3 Virginia base: Ubuntu 22.04 jammy / Kernel: 5.15.0-101-generic x86_64 bits) and Mikrotik router RB4011iGS / Firmware v7.14.1 configured with CAPsMAN. In Channels configuration I had previously selcted Band "5Ghz-onlyac". But after reading this post, I''ve got the idea changing to "5Ghz-n/ac" and now, Dell Laptop is connecting fine and no error 18 is reported anymore.

Beside this, I have this additional settings, which I'm not sure, if it makes sense or not... but it works fine on my end now. Freqzency: 5180, 5260,5500,5580,5660,5745 Control Chanel With: 20 Mhz Band: "5Ghz-n/ac"

Maybe this helps others :-)

Zanz2 commented 7 months ago

The problem with many of these reported WiFi problems is that they lack sufficient information to reproduce them - as you say, it depends on the environment, and clearly our environments don't tickle the problem.

@pelwell Using a RPi 0W2 with the latest 64bit image (tried both desktop & lite) I can consistently reproduce this error trying to connect to 3 different Mikrotik devices running two different software versions (routeros latest v7 using wifiwave2 & latest v6). The 0W2 connects fine to a hotspot on my phone, but I just cannot connect to any of those 3 Mikrotik ap's. The only thing the mikrotik devices report is that unicast key exchange failed. Does that help to try to narrow it down? Let me know if you need more info.

  • Using the same sd-card in a RPi 4 connects to the wifi without issue.
  • With the RPi 4 I could run apt update && apt upgrade, this installed some newer firmware. After rebooting, wifi would still connect using the RPi 4, but
  • after inserting the sd-card (ie with latest updates) back into the 0W2 the wifi still fails to connect.
  • Unfortunately I only have one 0W2 to test with

After much more searching I came accross this SO, and indeed turning off fast roaming & disabling SAE for the brcmfmac module made the 0W2 finally connect succesfully

modprobe brcmfmac roamoff=1 feature_disable=0x82000

Same pi, same hardware, the mikrotik worked on both my pi3's also my pi5, but not my zeros, your fix worked, i sunk 2 whole days into this. If i could i'd buy you a coffee

txoof commented 3 months ago

I have a similar issue, but it seems to only impact raspberry pi 3 units. Both units have static IPs, both I've checked the router and the DHCP lease is set for one day.

I have two Pi3 and two Pi4 running on the local network and the problem always occurs at 38 minutes past the hour and only impacts the Pi3 units, and always with reason=16.

Restarting the NetworkManager service temporarily resolves this issue. Wired connections are not impacted.

The Pi4 and Pi3 are configured using the same minimal setup. All four devices are on the same release:

PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Here's a sample of the wpa_supplicant errors occurring at 38 minutes past the hour:

ug 03 15:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 03 15:38:56 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=6
Aug 03 17:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 03 17:38:06 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=15
Aug 03 18:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 03 19:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 03 20:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 03 21:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 03 22:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 03 23:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 04 00:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 04 02:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 04 03:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 04 04:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 04 04:38:07 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=15
Aug 04 05:38:01 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 04 05:38:06 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=15
Aug 04 07:38:02 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 04 08:38:02 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 04 08:38:06 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=15
Aug 04 09:38:02 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 04 10:38:02 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 04 12:38:02 office-paperpi wpa_supplicant[563]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 05 09:38:02 office-paperpi wpa_supplicant[388859]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 05 10:38:02 office-paperpi wpa_supplicant[388859]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 05 11:38:02 office-paperpi wpa_supplicant[388859]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16
Aug 05 11:38:07 office-paperpi wpa_supplicant[388859]: wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=15

Here's the full error output:

2024-08-05T12:38:02.334 2633844492 wpa_supplicant.service wpa_supplicant[571] STATS wlan0: CTRL-EVENT-DISCONNECTED bssid=e0:63:da:18:43:db reason=16                                                                                │
│2024-08-05T12:38:02.343 2633853589 wpa_supplicant.service wpa_supplicant[571] STATS wlan0: CTRL-EVENT-REGDOM-CHANGE init=CORE type=WORLD                                                                                            │
│2024-08-05T12:38:02.345 2633855621 wpa_supplicant.service wpa_supplicant[571] STATS wlan0: CTRL-EVENT-REGDOM-CHANGE init=USER type=COUNTRY alpha2=NL                                                                                │
│2024-08-05T12:38:02.350 2633860104 NetworkManager.service NetworkManager[568] INFO <info>  [1722854282.3494] device (wlan0): supplicant interface state: completed -> disconnected                                                  │
│2024-08-05T12:38:02.351 2633862032 NetworkManager.service NetworkManager[568] INFO <info>  [1722854282.3495] device (p2p-dev-wlan0): supplicant management interface state: completed -> disconnected                               │
│2024-08-05T12:38:02.442 2633952728 NetworkManager.service NetworkManager[568] INFO <info>  [1722854282.4424] device (wlan0): supplicant interface state: disconnected -> scanning                                                   │
│2024-08-05T12:38:02.443 2633953352 NetworkManager.service NetworkManager[568] INFO <info>  [1722854282.4426] device (p2p-dev-wlan0): supplicant management interface state: disconnected -> scanning                                │
│2024-08-05T12:38:03.027 2634537272 wpa_supplicant.service wpa_supplicant[571] STATS wlan0: Trying to associate with SSID 'Shackleton N'                                                                                             │
│2024-08-05T12:38:03.042 2634552174 NetworkManager.service NetworkManager[568] INFO <info>  [1722854283.0416] device (wlan0): supplicant interface state: scanning -> associating                                                    │
│2024-08-05T12:38:03.042 2634552950 NetworkManager.service NetworkManager[568] INFO <info>  [1722854283.0419] device (p2p-dev-wlan0): supplicant management interface state: scanning -> associating                                 │
│2024-08-05T12:38:03.156 2634666316 wpa_supplicant.service wpa_supplicant[571] STATS wlan0: Associated with e0:63:da:18:43:db                                                                                                        │
│2024-08-05T12:38:03.156 2634666944 wpa_supplicant.service wpa_supplicant[571] STATS wlan0: CTRL-EVENT-CONNECTED - Connection to e0:63:da:18:43:db completed [id=0 id_str=]                                                          │
│2024-08-05T12:38:03.157 2634667537 wpa_supplicant.service wpa_supplicant[571] STATS wlan0: CTRL-EVENT-SUBNET-STATUS-UPDATE status=0                                                                                                 │
│2024-08-05T12:38:03.158 2634668476 wpa_supplicant.service wpa_supplicant[571] STATS wlan0: CTRL-EVENT-REGDOM-CHANGE init=COUNTRY_IE type=COUNTRY alpha2=NL                                                                          │
│2024-08-05T12:38:03.163 2634673231 NetworkManager.service NetworkManager[568] INFO <info>  [1722854283.1630] device (wlan0): supplicant interface state: associating -> completed                                                   │
│2024-08-05T12:38:03.163 2634673794 NetworkManager.service NetworkManager[568] INFO <info>  [1722854283.1631] device (p2p-dev-wlan0): supplicant management interface state: associating -> completed                                │
│2024-08-05T12:38:03.164 2634674129 NetworkManager.service NetworkManager[568] INFO <info>  [1722854283.1632] device (wlan0): ip:dhcp4: restarting                                                                                   │
│2024-08-05T12:38:03.182 2634692317 NetworkManager.service NetworkManager[568] INFO <info>  [1722854283.1820] dhcp4 (wlan0): canceled DHCP transaction                                                                               │
│2024-08-05T12:38:03.182 2634692906 NetworkManager.service NetworkManager[568] INFO <info>  [1722854283.1822] dhcp4 (wlan0): activation: beginning transaction (timeout in 45 seconds)                                               │
│2024-08-05T12:38:03.183 2634693265 NetworkManager.service NetworkManager[568] INFO <info>  [1722854283.1823] dhcp4 (wlan0): state changed no lease                                                                                  │
│2024-08-05T12:38:03.183 2634693570 NetworkManager.service NetworkManager[568] INFO <info>  [1722854283.1826] dhcp4 (wlan0): activation: beginning transaction (timeout in 45 seconds)                                               │
│2024-08-05T12:38:03.400 2634911011 NetworkManager.service NetworkManager[568] INFO <info>  [1722854283.4004] dhcp4 (wlan0): state changed new lease, address=192.168.1.127
radu-v commented 2 months ago

One thing that might be worth a try is disabling the firmware's roaming capability. Does adding brcmfmac.roamoff=1 to cmdline.txt make a difference?

this worked for me, on Raspberry Pi 5, thanks @pelwell!

irabbit79 commented 4 days ago

Had the same problem when connecting USB3 HDD , https://forums.raspberrypi.com/viewtopic.php?t=328248

txoof commented 22 hours ago

I eventually found that the problem stemmed from the Group Rekey Interval on the APs. The reason the the problem reappeared every hour was related to the rekey interval being 3600 seconds. I changed the rekey interval to 1800 seconds and the problem disappeared. After a few weeks of stability, I moved the rekey interval back to 3600 and the problem has not returned.

I honestly have no idea why this would have anything to do with anything, but it solved my problem.

I have a similar issue, but it seems to only impact raspberry pi 3 units. Both units have static IPs, both I've checked the router and the DHCP lease is set for one day.

I have two Pi3 and two Pi4 running on the local network and the problem always occurs at 38 minutes past the hour and only impacts the Pi3 units, and always with reason=16.

Restarting the NetworkManager service temporarily resolves this issue. Wired connections are not impacted.

The Pi4 and Pi3 are configured using the same minimal setup. All four devices are on the same release:

pelwell commented 17 hours ago

Thanks - that's a useful observation.