Open ghost opened 6 years ago
Same problem here.
If I run sudo BRANCH=next rpi-update, I get a working setup. Afterwards, uname -a reports: Linux raspberrypi 4.14.17-v7+ #1090 SMP Mon Feb 5 21:02:18 GMT 2018 armv7l GNU/Linux
So I guess it's kernel related.
Interesting. So not in the current rpi-update, but in the next branch? Should make the particular change a bit easier to find.
any news?
Not had a chance to look yet. How do you provoke the problem? We've not had many reports on B+ Wifi failing in this way, so apparently it's unusual. Have you updated to the very latest 4.14.xx kernel? Does that make any difference?
Once I have more details on how to replicate the issues, I can send the data to Cypress for investigation. The mailbox error, IIRC, is a firmware crash, so it's not something we can really deal with here, since we do not have access to the firmware.
It may be slightly off topic because it's a different distribution, but it's pretty easy to trigger this error when using Kali Pi.
Putting the device into monitor mode (using mon0up in Kali-Pi) and running aireplay-ng --test causes it to emit the Unknown mailbox data content: 0x40012 error almost immediately. From then on, the wifi is worthless until you reboot.
For reference, mon0up is a short shell script that runs iw phy phy0 interface add mon0 type monitor and ifconfig mon0 up and displays some brief info.
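A minimal sketch of what such a mon0up helper might look like, built only from the two commands quoted above. The run wrapper and the DRY_RUN variable are hypothetical additions for safe testing, and the "brief info" display of the real script is left out because its content isn't known here.

```shell
# Hypothetical reconstruction of a mon0up-style helper (not the Kali-Pi original).
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

mon0up() {
    phy="${1:-phy0}"   # wireless PHY, phy0 as in the comment above
    mon="${2:-mon0}"   # name of the monitor interface to create
    run iw phy "$phy" interface add "$mon" type monitor &&
    run ifconfig "$mon" up
}
```

Running DRY_RUN=1 with mon0up first shows the two commands without touching the interface.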
On a project I am working on, I am getting this constantly with Kali Pi when in monitor mode. Unloading and reloading the driver works sometimes and sometimes not. Sometimes it happens after 5 seconds, sometimes after 5 minutes; it's very random. I verified it's not a hardware problem, since I have 2 Pis and both do it. I also loaded the latest Raspbian and installed the latest Nexmon drivers, and I get the exact same thing. The kernel is 4.14.30-Re4son-v7+ on Kali. I don't have the other one offhand.
Do you get any errors listed in dmesg?
Yes I do.
The following is from dmesg while running the command below. The Set Channel failed messages don’t seem to be real errors, since I am scanning a range of frequencies.
/usr/sbin/airodump-ng -C 2412-5825 --write-interval 10 --write test --output-format netxml wlan0mon
Working until this point……..
[78833.305809] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[78835.790416] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[78835.797410] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[78835.803824] brcmfmac: brcmf_cfg80211_nexmon_set_channel: Set Channel failed: chspec=53409, -110
[78838.590434] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[78838.597330] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[78838.603733] brcmfmac: brcmf_cfg80211_nexmon_set_channel: Set Channel failed: chspec=53413, -110
[78841.390457] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[78841.397407] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[78841.403867] brcmfmac: brcmf_cfg80211_nexmon_set_channel: Set Channel failed: chspec=4098, -110
[78844.190483] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[78844.197466] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[78844.204088] brcmfmac: brcmf_cfg80211_nexmon_set_channel: Set Channel failed: chspec=4102, -110
[78846.990502] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[78846.997678] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[78847.004601] brcmfmac: brcmf_cfg80211_nexmon_set_channel: Set Channel failed: chspec=4106, -110
[78849.790519] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[78849.797885] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[78849.804936] brcmfmac: brcmf_cfg80211_nexmon_set_channel: Set Channel failed: chspec=4110, -110
[78852.590543] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[78852.597968] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[78852.605103] brcmfmac: brcmf_cfg80211_nexmon_set_channel: Set Channel failed: chspec=53284, -110
[78855.390563] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[78855.398051] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[78855.405211] brcmfmac: brcmf_cfg80211_nexmon_set_channel: Set Channel failed: chspec=53288, -110
[78858.190583] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[78858.198319] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[78858.205591] brcmfmac: brcmf_cfg80211_nexmon_set_channel: Set Channel failed: chspec=53292, -110
[78860.990609] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[78861.001145] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
…….
After shutting down airodump, I detected that it wasn’t capturing anything and attempted to unload at 79803.439351 and reload the driver at 79923.711128 and that failed. I have a 120sec timer between unload and reload of driver
[79782.517551] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[79782.529849] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[79782.541559] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
[79785.157565] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[79785.169936] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[79785.181530] brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[79787.717590] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[79787.729703] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[79792.837633] brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -110
[79797.957685] brcmfmac: brcmf_cfg80211_del_ap_iface: interface_remove failed -110
[79800.517694] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[79803.078132] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[79803.089706] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-5)
[79803.187727] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[79803.199528] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[79803.210553] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-5)
[79803.439351] usbcore: deregistering interface driver brcmfmac
[79923.696268] brcmfmac: F1 signature read @0x18000000=0x15264345
[79923.700624] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43455-sdio.bin for chip 0x004345(17221) rev 0x000006
[79923.711128] usbcore: registered new interface driver brcmfmac
[79926.358647] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[79926.389146] brcmfmac: brcmf_c_get_clm_name: retrieving revision info failed (-110)
[79926.400840] brcmfmac: brcmf_c_process_clm_blob: get CLM blob file name failed (-110)
[79926.412661] brcmfmac: brcmf_c_preinit_dcmds: download CLM blob file failed, -110
[79926.423176] brcmfmac: brcmf_bus_started: failed: -110
[79926.431387] brcmfmac: brcmf_sdio_firmware_callback: dongle is not responding
… Tried unload and reload a little while later and it worked
[80685.169424] usbcore: deregistering interface driver brcmfmac
[80805.455360] brcmfmac: F1 signature read @0x18000000=0x15264345
[80805.460216] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43455-sdio.bin for chip 0x004345(17221) rev 0x000006
[80805.471390] usbcore: registered new interface driver brcmfmac
[80805.805728] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Apr 10 2018 18:33:56 version 7.45.154 (nexmon.org: 2.2.2-178-gd64f-1) FWID 01-4fbe0b04
[80805.814840] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 9.10.105 Compiler: 1.29.4 ClmImport: 1.36.3 Creation: 2018-03-09 18:56:28
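For anyone reading these logs: the negative numbers are Linux errno values, so -110 is a timeout (ETIMEDOUT) and -5 is an I/O error (EIO). A small sketch for decoding them follows; the function names are made up for illustration, and only the codes that show up in this thread are mapped.

```shell
# Map the error codes seen in the brcmfmac log lines to Linux errno names.
# Accepts the value with or without the leading minus sign.
errno_name() {
    case "${1#-}" in
        5)   echo "EIO" ;;          # I/O error (bus is down)
        11)  echo "EAGAIN" ;;       # try again
        95)  echo "EOPNOTSUPP" ;;   # operation not supported
        110) echo "ETIMEDOUT" ;;    # firmware did not answer in time
        *)   echo "unknown" ;;
    esac
}

# Read kernel log lines on stdin and print the trailing error code of each
# line that ends in something like ", -110", "(-110)" or "failed -110".
extract_errno() {
    sed -n 's/.*[ (]-\([0-9][0-9]*\))\{0,1\}[[:space:]]*$/\1/p'
}
```

For example, dmesg | grep brcmfmac | extract_errno | sort | uniq -c gives a rough count per error code.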
Any news about this? I'm having this same issue. From dmesg:
[ 4.584121] brcmfmac: brcmf_fw_map_chip_to_name: using brcm/brcmfmac43455-sdio.bin for chip 0x004345(17221) rev 0x000006
[ 4.868470] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Feb 27 2018 03:15:32 version 7.45.154 (r684107 CY) FWID 01-4fbe0b04
[ 4.869048] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 9.10.105 Compiler: 1.29.4 ClmImport: 1.36.3 Creation: 2018-03-09 18:56:28
I wish I never executed the rpi-update; I finally got the Raspberry Pi 3 B+ working as an AP + Managed Wifi at 5GHz, but now the AP doesn't work anymore after the update. How can I downgrade the brcmfmac to a stable/working version?
After "Unknown mailbox data content: 0x40012" is received, sometimes the communication can be recovered with the following commands:
modprobe -r brcmfmac
modprobe brcmfmac
Sometimes it doesn't recover even after modprobe -r (device is stuck). In that case, the following heavy-handed commands will fix the communication with the wifi device:
echo -n "3f300000.mmc" > /sys/devices/platform/soc/3f300000.mmc/driver/unbind
sleep 1
echo -n "3f300000.mmc" > /sys/bus/platform/drivers/mmc-bcm2835/bind
technical comment: rebinding the mmc driver will call probe(), which will call mmc:bcm2835_reset_internal(), which will power-cycle the SDIO device (SDVDD_POWER_OFF), which will properly reset & re-detect the wedged WIFI SDIO device. Phew!
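The recovery steps above could be wrapped up roughly like this. It is only a sketch: the function names, the DRY_RUN switch, and the crash-detection strings (taken from the dmesg output earlier in this thread) are assumptions, and the real commands need root on the Pi.

```shell
# Hypothetical wrapper around the recovery commands quoted above.
# Set DRY_RUN=1 to print each step instead of executing it.
MMC_DEV="3f300000.mmc"   # SDIO host device name, as given in the comment above

# Reads kernel log text on stdin; succeeds if a firmware crash signature is present.
firmware_crashed() {
    grep -q -e 'Unknown mailbox data content' -e 'firmware trap in dongle'
}

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# write_sysfs VALUE PATH: equivalent of echo -n VALUE > PATH
write_sysfs() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "echo -n $1 > $2"
    else
        printf '%s' "$1" > "$2"
    fi
}

# Soft recovery: unload and reload the wifi driver.
reload_driver() {
    run modprobe -r brcmfmac && run modprobe brcmfmac
}

# Heavy-handed recovery: rebind the mmc host so the SDIO device is power-cycled.
rebind_mmc() {
    write_sysfs "$MMC_DEV" "/sys/devices/platform/soc/$MMC_DEV/driver/unbind" &&
    run sleep 1 &&
    write_sysfs "$MMC_DEV" "/sys/bus/platform/drivers/mmc-bcm2835/bind"
}
```

A possible use is dmesg | firmware_crashed && reload_driver, falling back to rebind_mmc if the interface still does not come back. How to decide that the soft reload failed is left open here.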
I’ll try it Monday and let you know. Thank you very much.
@Nemesis7 You can revert to any previous firmware+kernel package using rpi-update by putting the hash (string of hexadecimal digits) on the command line. The hashes can be found on the right hand side of the list of commits (releases): https://github.com/Hexxeh/rpi-firmware/commits/master
Alternatively you could return to the standard Raspbian kernel using:
sudo apt-get install --reinstall raspberrypi-bootloader raspberrypi-kernel
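As a rough illustration of the rpi-update route: the hash from the commit list goes straight on the command line. The helper names below are hypothetical, and insisting on a full 40-character hash is just a conservative choice made for this sketch, not a documented requirement of rpi-update.

```shell
# Succeeds only if the argument is exactly 40 hexadecimal characters.
is_full_git_hash() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{40}$'
}

# revert_firmware HASH: hand a commit hash from the rpi-firmware list to rpi-update.
revert_firmware() {
    if is_full_git_hash "$1"; then
        sudo rpi-update "$1"
    else
        echo "not a full 40-character commit hash: $1" >&2
        return 1
    fi
}
```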
The heavy-handed method seems to get the driver working again, but is there a chance of this being fixed so it's not necessary? Is this a hardware or firmware issue? Thank you
@cdouglas97 We currently do not know the cause of this. The mailbox error is the result of the firmware on the wireless chip dying, but the cause of that is unclear. We do NOT have access to the firmware source; it is provided as a binary by Cypress, so we rely on them to fix firmware issues.
I'm not looking at it at the moment, I have some other stuff to clear first. However, if anyone has a clear set of steps to replicate the problem on Raspbian, that would be very useful once I do start to look at it.
Thank you. I can reproduce it very easily. Steps:
wlan0 is not up, only active interface is eth0
/usr/sbin/airmon-ng start wlan0
/usr/sbin/airodump-ng -C 2412-5825 --write-interval 10 --write OUTPUTME --output-format netxml wlan0mon
It usually happens within 30 seconds to 5 minutes. I run the airodump for 60 seconds at a time with a 15 min gap between runs and then kill it so sometimes it completes its run and sometimes not. I added the heavy-handed commands before I start and it didn't seem to make a difference on how long it took for it to die. It just fixed the firmware if it died during a previous run.
-Chris
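The reproduction steps above could be scripted along these lines. This is a sketch under assumptions: it presumes airmon-ng start wlan0 has already been run (so wlan0mon exists, which needs the Nexmon-patched firmware), that dmesg is readable, and that airodump-ng tolerates running without a terminal. The function names are invented for illustration.

```shell
# Count "firmware trap" lines in kernel log text read from stdin.
# grep -c prints 0 but exits non-zero when nothing matches, hence the || true.
count_traps() {
    grep -c 'firmware trap in dongle' || true
}

# Run one 60-second airodump-ng scan and report whether new traps were logged.
repro_once() {
    before=$(dmesg | count_traps)
    timeout 60 /usr/sbin/airodump-ng -C 2412-5825 --write-interval 10 \
        --write OUTPUTME --output-format netxml wlan0mon >/dev/null 2>&1
    after=$(dmesg | count_traps)
    [ "$after" -gt "$before" ]   # succeeds only if the trap count went up
}
```

Looping over repro_once and stopping on the first success gives a rough time-to-failure figure. The comparison is unreliable if the kernel ring buffer wraps between the two dmesg calls.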
@cdouglas97 I am not familiar with airmon/airodump. What package do I need to install in Raspbian to get those?
sudo apt-get install aircrack-ng
With wlan0 available but not associated with an AP I get:
pi@raspberrypi:~ $ sudo airmon-ng start wlan0
Found 4 processes that could cause trouble.
If airodump-ng, aireplay-ng or airtun-ng stops working after
a short period of time, you may want to run 'airmon-ng check kill'
PID Name
319 avahi-daemon
353 dhcpcd
364 avahi-daemon
400 wpa_supplicant
PHY Interface Driver Chipset
phy0 wlan0 brcmfmac Broadcom 43430
ERROR adding monitor mode interface: command failed: Operation not supported (-95)
Any suggestions?
pelwell, the firmware in the default latest Raspbian install doesn't allow monitor mode; you have to use this: https://github.com/seemoo-lab/nexmon to make it work. Kali uses this, and I took a Raspbian image and built the patch as directed; both get the same result.
Er, so the way to make this go wrong is to install some random third party stuff that plays around with the firmware in a way that probably wasn't intended by the developers? Why am I not surprised this might go wrong?
Unless this goes wrong with our standard firmware, I'm not sure we should spend any more time on this.
I don't blame you. I thought it was understood from the first post, which said he was putting the interface into promiscuous mode, that this is what we were doing. Unless we are supposed to be able to put the default firmware into monitor mode, making Nexmon unnecessary.
Thanks for your time and the command to force the wifi reload.
-Chris
I don't know why you would think it should have been understood - promiscuous mode is not the same as monitor mode.
My mistake, I apologize.
The OP might not be doing the same thing as I am with the nexmon fw since he is just trying promiscuous mode so it still might be something you can investigate for him.
@pelwell @JamesH65 please don't give up on all of us who use the regular firmware... It is a real problem.
In the following conditions, the firmware crashes several times per day:
Raspberry PI running in an environment with a lot of diverse wifi packets, such as a busy train station or a busy mall. Not a "regular" office environment or a shielded RF room...
Raspberry PI running as hotspot (thus continuously listening to incoming packets), preferably on a busy 2.4GHz channel (1,6,11), not on a clean 5GHz channel. There is no need for stations to be connected to the hotspot.
Note: the raspberry PI wifi crashes randomly in any environment and also when running just as station (not AP). I'm just describing the conditions that increase the probability of crashing.
Nobody's giving up. We have a number of difficult problems ongoing, and the Ethernet stalls are currently getting most of the attention, but now we have what should be a low-impact workaround for that the spotlight will turn to this issue (which looks suspiciously like an old Pi3B problem).
Another tip: when the firmware crashes, you can collect internal firmware stack traces here (you may need to compile the driver with DEBUG enabled):
cat /sys/kernel/debug/brcmfmac/mmc1:0001:1/forensics
In a busy train station/mall/office, when there are a lot of people, you can sometimes catch several crashes per hour.
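To make collecting these dumps less fiddly, something like the following could be run right after a crash. It is a sketch only: the function name and the timestamped file naming are assumptions, and the debugfs path is the one quoted above (it needs root and may differ on other setups).

```shell
FORENSICS="/sys/kernel/debug/brcmfmac/mmc1:0001:1/forensics"

# save_forensics [SOURCE] [DEST_DIR]
# Copies the forensics dump to a timestamped file and prints the file name.
save_forensics() {
    src="${1:-$FORENSICS}"
    dest_dir="${2:-.}"
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi
    out="$dest_dir/forensics-$(date +%Y%m%d-%H%M%S).txt"
    cat "$src" > "$out" && echo "$out"
}
```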
In an ideal world, broadcom/cypress would release the firmware source code so the community would be able to fix it, just like qualcomm released their internal wifi firmware. There are no big wifi secrets, all wifi firmware is pretty trivial:
@gdb-power - Your comment gives me an idea. If anybody has a second device with a monitor enabled WiFi adapter, you may be able to capture the series of frames that causes the firmware to lock up. The experiment would be something like:
Use airodump-ng to capture all frames on the Raspberry Pi's channel.
Replay the captured frames with aireplay-ng until the triggering frames are isolated.
Of course, this assumes the hang is caused by incoming frames. If it's caused by invalid outgoing frames... I'm not sure how to capture that.
Fuzzing the cypress firmware for crashes is pretty trivial, several groups have done it in the past (including Google project zero). In order to fix the firmware, Broadcom/cypress should be doing the fuzzing, since they're the only one with access to the source code.
@gdb-power If you have any forensics dumps then we'd like to see them.
@llamasoft That sounds like a plan. The issue is likely to be triggered by packet reception - transmission is massively simpler because the driver gets to determine the timing and packet content.
I know I am out of this since I am using a patched driver to support monitoring, but this might help. All I am doing is running airodump-ng to scan for access points and it crashes. There are 42 in my vicinity that I usually detect.
As with all of these ethernet issues, replication is the crux. We need simple (ish) ways of making the problem happen, preferably in a situation where we can have debuggers and diagnostics tools attached (which makes a trip out of the office a real issue). The recent 3B+ issue suddenly got easier to solve when we had a user who could replicate at will and was able to help, and when I finally managed to be able to cause it on demand. So if ANYONE has a guaranteed way of causing this mailbox error that would be very useful.
On the 3B this mailbox error has been around since launch, but I thought it had been fixed with the most recent firmware upgrade. Certainly until this thread it hadn't been recently reported.
Yes, it's possible that the 43438 fix will also apply to the 43455.
I’ll help but I don’t know if I am a valid test since I am using patched firmware. I replicate it every 15 minutes like clockwork.
On the 3B this mailbox error has been around since launch, but I thought it had been fixed with the most recent firmware upgrade. Certainly until this thread it hadn't been recently reported. Cypress actually closed the case, but it looks like it will need to be reopened, once we can replicate it reliably.
@JamesH65 I am totally new to this but I hope my input can help you solve this issue (even without fixing the firmware). I am encountering this problem when I try to setup the integrated WiFi as a (managed) client and as an AP (incl. ip-forwarding) on a Raspberry Pi 3 B+. The first time around I followed this guide, which seemed to work: https://blog.thewalr.us/2017/09/26/raspberry-pi-zero-w-simultaneous-ap-and-managed-mode-wifi/, eventually it failed, I think it's because I didn't disable dhcpcd which didn't play nice with /etc/network/interfaces, but I want a better solution without the deprecated /etc/network/interfaces and without cronjobs.
Now I've done what is described here: https://www.raspberrypi.org/forums/viewtopic.php?f=36&t=138730&start=125#p1321390. This works pretty neat until the AP is activated, then I lose the WiFi connection and I get the mailbox error preceded by:
brcmfmac brcmf_link_down wlc_disassoc failed (-11)
This fails every time, no exception. Is there a firmware/kernel version I can try to see if it works there?
@pelwell @JamesH65 I sent you forensics dump by email.
Yes, thank you. We are sending them direct to Cypress who may then have enough information to fix the crashes.
Also, it might be a good idea to enable brcmfmac DEBUG flag in the raspberry pi default kernel, so that anyone can send forensic dumps.
The Cypress case is live, and they have assigned 'collaborators' to it, so hopefully we will hear something soon.
Guys, for me the crash stopped happening when I changed the order in which I bring the interfaces up: wlan0 must be disabled first, then ap0 enabled, then wlan0 enabled. In any other sequence, it will crash.
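A sketch of that ordering using ip link. It assumes the ap0 virtual interface has already been created (the comment doesn't say how), and the function name and DRY_RUN switch are hypothetical.

```shell
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Bring the interfaces up in the order reported to avoid the crash:
# station down, AP up, station up.
bring_up_ap_then_sta() {
    sta="${1:-wlan0}"
    ap="${2:-ap0}"
    run ip link set "$sta" down &&
    run ip link set "$ap" up &&
    run ip link set "$sta" up
}
```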
@Nemesis7. Yes, but when wlan0 associates with an AP (e.g. using wpa_supplicant), traffic over ap0 (e.g. using hostapd) becomes slow/unreliable after some time (clients stay connected though). This does not happen as long as wlan0 is not used.
I have the strong impression that separation between wlan0 and the virtual interface ap0 is really messed up, since wpa_supplicant spits out messages like
ignored event (cmd=19) for foreign interface (ifindex 6 wdev 0x0)
and /sys/class/net/ap0/ifindex shows that it has ifindex 6, so what are these events doing on the wlan0 interface?
As soon as the station at wlan0 disconnects, things 'seem' to be running fine again on ap0.
Edit: When the mac address of wlan0 is changed, ap0 stops working. Tcpdump shows incoming data on ap0, but it looks like encrypted data. When the MAC of wlan0 is changed to original again, ap0 starts working and tcpdump shows nicely formatted ping packets. Imho it really points to a serious problem with the brcmfmac43430-sdio firmware.
Edit2: Even when ap0 is producing malformed(?) packets, and 'ap_isolate=1' is in hostapd.conf, it seems that clients associated to ap0 can still ping each other, but not the ap0 interface or any other ip-address beyond.
So, in utter confusion, I changed the iptables policy to 'iptables -P INPUT DROP', so I can't even ping the localhost from within a terminal on the raspberry, and strangely enough: clients on ap0 can still happily connect to each other.
So: is it an accepted security policy for this driver --which I suppose is not only used on rpi-- to ignore any rules (L2 or L3) imposed by the linux kernel?
OK, we have some beta software from Cypress that may or may not help with this. The files at the following two links need to be copied to the /lib/firmware/brcm folder on the Pi. Note that some error messages do appear when the driver first starts up, but they don't seem to affect usage. Even so, I recommend backing up the two original files first. I am currently talking with Cypress about those error messages.
https://drive.google.com/file/d/1bqugahKmfz1uQe8u5VHijAUnuZTxGkvG/view?usp=sharing https://drive.google.com/file/d/1mbfEOMShLrul-qprmlcPSuERdCTdh4e7/view?usp=sharing
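One way to do the recommended backup before copying the new files in is sketched below. It assumes the 3B+ file names (brcmfmac43455-sdio.*) mentioned further down the thread; the function name and the backup directory naming are made up for illustration.

```shell
# backup_firmware [FW_DIR] [BACKUP_DIR]
# Copies the existing brcmfmac43455-sdio.* files aside and prints the backup directory.
backup_firmware() {
    fw_dir="${1:-/lib/firmware/brcm}"
    backup_dir="${2:-$fw_dir/backup-$(date +%Y%m%d-%H%M%S)}"
    mkdir -p "$backup_dir" || return 1
    for f in "$fw_dir"/brcmfmac43455-sdio.*; do
        [ -f "$f" ] || continue          # skip if the glob matched nothing
        cp -p "$f" "$backup_dir/" || return 1
    done
    echo "$backup_dir"
}
```

Restoring is then just a matter of copying the files back from the printed directory and power-cycling.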
No success: driver fails to load.
Steps undertaken:
1. brcmfmac43455 is for 3B+ only (CYW43455): /proc/cpuinfo says rev=a020d3
2. sudo dd if=2018-06-27-raspbian-stretch-lite.img of=/dev/sdb bs=4M
3. cp brcmfmac43455-sdio.* /lib/firmware/brcm/
4. sha256sum /lib/firmware/brcm/brcmfmac43455-sdio.*
   644b1afe735232a1b0c447e6f80650a9992f6977b80dc1d468c7302c769aa5d5 /lib/firmware/brcm/brcmfmac43455-sdio.bin
   635bdcbf9dc2cf7dd3bb72480566f347966e95f3deb2fdb5615a4001c7dd2e77 /lib/firmware/brcm/brcmfmac43455-sdio.clm_blob
   15698c62457bcf25e60d063e6c666d6e1b7dacdf2b03e6d14ebbc619de6da6b7 /lib/firmware/brcm/brcmfmac43455-sdio.txt
5. halt + powercycle: new driver fails
6. sh -c "echo options brcmfmac debug=0x100000 > /etc/modprobe.d/brcmfmac.conf"
7. apt update; apt upgrade + halt + powercycle: new driver fails
8. rpi-update + halt + powercycle: new driver fails
9. modprobe -r brcmfmac
10. echo -n "3f300000.mmc" > /sys/devices/platform/soc/3f300000.mmc/driver/unbind
11. echo -n "3f300000.mmc" > /sys/bus/platform/drivers/mmc-bcm2835/bind
12. modprobe brcmfmac (fails + stack trace)
13. copy original driver to /lib/firmware/brcm + powercycle: old driver ok.
14. repeat steps 9-12: old driver ok.
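To check that someone else is testing the same files, the checksums from the sha256sum step can be compared with a small helper. This is a sketch; the function name is hypothetical and it relies on sha256sum from coreutils.

```shell
# verify_sha256 FILE EXPECTED_HEX
# Succeeds only if the SHA-256 of FILE equals EXPECTED_HEX.
verify_sha256() {
    actual=$(sha256sum "$1" 2>/dev/null | cut -d' ' -f1)
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}
```

For instance, verify_sha256 /lib/firmware/brcm/brcmfmac43455-sdio.bin 644b1afe735232a1b0c447e6f80650a9992f6977b80dc1d468c7302c769aa5d5 would confirm the beta .bin file matches the one listed above.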
Same here - driver crashes and burns. No wifi interface.
[ 3.987473] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Jun 20 2018 20:26:28 version 7.45.165 (r692055 CY) FWID 01-1de59a68
[ 3.988042] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 9.10.116 Compiler: 1.29.4 ClmImport: 1.36.3 Creation: 2018-06-20 20:12:36
[ 4.965242] uart-pl011 3f201000.serial: no DMA platform data
[ 7.751431] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[ 7.751442] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
[ 10.311449] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[ 10.311462] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
[ 11.465157] Bluetooth: Core ver 2.22
[ 11.465238] NET: Registered protocol family 31
[ 11.465244] Bluetooth: HCI device and connection manager initialized
[ 11.465263] Bluetooth: HCI socket layer initialized
[ 11.465277] Bluetooth: L2CAP socket layer initialized
[ 11.465311] Bluetooth: SCO socket layer initialized
[ 11.479666] Bluetooth: HCI UART driver ver 2.3
[ 11.479681] Bluetooth: HCI UART protocol H4 registered
[ 11.479687] Bluetooth: HCI UART protocol Three-wire (H5) registered
[ 11.479884] Bluetooth: HCI UART protocol Broadcom registered
[ 11.655598] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[ 11.655606] Bluetooth: BNEP filters: protocol multicast
[ 11.655623] Bluetooth: BNEP socket layer initialized
[ 12.871437] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[ 12.871450] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
[ 15.431426] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[ 15.431437] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
[ 17.991448] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[ 23.111431] brcmfmac: brcmf_dongle_scantime: Scan assoc time error (-110)
[ 25.671431] brcmfmac: brcmf_netdev_open: failed to bring up cfg80211
[ 28.231434] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[ 28.231446] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
[ 30.791428] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[ 30.791440] brcmfmac: brcmf_cfg80211_get_tx_power: error (-110)
[ 33.351436] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
[ 38.471439] brcmfmac: brcmf_dongle_scantime: Scan assoc time error (-110)
[ 41.031441] brcmfmac: brcmf_netdev_open: failed to bring up cfg80211
Hmm, although I get those messages, the WiFi interface does come up. I'll report back to cypress.
On Sun, 15 Jul 2018, 18:22 gdb-power, notifications@github.com wrote:
Same here - driver crashes and burns. No wifi interface.
[ 3.987473] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Jun 20 2018 20:26:28 version 7.45.165 (r692055 CY) FWID 01-1de59a68 [ 3.988042] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 9.10.116 Compiler: 1.29.4 ClmImport: 1.36.3 Creation: 2018-06-20 20:12:36 [ 4.965242] uart-pl011 3f201000.serial: no DMA platform data [ 7.751431] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110 [ 7.751442] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110) [ 10.311449] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110 [ 10.311462] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110) [ 11.465157] Bluetooth: Core ver 2.22 [ 11.465238] NET: Registered protocol family 31 [ 11.465244] Bluetooth: HCI device and connection manager initialized [ 11.465263] Bluetooth: HCI socket layer initialized [ 11.465277] Bluetooth: L2CAP socket layer initialized [ 11.465311] Bluetooth: SCO socket layer initialized [ 11.479666] Bluetooth: HCI UART driver ver 2.3 [ 11.479681] Bluetooth: HCI UART protocol H4 registered [ 11.479687] Bluetooth: HCI UART protocol Three-wire (H5) registered [ 11.479884] Bluetooth: HCI UART protocol Broadcom registered [ 11.655598] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [ 11.655606] Bluetooth: BNEP filters: protocol multicast [ 11.655623] Bluetooth: BNEP socket layer initialized [ 12.871437] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110 [ 12.871450] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110) [ 15.431426] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110 [ 15.431437] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110) [ 17.991448] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110 [ 23.111431] brcmfmac: brcmf_dongle_scantime: Scan assoc time error (-110) [ 25.671431] brcmfmac: brcmf_netdev_open: failed to bring up cfg80211 [ 28.231434] brcmfmac: 
brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110 [ 28.231446] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110) [ 30.791428] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110 [ 30.791440] brcmfmac: brcmf_cfg80211_get_tx_power: error (-110) [ 33.351436] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110 [ 38.471439] brcmfmac: brcmf_dongle_scantime: Scan assoc time error (-110) [ 41.031441] brcmfmac: brcmf_netdev_open: failed to bring up cfg80211
As a side note, firmware version 7.45.154 (Feb 27 2018) shows a discrepancy between the power-save state reported by the driver and the one reported by iwconfig. I suppose this is somehow related to this commit, which disabled powersave, but it is confusing. Anyway, as I understand it, power management is always off.
$ iw wlan0 set power_save on
dmesg says: brcmfmac: power management disabled
iwconfig says: Power Management:on
$ iw wlan0 set power_save off
dmesg says: brcmfmac: power management disabled
iwconfig says: Power Management:off
So if ANYONE has a guaranteed way of causing this mailbox error that would be very useful.
[ 79.680414] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[ 82.154516] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 82.155047] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 84.714914] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 84.715403] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
[ 84.715417] brcmfmac: brcmf_c_set_joinpref_default: Set join_pref error (-110)
[ 87.154353] brcmfmac: brcmf_cfg80211_connect: BRCMF_C_SET_SSID failed (-110)
[ 89.675516] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 89.676054] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
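For anyone trying to pin down when the crash happens, a small script can pick the crash signature out of a dmesg capture. This is only a sketch: the two marker strings are copied from the log quoted here, while the function name and the line pattern are my own assumptions, and other kernels may format these lines differently.

```python
import re

# Message fragments copied from the dmesg output quoted in this thread.
CRASH_MARKERS = (
    "Unknown mailbox data content",
    "firmware trap in dongle",
)

# Matches a dmesg line such as "[ 79.680414] brcmfmac: ...".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+brcmfmac:\s+(.*)$")


def find_crash_events(dmesg_text):
    """Return (timestamp, message) pairs for brcmfmac lines matching a crash marker.

    find_crash_events is a hypothetical helper, not part of any existing tool.
    """
    events = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp, message = float(match.group(1)), match.group(2)
        if any(marker in message for marker in CRASH_MARKERS):
            events.append((timestamp, message))
    return events
```

Feeding it the text of a saved dmesg capture gives the time of the first mailbox error, which can then be lined up against whatever hostapd or wpa_supplicant was doing at that moment.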
I believe the crash is related to the sudden channel switch imposed by hostapd. Omitting the channel=6 setting makes hostapd do an ACS survey, but it seems the firmware does not support this: the driver doesn't crash, but hostapd does not start either. In any case, the driver should decide what to do: either ignore the hostapd channel or leave the STA channel. There's no middle road. Fixing this issue may also resolve the intermittent crashes that occur every few hours or days in STA-only mode. It could be that a scheduled AP scan by wpa_supplicant causes the same (race) condition in the driver, although far less frequently.
The driver can be resurrected by the unbind/bind procedure; however, I'd like it to stay alive.
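For reference, the unbind/bind procedure can be scripted roughly as follows. This is a sketch under assumptions: the sysfs driver directory /sys/bus/sdio/drivers/brcmfmac and the device ID mmc1:0001:1 (taken from the debugfs path used in this thread) are not confirmed for every board, and rebind_driver is a hypothetical helper name. It needs root on real hardware.

```python
import time
from pathlib import Path

# Assumptions, not confirmed in this thread: the SDIO driver directory and the
# device ID (borrowed from the debugfs path mmc1:0001:1 quoted in the thread).
DRIVER_DIR = "/sys/bus/sdio/drivers/brcmfmac"
DEVICE_ID = "mmc1:0001:1"


def rebind_driver(driver_dir=DRIVER_DIR, device_id=DEVICE_ID, delay=1.0):
    """Unbind the device from the driver, wait, then bind it again."""
    driver = Path(driver_dir)
    (driver / "unbind").write_text(device_id)
    time.sleep(delay)  # give the driver time to tear down before rebinding
    (driver / "bind").write_text(device_id)
```

The directory and device ID are parameters so the values can be checked against what actually appears under /sys on the board in question before running it.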
For now, a watchdog on /sys/class/net/wlan0/operstate is our only fallback. The problem is that the driver's condition (e.g. wifi speed) can also deteriorate without a crash.
Alas, there is no /sys/kernel/debug/brcmfmac/mmc1\:0001\:1/healthcondition metric.
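A minimal sketch of such an operstate watchdog, assuming a simple polling loop. The function names, the poll interval and the failure threshold are illustrative choices, not what is actually deployed. As noted, this only catches the interface going down, not a slow degradation.

```python
import time
from pathlib import Path


def read_operstate(iface="wlan0", root="/sys/class/net"):
    """Return the kernel's operational state string for iface, e.g. 'up' or 'down'."""
    return (Path(root) / iface / "operstate").read_text().strip()


def watch_operstate(on_down, iface="wlan0", root="/sys/class/net",
                    interval=10.0, max_failures=3, max_polls=None):
    """Poll operstate and call on_down(state) after max_failures consecutive non-'up' reads.

    max_polls limits the number of polls (None means run forever).
    """
    failures = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        try:
            state = read_operstate(iface, root)
        except OSError:
            state = "missing"  # interface directory vanished, e.g. driver unbound
        failures = 0 if state == "up" else failures + 1
        if failures >= max_failures:
            on_down(state)
            failures = 0  # start counting again after the recovery action
        polls += 1
        time.sleep(interval)
```

The on_down callback is where a recovery action (for example the unbind/bind procedure, or a reboot as a last resort) would go. Requiring several consecutive failures avoids reacting to a brief reassociation.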
cat /sys/kernel/debug/brcmfmac/mmc1\:0001\:1/counters
tx_ctlerrs: 25
rx_ctlerrs: 19
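In the absence of a health metric, the error counters could be sampled and compared over time. A rough sketch, assuming the file keeps the "name: value" layout shown above; parse_counters and counter_deltas are hypothetical helper names.

```python
def parse_counters(text):
    """Parse 'name: value' lines from the brcmfmac counters file into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'name: value'
        try:
            counters[name.strip()] = int(value.strip())
        except ValueError:
            continue  # skip non-numeric values
    return counters


def counter_deltas(before, after):
    """Return how much each counter grew between two samples."""
    return {name: after[name] - before.get(name, 0) for name in after}
```

A steadily growing tx_ctlerrs or rx_ctlerrs between two samples might serve as an early warning, although what rate counts as unhealthy is something that would have to be established by observation.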
/home/pi/wpa_supplicant.conf:
ap_scan=1
ctrl_interface=/var/run/wpa_supplicant
network={
ssid="E2000"
scan_ssid=1
proto=WPA RSN
key_mgmt=WPA-PSK
pairwise=CCMP TKIP
group=CCMP TKIP
psk="myrouterpassword"
}
/home/pi/hostapd.conf:
interface=uap0
ssid=raspberrypi
hw_mode=g
channel=6
wmm_enabled=0
macaddr_acl=0
auth_algs=1
ignore_broadcast_ssid=0
wpa=2
wpa_passphrase=myhostapdpassword
wpa_key_mgmt=WPA-PSK
rsn_pairwise=CCMP
wpa_pairwise=TKIP
ap_isolate=1
cat /sys/kernel/debug/brcmfmac/mmc1\:0001\:1/forensics
dongle trap info: type 0x4 @ epc 0x00021e2c cpsr 0x8000019f spsr 0x800001bf sp 0x0025fcb8 lr 0x00020447 pc 0x00021e2c offset 0x25fc60 r0 0x002475b8 r1 0x00000000 r2 0x00000000 r3 0x00000000 r4 0x00259e30 r5 0x002412f4 r6 0x002475b8 r7 0x00000004 0x0
000310.996 cca_stats_watchdog: Bad chanspec!!
000310.996 wl0: cca_stats watchdog handler error
000311.115 wl0: wlc_iovar_op: wpaie BCME -7 (Not STA)
000311.119 FWID 01-4fbe0b04 flags 1
000311.119 TRAP 4(25fc60): pc 21e2c, lr 20447, sp 25fcb8, cpsr 8000019f, spsr 800001bf
000311.119 dfsr 80d, dfar e0
000311.119 r0 2475b8, r1 0, r2 0, r3 0, r4 259e30, r5 2412f4, r6 2475b8
000311.119 r7 4, r8 cd, r9 25aa64, r10 2471b4, r11 245c48, r12 204dc
000311.119 sp+0 00240b3d 00000004 00000001 00245c48
000311.119 sp+10 0023d546 0023d5c0 00010030 00000000
000311.119 sp+48 00001c01
000311.119 sp+54 0008b59b
000311.119 sp+74 001a60d9
000311.119 sp+94 0019adfd
000311.119 sp+ac 001a63b9
000311.119 sp+cc 0003fc21
000311.119 sp+fc 00020361
000311.119 sp+178 000203b9
000311.119 sp+18c 00025a97
000311.119 sp+1b4 0001f361
000311.119 sp+1d0 00000107
000311.119 sp+1d4 0001f231
000311.119 sp+1f4 00006583
000311.119 sp+214 0019c541
000311.119 sp+27c 0019b9bb
I've passed on the details of the issues with this test firmware to Cypress. Interestingly, although this is new and in test from our point of view, I believe it's actually their top-of-tree build, so I would not expect issues this clear to turn up. Most strange. Anyway, we now wait for Cypress.
Further testing:
wl0: wlc_iovar_op: wpaie BCME -7 (Not STA)
dongle trap info: type 0x4 @ epc 0x00021e2c
If wpa_supplicant is stopped at (5) and restarted, the driver continues normally (until it decides to crash later, of course).
So it is not a race condition, but an FSM state (isolation) problem. Starting beacons on uap0 results in wlan0 assuming it is in AP mode.
(see also https://github.com/raspberrypi/linux/issues/1342 )
I've also got that problem with wifi dying.
This is with 4.14.27-v7+, and with /sbin/iw dev wlan0 set power_save off and /sbin/ifconfig wlan0 promisc in /etc/rc.local.