Closed DrSchottky closed 4 years ago
Hi Schottky,
Thanks for the detailed analysis. When you get the firmware trap, do you see a dump of the registers? If you can tell me the program counter value, we could investigate it further. However, as you describe it, it seems like an error in the firmware itself or in the driver, that needs to be patched by Broadcom, at least when it happens with an unpatched wl_monitor function.
Matthias
On Mon, 28 Jan 2019, 20:37, DrSchottky notifications@github.com wrote:
Monitor mode on the 3B+ is prone to crashing. I played around with patches and different versions of the driver and FW but always got the same result: after a certain amount of time the chipset stops responding. As far as I can see, the time before the crash depends on how "crowded" the surrounding environment is: the more APs/stations there are, the faster it crashes (from 50-60 mins at my home down to 1-3 mins at the office). Even the error is slightly different: in less crowded environments it just stops responding (error -110); in more crowded ones it throws an Unknown mailbox data content: 0x40012 followed by a FW trap error, and then it times out. Whilst you can recover from the first error by reloading the driver, the latter requires a power cycle of the chipset.
What I tried so far:
- Different boards
- Different kernels
- Different driver versions
- Different FW versions (I ported nexmon to 7.45.173)

and nothing changed.
Steps to reproduce the issue:
1. Build and install driver+fw
2. Add mon0
3. Run airodump-ng
4. Wait until it crashes
I don't think it's related to the patches, since it crashes even without hooking wl_monitor (using the built-in monitor mode), probably even earlier. Since the crashes are reproducible and happen after approximately the same amount of time, I'm inclined to think that something somewhere is overflowing.
Unfortunately I have zero clue about how to debug this kind of issue. If anyone has suggestions, they're appreciated.
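The two failure modes described above (a plain hang with error -110 versus a firmware trap) can be told apart from the kernel log. The following is a minimal sketch, not part of nexmon or brcmfmac: the function name `classify_failure` is hypothetical, and the marker strings are simply the brcmfmac messages quoted in this thread, so they may differ on other kernel or driver versions.

```python
from typing import Iterable, Optional

# Substrings copied from the brcmfmac kernel messages quoted in this thread.
TRAP_MARKERS = (
    "firmware trap in dongle",
    "Unknown mailbox data content",
    "mailbox indicates firmware halted",
)
HANG_MARKERS = ("resumed on timeout", "-110")


def classify_failure(lines: Iterable[str]) -> Optional[str]:
    """Return "trap", "hang", or None for a chunk of kernel log lines.

    "trap" means a firmware trap was reported (the case that needed a
    chipset power cycle); "hang" means only timeouts were seen (the case
    where reloading the driver was enough).
    """
    saw_hang = False
    for line in lines:
        if "brcmfmac" not in line:
            continue  # ignore unrelated kernel messages
        if any(marker in line for marker in TRAP_MARKERS):
            return "trap"
        if any(marker in line for marker in HANG_MARKERS):
            saw_hang = True
    return "hang" if saw_hang else None
```

Feeding it the output of `dmesg` or a slice of kern.log line by line should be enough to decide which recovery path to try first.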
Hi Matthias, thank you for your reply
Unfortunately all I got (with default brcmfmac verbosity) is
kernel: [ 3855.649118] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
kernel: [ 3858.123013] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
kernel: [ 3858.123514] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
followed by a sequence of timeouts (mainly due to failed channel hops)
Is there a way to get the register dump you need?
What makes me scratch my head is that Cypress actually sells their monitor mode as a working feature (through wl), so it's hard for me to believe that the fw is screwed up so badly. However, I even tried with their latest brcmfmac with just the nexmon patches, so...
Have you got any other hints about how to at least identify where the problem can be?
Thank you
You could find where the trap error is printed in the driver and add a call to a function that dumps the console, or you could dump the console by directly accessing the console area in the Wi-Fi chip's RAM.
Finding where the trap is detected in the driver shouldn't be a problem, but I have a couple of doubts about the console dump:
- Is there a way to dump the console on a standard nexmon fw? Afaik there's a custom cmd to do it in the rom_extraction fw, so do I have to include it in the standard nexmon fw?
- If the fw is in a trap state, how can it handle a dump cmd?
You cannot handle a console dump ioctl if the firmware has crashed. However, the console output is written into a ring buffer in the firmware's RAM. So as long as the driver can access the RAM, you can dump the console.
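Do you think it can be done with brcmf_debug_create_memdump (https://github.com/seemoo-lab/nexmon/blob/master/patches/bcm43455c0/7_45_154/nexmon/brcmfmac_4.14.y-nexmon/debug.c#L30)?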
Try brcmf_sdio_readconsole
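For readers unfamiliar with the ring buffer mentioned above: once the raw console area has been copied out of the chip's RAM, the text still has to be put back in chronological order. The sketch below only illustrates that idea on a host-side copy of the buffer. The function names and the assumption that a single write index marks the oldest byte are hypothetical; they are not taken from the brcmfmac source or the firmware.

```python
from typing import List


def unwrap_ring(buf: bytes, write_idx: int, wrapped: bool = True) -> bytes:
    """Return the contents of a ring buffer oldest byte first.

    buf       -- raw copy of the ring buffer memory
    write_idx -- offset where the next byte would be written
    wrapped   -- False if the writer has not yet reached the end of buf
    """
    if not 0 <= write_idx <= len(buf):
        raise ValueError("write index outside buffer")
    if not wrapped:
        return buf[:write_idx]
    # The oldest data starts at the write index and wraps around to it.
    return buf[write_idx:] + buf[:write_idx]


def console_lines(buf: bytes, write_idx: int, wrapped: bool = True) -> List[str]:
    """Split the unwrapped buffer into non-empty text lines."""
    text = unwrap_ring(buf, write_idx, wrapped).decode("ascii", errors="replace")
    return [ln for ln in text.replace("\r", "").split("\n") if ln.strip("\x00 ")]
```

Because the buffer is only read, never asked to do anything, this approach does not depend on the firmware still being able to answer an ioctl.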
Hi Matthias, I added the call to brcmf_sdio_readconsole here and I let it crash (this time it took only a few seconds). Here's the output:
Jan 29 13:59:57 raspberrypi kernel: [ 159.491534] device mon0 entered promiscuous mode
Jan 29 14:00:00 raspberrypi kernel: [ 162.756820] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Jan 29 14:00:00 raspberrypi kernel: [ 162.757346] brcmfmac: CONSOLE:
Jan 29 14:00:00 raspberrypi kernel: [ 162.757352] brcmfmac: CONSOLE: 000068.607
Jan 29 14:00:00 raspberrypi kernel: [ 162.757360] brcmfmac: CONSOLE: TRAP 4(25fc78): pc 19ae62, lr 19ae45, sp 25fcd0, cpsr 2000019f, spsr 200001bf
Jan 29 14:00:00 raspberrypi kernel: [ 162.757365] brcmfmac: CONSOLE: 000068.608 dfsr 8, dfar deadbef4
Jan 29 14:00:00 raspberrypi kernel: [ 162.757374] brcmfmac: CONSOLE: 000068.608 r0 23490c, r1 0, r2 34944, r3 deadbeef, r4 25c0a8, r5 23490c, r6 0
Jan 29 14:00:00 raspberrypi kernel: [ 162.757380] brcmfmac: CONSOLE: 000068.608 r7 25c0d0, r8 1, r9 2, r10 2, r11 18, r12 25fed4
Jan 29 14:00:00 raspberrypi kernel: [ 162.757384] brcmfmac: CONSOLE: 000068.608
Jan 29 14:00:00 raspberrypi kernel: [ 162.757390] brcmfmac: CONSOLE: sp+0 00000000 00225c3c 00000000 00250900
Jan 29 14:00:00 raspberrypi kernel: [ 162.757396] brcmfmac: CONSOLE: 000068.608 sp+10 00000001 00010041 0023490c 0023490c
Jan 29 14:00:00 raspberrypi kernel: [ 162.757400] brcmfmac: CONSOLE:
Jan 29 14:00:00 raspberrypi kernel: [ 162.757405] brcmfmac: CONSOLE: 000068.608 sp+14 00010041
Jan 29 14:00:00 raspberrypi kernel: [ 162.757409] brcmfmac: CONSOLE: 000068.608 sp+24 00020a03
Jan 29 14:00:00 raspberrypi kernel: [ 162.757414] brcmfmac: CONSOLE: 000068.608 sp+28 0009cd7b
Jan 29 14:00:00 raspberrypi kernel: [ 162.757419] brcmfmac: CONSOLE: 000068.608 sp+40 0009e175
Jan 29 14:00:00 raspberrypi kernel: [ 162.757424] brcmfmac: CONSOLE: 000068.608 sp+64 00003883
Jan 29 14:00:00 raspberrypi kernel: [ 162.757429] brcmfmac: CONSOLE: 000068.608 sp+84 00000a65
Jan 29 14:00:00 raspberrypi kernel: [ 162.757434] brcmfmac: CONSOLE: 000068.608 sp+bc 00010bff
Jan 29 14:00:00 raspberrypi kernel: [ 162.757439] brcmfmac: CONSOLE: 000068.608 sp+c4 0000860b
Jan 29 14:00:00 raspberrypi kernel: [ 162.757443] brcmfmac: CONSOLE: 000068.608 sp+d0 00007f19
Jan 29 14:00:00 raspberrypi kernel: [ 162.757449] brcmfmac: CONSOLE: 000068.608 sp+d4 00007fbd
Jan 29 14:00:00 raspberrypi kernel: [ 162.757454] brcmfmac: CONSOLE: 000068.608 sp+10c 0019b2f9
Jan 29 14:00:00 raspberrypi kernel: [ 162.757458] brcmfmac: CONSOLE: 000068.608 sp+124 0019b4e9
Jan 29 14:00:00 raspberrypi kernel: [ 162.757464] brcmfmac: CONSOLE: 000068.608 sp+15c 0019c853
Jan 29 14:00:00 raspberrypi kernel: [ 162.757468] brcmfmac: CONSOLE: 000068.608 sp+170 00000985
Jan 29 14:00:00 raspberrypi kernel: [ 162.757473] brcmfmac: CONSOLE: 000068.608 sp+18c 001a692d
Jan 29 14:00:00 raspberrypi kernel: [ 162.757479] brcmfmac: CONSOLE: 000068.608 sp+1b0 0000ffff
Jan 29 14:00:03 raspberrypi kernel: [ 165.685464] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Jan 29 14:00:03 raspberrypi kernel: [ 165.686001] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
My current setup is kernel 4.9.80 with vanilla (except for console output) nexmon
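To compare traps across runs it helps to pull the register values out of the TRAP line instead of reading them by eye. This is a rough sketch that assumes the line format shown in the console output above and that all values are printed in hexadecimal; `parse_trap` and `split_thumb` are hypothetical helper names. The Thumb handling relies on the ARM convention that bit 0 of a branch target selects Thumb state, which matches the odd lr value in the dump.

```python
import re
from typing import Dict, Optional, Tuple

# Matches e.g. "TRAP 4(25fc78): pc 19ae62, lr 19ae45, sp 25fcd0, ..."
TRAP_RE = re.compile(
    r"TRAP\s+(?P<type>[0-9a-f]+)\((?P<frame>[0-9a-f]+)\):\s*(?P<regs>.*)"
)
REG_RE = re.compile(r"(\w+)\s+([0-9a-f]+)")


def parse_trap(line: str) -> Optional[Dict[str, int]]:
    """Parse one firmware console TRAP line into a dict of integers.

    Returns None if the line does not contain a TRAP record.
    """
    match = TRAP_RE.search(line)
    if match is None:
        return None
    info = {
        "type": int(match.group("type"), 16),
        "frame": int(match.group("frame"), 16),
    }
    for name, value in REG_RE.findall(match.group("regs")):
        info[name] = int(value, 16)
    return info


def split_thumb(addr: int) -> Tuple[int, bool]:
    """Return (address with bit 0 cleared, True if bit 0 was set)."""
    return addr & ~1, bool(addr & 1)
```

The cleared address is the one to look up in a disassembler; the pc value can be used as printed.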
UPDATE: I reproduced the test several times and sometimes I got slightly different results, like this
Jan 29 16:05:08 raspberrypi kernel: [ 189.121371] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
Jan 29 16:05:08 raspberrypi kernel: [ 189.121873] brcmfmac: CONSOLE: nd_hostip_clear"
Jan 29 16:05:08 raspberrypi kernel: [ 189.121881] brcmfmac: CONSOLE: 000000.398 wl0: wlc_iovar_op: nd_hostip_clear BCME -23 (Unsupported)
Jan 29 16:05:08 raspberrypi kernel: [ 189.121888] brcmfmac: CONSOLE: 000007.284 wl0: unable to find iovar "toe_ol"
Jan 29 16:05:08 raspberrypi kernel: [ 189.121895] brcmfmac: CONSOLE: 000007.284 wl0: wlc_iovar_op: toe_ol BCME -23 (Unsupported)
Jan 29 16:05:08 raspberrypi kernel: [ 189.121902] brcmfmac: CONSOLE: 000007.294 wl0: wlc_phy_set_regtbl_on_femctrl: FIXME bt_coex
Jan 29 16:05:08 raspberrypi kernel: [ 189.121909] brcmfmac: CONSOLE: 000007.301 wl0: wlc_enable_probe_req: state down, deferring setting of host flags
Jan 29 16:05:08 raspberrypi kernel: [ 189.121917] brcmfmac: CONSOLE: 000011.476 wl0: link local addresses being set! watch out!!
Jan 29 16:05:08 raspberrypi kernel: [ 189.121921] brcmfmac: CONSOLE: 000015.290
Jan 29 16:05:08 raspberrypi kernel: [ 189.121925] brcmfmac: CONSOLE: FWID 01-4fbe0b04
Jan 29 16:05:08 raspberrypi kernel: [ 189.121929] brcmfmac: CONSOLE: flags 1
Jan 29 16:05:08 raspberrypi kernel: [ 189.121933] brcmfmac: CONSOLE: 000015.290
Jan 29 16:05:08 raspberrypi kernel: [ 189.121940] brcmfmac: CONSOLE: TRAP 4(25fd44): pc 19ce1e, lr 19b18d, sp 25fd9c, cpsr 19f, spsr 1bf
Jan 29 16:05:08 raspberrypi kernel: [ 189.121946] brcmfmac: CONSOLE: 000015.291 dfsr 8, dfar 29ca48
Jan 29 16:05:08 raspberrypi kernel: [ 189.121953] brcmfmac: CONSOLE: 000015.291 r0 234d24, r1 25ccdc, r2 8, r3 25c0d0, r4 25c0d8, r5 ff5b, r6 234d24
Jan 29 16:05:08 raspberrypi kernel: [ 189.121961] brcmfmac: CONSOLE: 000015.291 r7 25c0d0, r8 0, r9 0, r10 2, r11 18, r12 21da30
Jan 29 16:05:08 raspberrypi kernel: [ 189.121965] brcmfmac: CONSOLE: 000015.291
Jan 29 16:05:08 raspberrypi kernel: [ 189.121970] brcmfmac: CONSOLE: sp+0 0025c0a8 00000008 0019b18d 00000002
Jan 29 16:05:08 raspberrypi kernel: [ 189.121977] brcmfmac: CONSOLE: 000015.291 sp+10 0025c0a8 00000008 00234d24 00000001
Jan 29 16:05:08 raspberrypi kernel: [ 189.121980] brcmfmac: CONSOLE:
Jan 29 16:05:08 raspberrypi kernel: [ 189.121985] brcmfmac: CONSOLE: 000015.291 sp+8 0019b18d
Jan 29 16:05:08 raspberrypi kernel: [ 189.121990] brcmfmac: CONSOLE: 000015.291 sp+28 0019b2ff
Jan 29 16:05:08 raspberrypi kernel: [ 189.121995] brcmfmac: CONSOLE: 000015.291 sp+40 0019b4e9
Jan 29 16:05:08 raspberrypi kernel: [ 189.122000] brcmfmac: CONSOLE: 000015.291 sp+78 0019c853
Jan 29 16:05:08 raspberrypi kernel: [ 189.122005] brcmfmac: CONSOLE: 000015.291 sp+8c 00000985
Jan 29 16:05:08 raspberrypi kernel: [ 189.122011] brcmfmac: CONSOLE: 000015.291 sp+a8 001a692d
Jan 29 16:05:08 raspberrypi kernel: [ 189.122015] brcmfmac: CONSOLE: 000015.291 sp+cc 0000ffff
Jan 29 16:05:08 raspberrypi kernel: [ 189.122021] brcmfmac: CONSOLE: 000015.291 sp+108 00025b4b
Jan 29 16:05:08 raspberrypi kernel: [ 189.122026] brcmfmac: CONSOLE: 000015.291 sp+120 001b1e5b
Jan 29 16:05:08 raspberrypi kernel: [ 189.122031] brcmfmac: CONSOLE: 000015.291 sp+138 001d5c9d
Jan 29 16:05:08 raspberrypi kernel: [ 189.122036] brcmfmac: CONSOLE: 000015.291 sp+160 001bbdd3
Jan 29 16:05:08 raspberrypi kernel: [ 189.122041] brcmfmac: CONSOLE: 000015.291 sp+198 001c3305
Jan 29 16:05:08 raspberrypi kernel: [ 189.122046] brcmfmac: CONSOLE: 000015.291 sp+1c0 001a1bb5
Jan 29 16:05:08 raspberrypi kernel: [ 189.122051] brcmfmac: CONSOLE: 000015.291 sp+1d0 001a1bf1
Jan 29 16:05:08 raspberrypi kernel: [ 189.122056] brcmfmac: CONSOLE: 000015.291 sp+1e0 0019a50d
Jan 29 16:05:08 raspberrypi kernel: [ 189.122060] brcmfmac: CONSOLE: 000015.291 sp+1e4 0019abdd
Jan 29 16:05:11 raspberrypi kernel: [ 191.925743] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Jan 29 16:05:11 raspberrypi kernel: [ 191.926259] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
or this
[Tue Jan 29 16:14:51 2019] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: "nd_hostip_clear"
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000001.944 wl0: wlc_iovar_op: nd_hostip_clear BCME -23 (Unsupported)
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000002.084 wl0: unable to find iovar "nd_hostip_clear"
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000002.084 wl0: wlc_iovar_op: nd_hostip_clear BCME -23 (Unsupported)
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000028.390 wl0: unable to find iovar "toe_ol"
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000028.390 wl0: wlc_iovar_op: toe_ol BCME -23 (Unsupported)
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000028.400 wl0: wlc_phy_set_regtbl_on_femctrl: FIXME bt_coex
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000028.408 wl0: wlc_enable_probe_req: state down, deferring setting of host flags
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: FWID 01-4fbe0b04
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: flags 1
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: TRAP 4(25ff10): pc 19beb0, lr 19be47, sp 25ff68, cpsr 8000019f, spsr 800001bf
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764 dfsr 8, dfar deadbef0
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764 r0 1, r1 1, r2 deadbeef, r3 1, r4 25c0a8, r5 8, r6 0
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764 r7 25ffe8, r8 18002000, r9 18102000, r10 313da601, r11 84f14a1, r12 0
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: sp+0 0025d464 00000000 00198718 0019bf81
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764 sp+10 0025c364 0019a50d 0019abdd 084f14a1
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE:
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764 sp+c 0019bf81
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764 sp+14 0019a50d
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764 sp+18 0019abdd
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764 sp+24 00199cc7
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764 sp+30 000001df
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764 sp+70 00008221
[Tue Jan 29 16:14:51 2019] brcmfmac: CONSOLE: 000032.764 sp+80 0019ca43
[Tue Jan 29 16:14:54 2019] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[Tue Jan 29 16:14:54 2019] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Looks like it's more about stations than APs: now that there are many of them, it takes only a few seconds before crashing.
I did a few tests to get a clearer picture. It looks like it's something the chip receives (and not just the number of surrounding devices) that causes the crash.
- In the less crowded environment (where I couldn't get any fw trap) I turned on a simple 802.11 fuzzer (RPi 3B with Nexmon and Scapy), but almost nothing changed (still no traps; it just hangs after a few dozen minutes).
- In the crowded environment (where I was getting traps after a few minutes) I ran airodump on fixed channels: on some channels it traps within a minute (sometimes instantly), on others it runs flawlessly for as long as 1 hr before crashing or hanging (as in the first scenario).

In all the crashes I got today the pc register was 0x19ce1e.
@matthiasseemoo have you got any idea/test I can do?
Thank you
You need to find out what triggers the trap. You should open the firmware in IDA or another disassembler and find out what the function where the trap happens does, or what the functions on the path that leads to the trap do.
I see that the routines where crashes 1 and 2 happen can be called from the routine where crash 3 happens, so I'm trying to narrow it down. Unfortunately I'm not so good at binary reversing...
As far as I can see, the big routine where crash 3 happens prints the string "sdpcmd_dpc: Enable" to the console, which I see is mentioned many times in your nexmon-debugger project. Is it related to some kind of fault handler? Could you tell me more about this?
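One way to narrow down the call path without much reversing is to compare the single-word `sp+offset value` lines that follow each trap. The sketch below collects the odd values from those lines and intersects them across dumps. Treating odd stack words as Thumb return-address candidates is an assumption (every value on those lines in the dumps above happens to be odd), and the helper names are hypothetical.

```python
import re
from typing import Iterable, List, Set

# Matches only lines that end in exactly one 8-digit word after "sp+<off>",
# e.g. "... sp+10c 0019b2f9", and skips the four-word "sp+0 ..." rows.
STACK_WORD_RE = re.compile(r"sp\+([0-9a-f]+)\s+([0-9a-f]{8})\s*$")


def return_candidates(lines: Iterable[str]) -> List[int]:
    """Collect odd stack words from a trap dump, in the order printed."""
    found = []
    for line in lines:
        match = STACK_WORD_RE.search(line)
        if match is None:
            continue
        value = int(match.group(2), 16)
        if value & 1:  # odd value: possible Thumb return address
            found.append(value)
    return found


def shared_candidates(*dumps: Iterable[str]) -> Set[int]:
    """Return the candidate addresses that appear in every dump."""
    sets = [set(return_candidates(dump)) for dump in dumps]
    return set.intersection(*sets) if sets else set()
```

Addresses that show up in every crash are a reasonable first place to look in the disassembler, since they hint at callers common to all the traps.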
Is there an update on this?
I set my 3B+ into monitor mode and tcpdump simply stops after 15 minutes...
Not that I'm aware of. Could you check whether you have errors in the kernel log?
The last timestamp from tcpdump:
21:58:20.909458 2454263291us
This is all that dmesg displays:
[ 60.690616] brcmfmac: brcmf_vif_add_validate: Attempt to add a MONITOR interface...
[ 60.690637] brcmfmac: brcmf_mon_add_vif: brcmf_mon_add_vif called
[ 60.690644] brcmfmac: brcmf_mon_add_vif: Adding vif "mon0"
[ 91.151850] device mon0 entered promiscuous mode
[ 2629.196173] device mon0 left promiscuous mode
kern.log had some older entries:
Feb 18 20:12:28 raspberrypi kernel: [ 2493.322377] device mon0 entered promiscuous mode
Feb 18 20:12:30 raspberrypi kernel: [ 2495.852313] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:12:30 raspberrypi kernel: [ 2495.852564] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:12:30 raspberrypi kernel: [ 2495.852571] brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -110
Feb 18 20:12:33 raspberrypi kernel: [ 2498.412340] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:12:33 raspberrypi kernel: [ 2498.412600] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:12:33 raspberrypi kernel: [ 2498.412606] brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -110
Feb 18 20:12:35 raspberrypi kernel: [ 2500.972363] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:12:35 raspberrypi kernel: [ 2500.972615] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:12:35 raspberrypi kernel: [ 2500.972621] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
Feb 18 20:12:38 raspberrypi kernel: [ 2503.532386] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:12:38 raspberrypi kernel: [ 2503.532637] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:12:41 raspberrypi kernel: [ 2506.092422] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:12:41 raspberrypi kernel: [ 2506.092681] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:12:42 raspberrypi kernel: [ 2507.263731] device mon0 left promiscuous mode
Feb 18 20:12:44 raspberrypi kernel: [ 2509.772486] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:12:44 raspberrypi kernel: [ 2509.772984] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:12:44 raspberrypi kernel: [ 2509.772998] brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -110
Feb 18 20:12:47 raspberrypi kernel: [ 2512.332514] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:12:47 raspberrypi kernel: [ 2512.333016] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:12:47 raspberrypi kernel: [ 2512.333031] brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -110
Feb 18 20:12:49 raspberrypi kernel: [ 2514.892538] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:12:49 raspberrypi kernel: [ 2514.893026] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:12:49 raspberrypi kernel: [ 2514.893041] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
Feb 18 20:12:52 raspberrypi kernel: [ 2517.452558] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:12:52 raspberrypi kernel: [ 2517.453057] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:12:54 raspberrypi kernel: [ 2519.382592] device mon0 entered promiscuous mode
Feb 18 20:12:54 raspberrypi kernel: [ 2520.012590] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:12:54 raspberrypi kernel: [ 2520.013085] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:12:57 raspberrypi kernel: [ 2522.572609] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:12:57 raspberrypi kernel: [ 2522.573088] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:12:57 raspberrypi kernel: [ 2522.573101] brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -110
Feb 18 20:13:00 raspberrypi kernel: [ 2525.132645] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:13:00 raspberrypi kernel: [ 2525.133164] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:13:00 raspberrypi kernel: [ 2525.133178] brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -110
Feb 18 20:13:01 raspberrypi kernel: [ 2526.671610] device mon0 left promiscuous mode
Feb 18 20:13:02 raspberrypi kernel: [ 2527.692668] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:13:02 raspberrypi kernel: [ 2527.693168] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:13:02 raspberrypi kernel: [ 2527.693183] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
Feb 18 20:13:05 raspberrypi kernel: [ 2530.252683] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:13:05 raspberrypi kernel: [ 2530.253164] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:13:07 raspberrypi kernel: [ 2532.812709] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:13:07 raspberrypi kernel: [ 2532.813182] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:13:10 raspberrypi kernel: [ 2535.372739] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:13:10 raspberrypi kernel: [ 2535.373216] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:13:10 raspberrypi kernel: [ 2535.373227] brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -110
Feb 18 20:13:12 raspberrypi kernel: [ 2537.932763] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:13:12 raspberrypi kernel: [ 2537.933239] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:13:12 raspberrypi kernel: [ 2537.933251] brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -110
Feb 18 20:13:15 raspberrypi kernel: [ 2540.492786] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:13:15 raspberrypi kernel: [ 2540.493259] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:13:15 raspberrypi kernel: [ 2540.493271] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
Feb 18 20:13:18 raspberrypi kernel: [ 2543.052816] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:13:18 raspberrypi kernel: [ 2543.053313] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Feb 18 20:13:20 raspberrypi kernel: [ 2545.612836] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Feb 18 20:13:20 raspberrypi kernel: [ 2545.613311] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
I bet that if you scroll up a little more you'll see the Unknown mailbox data content: 0x40012 error.
Anyway, the only way to recover from a trap w/o rebooting is to power cycle the chipset by rebinding the mmc driver.
Are there ad-hoc/mesh networks or BLE beacons nearby?
> Are there ad-hoc/mesh networks or BLE beacons nearby?

I wouldn't know how to tell...
I found this: Feb 18 19:47:49 raspberrypi kernel: [ 1014.497255] brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0x40012
rebooting works but I need the monitor mode 24/7
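For anyone who wants to automate the "rebind the mmc driver" recovery mentioned above, the usual Linux mechanism is writing the device name to the driver's `unbind` and `bind` files in sysfs. The sketch below shows only that mechanism. The helper name `rebind` is hypothetical, and the driver directory and device name in the comment are assumptions that differ between kernels, so check what is listed under /sys/bus/platform/drivers on your system first. It has to run as root.

```python
import time
from pathlib import Path


def rebind(driver_dir: str, device: str, delay_s: float = 1.0) -> None:
    """Unbind a device from its driver and bind it again via sysfs.

    driver_dir -- sysfs directory of the driver, containing "bind"/"unbind"
    device     -- device name as listed inside that directory
    delay_s    -- pause between unbind and bind so the chip can power down
    """
    driver = Path(driver_dir)
    (driver / "unbind").write_text(device)
    time.sleep(delay_s)
    (driver / "bind").write_text(device)


# Example call (assumed driver path, placeholder device name):
# rebind("/sys/bus/platform/drivers/mmc-bcm2835", "<device listed in that directory>")
```

This only shortens the outage; as noted in the thread, it does not stop the trap from happening again while the offending frame is still in the air.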
For ad-hoc you can run a Wi-Fi scan and look for IBSS; for BLE, use the lescan tool. It's something the chip receives that causes the crash (yes, I tested it), and whatever you do it'll keep crashing for as long as that frame is in the air.
Maybe you can dodge it by changing channel (no hopping ofc).
Power cycling the chipset is faster than rebooting, but you can't have 100% uptime. Besides the trap, the fw has many other issues that make it almost unusable for continuous monitoring. I gave up and moved to a USB dongle.
would the older PI 3B (no +) be a better choice?
Sure, nexmon works like a charm on 3B
let's get one of those then...
I see exactly the same.. after a random time it crashes with "Unknown mailbox data content: 0x40012 error." Reboot to revive and after 5-60 min it crashes again ..
I think we should get used to it as long as nobody finds the root cause.
Here is an excerpt from kern.log when mine (latest Nexmon on 4.14.98-v7+) crashes while in monitor mode and switching channels.
May 29 00:38:57 raspberrypi kernel: [ 1592.594188] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
May 29 00:38:57 raspberrypi kernel: [ 1592.594600] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
May 29 00:38:57 raspberrypi kernel: [ 1592.594609] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-5)
May 29 00:38:58 raspberrypi kernel: [ 1593.758731] brcmfmac: brcmf_sdio_htclk: HT Avail timeout (1000000): clkctl 0x00
May 29 00:38:58 raspberrypi kernel: [ 1593.762909] brcmfmac: brcmf_sdiod_regrw_helper: failed to write data F1@0x0c024, err: -110
May 29 00:38:58 raspberrypi kernel: [ 1593.767259] brcmfmac: brcmf_sdiod_regrw_helper: failed to write data F1@0x0c020, err: -110
May 29 00:38:59 raspberrypi kernel: [ 1594.779545] brcmfmac: brcmf_sdio_htclk: HT Avail timeout (1000000): clkctl 0x00
May 29 00:38:59 raspberrypi kernel: [ 1594.818325] brcmfmac: brcmf_sdiod_regrw_helper: failed to read data F1@0x0a408, err: -110
May 29 00:38:59 raspberrypi kernel: [ 1594.822745] brcmfmac: brcmf_sdiod_regrw_helper: failed to read data F1@0x0a800, err: -110
May 29 00:38:59 raspberrypi kernel: [ 1594.826192] brcmfmac: brcmf_sdiod_regrw_helper: failed to write data F1@0x0a408, err: -110
May 29 00:38:59 raspberrypi kernel: [ 1594.830352] brcmfmac: brcmf_sdiod_regrw_helper: failed to read data F1@0x0a408, err: -110
May 29 00:38:59 raspberrypi kernel: [ 1594.834131] brcmfmac: brcmf_sdiod_regrw_helper: failed to read data F1@0x0a800, err: -110
There's a new FW release from Cypress. Maybe I'll give it a try in the next days
Let me know, if you need my IDA file, I can share the one from the last fw version with you. Just drop me an email.
@matthiasseemoo thank you but I should have the annotated IDBs for 154 and 173 from my last porting. Today I updated definitions.mk for 189 and merged nexmon's brcmfmac patches with the latest patches from Cypress (works fine with 189 stock and 154 nexmon FWs). I hope to be able to port all the remaining offsets by this weekend.
Sounds good 👍
Let me know if you need some additional testing - I've got two of these B+'s on the bench right now.
Bad news: nexmon works fine on 189, but I got a trap after ~20 mins. Here's the console log:
[ 1219.676669] brcmfmac: brcmf_sdio_hostmail: mailbox indicates firmware halted
[ 1219.677178] brcmfmac: CONSOLE: 2.295 wl0: link local addresses being set! watch out!!
[ 1219.677186] brcmfmac: CONSOLE: 000046.659 wl0: unable to find iovar "nd_hostip_clear"
[ 1219.677194] brcmfmac: CONSOLE: 000046.659 wl0: wlc_iovar_op: nd_hostip_clear BCME -23 (Unsupported)
[ 1219.677201] brcmfmac: CONSOLE: 000052.993 wl0: unable to find iovar "toe_ol"
[ 1219.677208] brcmfmac: CONSOLE: 000052.993 wl0: wlc_iovar_op: toe_ol BCME -23 (Unsupported)
[ 1219.677215] brcmfmac: CONSOLE: 000053.005 wl0: wlc_phy_set_regtbl_on_femctrl: FIXME bt_coex
[ 1219.677222] brcmfmac: CONSOLE: 000257.051 wl0: dma_rx: bad frame length (1794)
[ 1219.677229] brcmfmac: CONSOLE: 000410.554 wl0: dma_rx: bad frame length (1854)
[ 1219.677235] brcmfmac: CONSOLE: 000607.867 wl0: dma_rx: bad frame length (1882)
[ 1219.677242] brcmfmac: CONSOLE: 000912.299 wl0: dma_rx: bad frame length (1758)
[ 1219.677247] brcmfmac: CONSOLE: 001213.387
[ 1219.677252] brcmfmac: CONSOLE: FWID 01-e1db26e2
[ 1219.677257] brcmfmac: CONSOLE: flags 1
[ 1219.677262] brcmfmac: CONSOLE: 001213.387
[ 1219.677270] brcmfmac: CONSOLE: TRAP 4(25fed4): pc 1a14be, lr 19b159, sp 25ff2c, cpsr 19f, spsr 1bf
[ 1219.677276] brcmfmac: CONSOLE: 001213.387 dfsr 8, dfar 29ca88
[ 1219.677286] brcmfmac: CONSOLE: 001213.387 r0 233e50, r1 25ccdc, r2 8, r3 25c0d0, r4 25c0d8, r5 ff6b, r6 19870c
[ 1219.677294] brcmfmac: CONSOLE: 001213.387 r7 25c0d0, r8 0, r9 0, r10 b53ab9a7, r11 2dcf3cf5, r12 22e95c
[ 1219.677299] brcmfmac: CONSOLE: 001213.387
[ 1219.677306] brcmfmac: CONSOLE: sp+0 0025c0a8 00000000 0019b159 00000002
[ 1219.677313] brcmfmac: CONSOLE: 001213.387 sp+10 0025a734 00000000 0019870c 00198718
[ 1219.677317] brcmfmac: CONSOLE:
[ 1219.677323] brcmfmac: CONSOLE: 001213.387 sp+8 0019b159
[ 1219.677328] brcmfmac: CONSOLE: 001213.387 sp+28 00007f43
[ 1219.677334] brcmfmac: CONSOLE: 001213.387 sp+2c 00007f19
[ 1219.677339] brcmfmac: CONSOLE: 001213.387 sp+38 00007f95
[ 1219.677345] brcmfmac: CONSOLE: 001213.387 sp+48 0019a4e5
[ 1219.677351] brcmfmac: CONSOLE: 001213.387 sp+50 0019aba9
[ 1219.677357] brcmfmac: CONSOLE: 001213.387 sp+60 00199c47
[ 1219.677363] brcmfmac: CONSOLE: 001213.387 sp+6c 000001df
[ 1219.677368] brcmfmac: CONSOLE: 001213.387 sp+ac 00008221
[ 1219.677374] brcmfmac: CONSOLE: 001213.387 sp+bc 0019ca0f
[ 1222.308293] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[ 1222.311867] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
I can't tell whether it is more, less, or equally stable compared to 154. I don't think I'm experienced enough to write a firmware patch to work around it on my own (it's an out-of-bounds read), but if someone thinks they can, I can share the IDB, offsets, and a remote test environment with them.
@mlinton are you able to reproduce the crash every time? Could you set up your Pis with 154 and my 189 to compare them?
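For anyone reading the dump above, here is a small host-side sketch (not part of nexmon) that pulls the registers out of the console lines. It assumes the firmware runs Thumb code, so the odd lr value is a return address with bit 0 set. The check that dfar equals r1 + 4 * r5 is only an observation about this dump, not a confirmed explanation of the fault.

```python
import re

# Console lines copied from the log above (kernel timestamps dropped).
LOG = """\
brcmfmac: CONSOLE: TRAP 4(25fed4): pc 1a14be, lr 19b159, sp 25ff2c, cpsr 19f, spsr 1bf
brcmfmac: CONSOLE: 001213.387 dfsr 8, dfar 29ca88
brcmfmac: CONSOLE: 001213.387 r0 233e50, r1 25ccdc, r2 8, r3 25c0d0, r4 25c0d8, r5 ff6b, r6 19870c
"""

# Register names as printed by the firmware console, each followed by a hex value.
REG_RE = re.compile(r"\b(pc|lr|sp|cpsr|spsr|dfsr|dfar|r\d+) ([0-9a-f]+)\b")


def parse_trap(log: str) -> dict[str, int]:
    """Collect every 'name hexvalue' register pair found in the trap dump."""
    return {name: int(value, 16) for name, value in REG_RE.findall(log)}


regs = parse_trap(LOG)
# Assumption: Thumb code, so clearing bit 0 of lr gives the return address.
caller = regs["lr"] & ~1
# Observation only: in this dump the faulting address matches r1 + 4 * r5.
indexed = regs["r1"] + 4 * regs["r5"]
print(hex(regs["pc"]), hex(caller), hex(regs["dfar"]), hex(indexed))
```

If the relation holds in other dumps too, r5 would look like an out-of-range index into a table based at r1, which would fit the out-of-bounds read described above. That still needs to be confirmed against the disassembly.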
Any updates on this issue?
@conor-f Unfortunately no. The crash happens in the pktq_pdeq function, and the bug triggers when it tries to dequeue a malformed packet. Just skipping it doesn't seem to be enough, and I don't know where or how the packet is enqueued.
Oh :/ I presume there's no way to detect a malformed packet and not attempt to decode it through the patch as this is an actual firmware issue? Any ways to filter them at all? I'm having to delete and readd the interface, remove and re-insert the kernel patch or even reboot every few minutes on my pi 3B+
Sure there's a way, but it goes beyond my skills. To date I haven't found a way to prevent crashes or to recover w/o power cycling the chipset.
Did you try the newest Cypress firmware 7.45.189? I think I forgot to submit the wrapper.c file, but now everything should be available.
If dequeuing a frame fails, find the function that queues frames and hook it. To see who called this function, dump the link register.
Unfortunately, I assume that the brcmfmac driver is not cleanly reinitializing the chip or the SDIO driver, hence, a reboot is required to fix those issues.
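Until a hook on the enqueue path exists, the trap dumps in this thread already give a rough picture of the call chain. The sketch below is a host-side helper, not nexmon code. It assumes that the single-word `sp+offset value` lines are stack words the firmware considered interesting, and that odd values are Thumb return addresses. Both assumptions need checking against the disassembly.

```python
import re

# Stack lines copied from the first trap dump in this thread.
STACK_LINES = """\
brcmfmac: CONSOLE: 001213.387 sp+8 0019b159
brcmfmac: CONSOLE: 001213.387 sp+28 00007f43
brcmfmac: CONSOLE: 001213.387 sp+48 0019a4e5
brcmfmac: CONSOLE: 001213.387 sp+50 0019aba9
brcmfmac: CONSOLE: 001213.387 sp+60 00199c47
brcmfmac: CONSOLE: 001213.387 sp+6c 000001df
"""

# Matches only lines that carry exactly one 32-bit word after the offset.
ENTRY_RE = re.compile(r"sp\+([0-9a-f]+) ([0-9a-f]{8})\s*$", re.MULTILINE)


def return_candidates(log: str) -> list[tuple[int, int]]:
    """Return (stack offset, address) pairs for odd words, with bit 0 cleared."""
    candidates = []
    for offset, word in ENTRY_RE.findall(log):
        value = int(word, 16)
        if value & 1:  # assumption: odd word = Thumb return address
            candidates.append((int(offset, 16), value & ~1))
    return candidates


for offset, address in return_candidates(STACK_LINES):
    print(f"sp+{offset:#x}: {address:#x}")
```

Small values such as 0x1de are probably not code addresses, so the list is only a starting point for looking up callers in the IDB.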
-- Matthias Schulz, Secure Mobile Networking Lab - SEEMOO
I used 7_45_189 from master as it was about 2 weeks ago on kernel 4.19. Hmm doing that will be quite difficult for me but I'll try!
@matthiasseemoo I tried it and still got crashes, and yes, I should locate pktq_penq inside the ramfile. AFAIK, once it has crashed the chipset won't reload the ramfile (i.e. rmmod + modprobe won't fix it), but you can recover it w/o rebooting by resetting the bcm.
@conor-f could you check whether you get crashes with the interface in monitor mode but w/o radiotap (MONITOR_IEEE80211)?
Mine doesn't seem to crash in that mode, and curiously wl_monitor_radiotap pushes into an skb (the same data structure involved in the crash).
Interesting. Can you give me some details on how to do that so I can test it too?
Activate mon0 as usual and run
nexutil -m1
You can use tcpdump to see the incoming data
So I've had this running for the last half hour on a pi that would fail every 5 minutes or so and it looks steadier but still fails. Here is some of the dmesg output if it's insightful.
Also when I ran it without Radiotap headers I couldn't get any information that I'm interested in - namely MAC address and signal strength.
I tried with two identical setups (one w/ radiotap and the other w/o) in the same place. The first crashed as usual; the second sometimes froze, but it can be woken up by running wifi-related commands or by reloading brcmfmac.
@conor-f Yes, I know it's pretty useless w/o radiotap, I'm just trying to find where the bug is.
Okay cool. I didn't try to manually restart the interface or anything when I tested it without radiotap headers. Thank you for looking more into this!
Just to let you know, there is a difference in how frames are handled on the bcm43455 compared to other chips with nexmon patches. Normally, I created a new buffer for the monitor mode frame and copied the original frame into this buffer and then sent it up to the host. However, on the bcm43455 this approach did not work. Hence, I just take the received frame and send it up, which seems to sometimes lead to problems.
@matthiasseemoo I've seen that crashes happen only if the radiotap header is pushed into the skb: if there's only the pull to remove the PLCP header (as in 80211 mode), the chip just goes into a semi-hang state. Since crashes happen at the same time on different systems in the same place, I think it's likely caused by something unexpected that it receives, so I'm thinking about inspecting frames in wl_monitor_hook to see if there's something unusual. What do you think? Do you have any ideas for what I could try?
Just try it. You can also analyze the frame size; it could be that overly large frame buffers cause a problem on the SDIO link. Another option is not to attach the radiotap header to the original frame, but to send an additional frame containing only that header. Those frames would then need to be reassembled on the host, for example in libnexmon, by hooking the appropriate functions so that this stays transparent to applications.
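The host-side reassembly idea can be sketched roughly as follows. This is purely illustrative: the thread does not define how a header-only frame would be marked, so the `HEADER` and `DATA` tags and the `reassemble` helper are hypothetical, and a real version would live in libnexmon rather than in Python.

```python
from typing import Iterable, Iterator, Optional

# Hypothetical tags: the thread does not say how header-only frames
# would be distinguished from ordinary frames.
HEADER, DATA = "hdr", "data"


def reassemble(frames: Iterable[tuple[str, bytes]]) -> Iterator[bytes]:
    """Yield header + payload for each header frame followed by a data frame."""
    pending: Optional[bytes] = None
    for kind, payload in frames:
        if kind == HEADER:
            pending = payload        # remember the most recent header
        elif kind == DATA and pending is not None:
            yield pending + payload  # put the header in front of the frame
            pending = None
        # A data frame without a preceding header is dropped.


stream = [
    (HEADER, b"RT1"), (DATA, b"frameA"),
    (DATA, b"orphan"),
    (HEADER, b"RT2"), (DATA, b"frameB"),
]
print(list(reassemble(stream)))
```

Dropping orphaned data frames is one possible policy; passing them up without a radiotap header would be another.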
I collected a bunch of crash logs containing frames dump
brcmfmac: brcmf_sdio_hostmail: mailbox indicates firmware halted
brcmfmac: CONSOLE: 992, data:0021ff2e, len:138, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 004190.026 field0:16842761, field4:-270401992, data:0021ff2c, len:20, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 004190.033 field0:16842761, field4:-270401992, data:0021ff2c, len:20, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 004190.033 field0:16842761, field4:-270401992, data:0021ff2e, len:168, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 004190.034 field0:16842761, field4:-270401992, data:0021ff2e, len:138, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 004190.036 field0:16842761, field4:-270401992, data:0021ff2e, len:185, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 004190.041 field0:65644, field4:283328928, data:002341a0, len:136, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 004190.042
brcmfmac: CONSOLE: FWID 01-e1db26e2
brcmfmac: CONSOLE: flags 1
brcmfmac: CONSOLE: 004190.042
brcmfmac: CONSOLE: TRAP 4(25fd1c): pc 1a14be, lr 19b159, sp 25fd74, cpsr 19f, spsr 1bf
brcmfmac: CONSOLE: 004190.042 dfsr 8, dfar 29ca28
brcmfmac: CONSOLE: 004190.042 r0 234168, r1 25ccdc, r2 8, r3 25c0d0, r4 25c0d8, r5 ff53, r6 234168
brcmfmac: CONSOLE: 004190.042 r7 25c0d0, r8 0, r9 0, r10 2, r11 18, r12 0
brcmfmac: CONSOLE: 004190.042
brcmfmac: CONSOLE: sp+0 0025c0a8 00000008 0019b159 00000002
brcmfmac: CONSOLE: 004190.042 sp+10 0025c0a8 00000008 00234168 00000001
brcmfmac: CONSOLE:
brcmfmac: CONSOLE: 004190.042 sp+8 0019b159
brcmfmac: CONSOLE: 004190.042 sp+28 0019b2cb
brcmfmac: CONSOLE: 004190.042 sp+40 0019b4b5
brcmfmac: CONSOLE: 004190.042 sp+78 0019c81f
brcmfmac: CONSOLE: 004190.042 sp+ac 00000985
brcmfmac: CONSOLE: 004190.042 sp+c8 001ab4f5
brcmfmac: CONSOLE: 004190.042 sp+e4 001ef9e9
brcmfmac: CONSOLE: 004190.042 sp+ec 0000ffff
brcmfmac: CONSOLE: 004190.042 sp+128 00025b4b
brcmfmac: CONSOLE: 004190.042 sp+140 001b6b07
brcmfmac: CONSOLE: 004190.042 sp+158 001db367
brcmfmac: CONSOLE: 004190.042 sp+180 001c1c53
brcmfmac: CONSOLE: 004190.042 sp+1b8 001c87f1
brcmfmac: CONSOLE: 004190.042 sp+1e0 001a6939
brcmfmac: CONSOLE: 004190.042 sp+1f0 001a6975
brcmfmac: CONSOLE: 004190.042 sp+200 0019a4d9
brcmfmac: brcmf_sdio_hostmail: mailbox indicates firmware halted
brcmfmac: CONSOLE: ev3:0
brcmfmac: CONSOLE: 000006.742 field0:16842858, field4:-270308856, data:00236afe, len:185, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000006.745 field0:16842848, field4:-270289016, data:0023b87e, len:188, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000006.754 field0:16842776, field4:-270431752, data:00218aee, len:168, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000006.767 field0:16842829, field4:-270251320, data:00244bbe, len:189, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000006.770 field0:65643, field4:216220096, data:002341c0, len:104, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000006.770
brcmfmac: CONSOLE: FWID 01-e1db26e2
brcmfmac: CONSOLE: flags 1
brcmfmac: CONSOLE: 000006.770
brcmfmac: CONSOLE: TRAP 4(25fd1c): pc 1a14be, lr 19b159, sp 25fd74, cpsr 19f, spsr 1bf
brcmfmac: CONSOLE: 000006.770 dfsr 8, dfar 29caa8
brcmfmac: CONSOLE: 000006.770 r0 234188, r1 25ccdc, r2 8, r3 25c0d0, r4 25c0d8, r5 ff73, r6 234188
brcmfmac: CONSOLE: 000006.771 r7 25c0d0, r8 0, r9 0, r10 2, r11 18, r12 0
brcmfmac: CONSOLE: 000006.771
brcmfmac: CONSOLE: sp+0 0025c0a8 00000008 0019b159 00000002
brcmfmac: CONSOLE: 000006.771 sp+10 0025c0a8 00000008 00234188 00000001
brcmfmac: CONSOLE:
brcmfmac: CONSOLE: 000006.771 sp+8 0019b159
brcmfmac: CONSOLE: 000006.771 sp+28 0019b2cb
brcmfmac: CONSOLE: 000006.771 sp+40 0019b4b5
brcmfmac: CONSOLE: 000006.771 sp+78 0019c81f
brcmfmac: CONSOLE: 000006.771 sp+ac 00000985
brcmfmac: CONSOLE: 000006.771 sp+c8 001ab4f5
brcmfmac: CONSOLE: 000006.771 sp+d8 00001001
brcmfmac: CONSOLE: 000006.771 sp+ec 0000ffff
brcmfmac: CONSOLE: 000006.771 sp+128 00025b4b
brcmfmac: CONSOLE: 000006.771 sp+140 001b6b07
brcmfmac: CONSOLE: 000006.771 sp+158 001db367
brcmfmac: CONSOLE: 000006.771 sp+180 001c1c53
brcmfmac: CONSOLE: 000006.771 sp+1b8 001c87f1
brcmfmac: CONSOLE: 000006.771 sp+1e0 001a6939
brcmfmac: CONSOLE: 000006.771 sp+1f0 001a6975
brcmfmac: CONSOLE: 000006.771 sp+200 0019a4d9
brcmfmac: brcmf_sdio_hostmail: mailbox indicates firmware halted
brcmfmac: CONSOLE: ata:0023a13c, len:85, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000019.239 field0:16842783, field4:-270445640, data:002154ac, len:20, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000019.246 field0:16842760, field4:-268856420, data:00199492, len:189, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000019.257 field0:16842781, field4:-270441672, data:0021642e, len:189, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000019.267 field0:16842857, field4:-270306872, data:002372be, len:188, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000019.280 field0:16842763, field4:-270405960, data:0021efae, len:191, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000019.281 field0:16842782, field4:-270443656, data:00215c6e, len:191, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000019.283 field0:65643, field4:232997304, data:002341b8, len:112, fieldE:0, field10:0, next:0, prev:0, prev2:0, prev3:0
brcmfmac: CONSOLE: 000019.283
brcmfmac: CONSOLE: FWID 01-e1db26e2
brcmfmac: CONSOLE: flags 1
brcmfmac: CONSOLE: 000019.283
brcmfmac: CONSOLE: TRAP 4(25fd1c): pc 1a14be, lr 19b159, sp 25fd74, cpsr 19f, spsr 1bf
brcmfmac: CONSOLE: 000019.283 dfsr 8, dfar 29ca88
brcmfmac: CONSOLE: 000019.283 r0 234180, r1 25ccdc, r2 8, r3 25c0d0, r4 25c0d8, r5 ff6b, r6 234180
brcmfmac: CONSOLE: 000019.283 r7 25c0d0, r8 0, r9 0, r10 2, r11 18, r12 0
brcmfmac: CONSOLE: 000019.283
brcmfmac: CONSOLE: sp+0 0025c0a8 00000008 0019b159 00000002
brcmfmac: CONSOLE: 000019.283 sp+10 0025c0a8 00000008 00234180 00000001
brcmfmac: CONSOLE:
brcmfmac: CONSOLE: 000019.283 sp+8 0019b159
brcmfmac: CONSOLE: 000019.283 sp+28 0019b2cb
brcmfmac: CONSOLE: 000019.283 sp+40 0019b4b5
brcmfmac: CONSOLE: 000019.283 sp+78 0019c81f
brcmfmac: CONSOLE: 000019.283 sp+ac 00000985
brcmfmac: CONSOLE: 000019.283 sp+c8 001ab4f5
brcmfmac: CONSOLE: 000019.283 sp+d8 00001001
brcmfmac: CONSOLE: 000019.283 sp+ec 0000ffff
brcmfmac: CONSOLE: 000019.283 sp+128 00025b4b
brcmfmac: CONSOLE: 000019.283 sp+140 001b6b07
brcmfmac: CONSOLE: 000019.283 sp+158 001db367
brcmfmac: CONSOLE: 000019.283 sp+180 001c1c53
brcmfmac: CONSOLE: 000019.283 sp+1b8 001c87f1
brcmfmac: CONSOLE: 000019.283 sp+1e0 001a6939
brcmfmac: CONSOLE: 000019.283 sp+1f0 001a6975
brcmfmac: CONSOLE: 000019.283 sp+200 0019a4d9
As you can see, the last frame before each trap looks strange (btw I have no idea where it gets corrupted), so I'm just ignoring frames with an unusual field0 value. The crashes disappeared, but sometimes the interface hangs or drops most frames.
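To make "unusual field0" concrete, here is a host-side sketch that flags such frames in the console output. The meaning of field0 is not known, and the exact criterion used in the firmware patch is not stated in this thread. The sketch only assumes what the dumps above show: ordinary frames have 0x0101 in the upper 16 bits of field0, while the last frame before each trap has 0x0001.

```python
import re

# Shortened console lines taken from the crash logs above.
LINES = [
    "004190.036 field0:16842761, field4:-270401992, data:0021ff2e, len:185",
    "004190.041 field0:65644, field4:283328928, data:002341a0, len:136",
    "000006.767 field0:16842829, field4:-270251320, data:00244bbe, len:189",
    "000006.770 field0:65643, field4:216220096, data:002341c0, len:104",
]

FIELD0_RE = re.compile(r"field0:(-?\d+)")
# Assumption inferred from the dumps, not from firmware documentation.
EXPECTED_HIGH = 0x0101


def is_unusual(line: str) -> bool:
    """True when the upper 16 bits of field0 differ from the usual value."""
    match = FIELD0_RE.search(line)
    if match is None:
        return False
    field0 = int(match.group(1)) & 0xFFFFFFFF  # the console prints it as signed
    return (field0 >> 16) != EXPECTED_HIGH


for line in LINES:
    print("UNUSUAL" if is_unusual(line) else "ok     ", line)
```

In these samples the flagged frames are also the only ones whose data pointer is near 0x002341xx, which may be worth comparing across more logs.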
hmm super weird! Is there a patch or anything which I could use to test out that coarse filtering? I wrote some tooling around my nexmon stuff which checks if the interface is still receiving any packets and if not it just tears down the interface and recreates it. It's a pretty quick process but still needless and feels hacky so I really appreciate the effort to find the root cause from everyone :)
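A watchdog of the kind described above could look roughly like this. It is a sketch under assumptions, not the tooling mentioned in the comment: it polls the standard Linux sysfs counter `/sys/class/net/<iface>/statistics/rx_packets`, and the `recreate` callback (whatever tears down and re-adds mon0) is left to the caller.

```python
import time
from pathlib import Path
from typing import Callable


def read_rx_packets(iface: str) -> int:
    """Read the kernel's received-packet counter for one interface."""
    path = Path(f"/sys/class/net/{iface}/statistics/rx_packets")
    return int(path.read_text())


def stalled(samples: list[int], window: int) -> bool:
    """True when the last `window` polling intervals saw no new packets."""
    recent = samples[-(window + 1):]
    return len(recent) == window + 1 and recent[0] == recent[-1]


def watch(iface: str, recreate: Callable[[str], None],
          interval: float = 5.0, window: int = 3) -> None:
    """Poll forever and call `recreate` whenever the counter stops moving."""
    samples: list[int] = []
    while True:
        samples.append(read_rx_packets(iface))
        del samples[:-(window + 1)]  # keep only what `stalled` needs
        if stalled(samples, window):
            recreate(iface)          # hypothetical hook supplied by the caller
            samples.clear()          # the counter restarts with a new interface
        time.sleep(interval)
```

Calling `watch("mon0", my_recreate)` would then poll every five seconds. In a quiet environment a short window can cause false alarms, so the interval and window need tuning per site.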