seemoo-lab / nexmon_csi

Channel State Information Extraction on Various Broadcom Wi-Fi Chips
305 stars 121 forks source link

Inconsistent CSI collection in Raspberry Pi 4B, kernel 5.4 and firmware 7.45.189 #177

Closed maurovm closed 3 years ago

maurovm commented 3 years ago

I want to collect CSI packets at regular intervals. However, I found several bottlenecks that I hope you can help me.

Wifi router:

Traffic generation:

The computer that generates packets, let's call it "PI_TX" here for simplicity, has the following specs:

I had not have any issues following the instructions and install the required software. After rebooting, the computer loads the correct module, as the output ofd the following command:

$ dmesg | grep -i brcmfmac

Show the module version 7.45.189:

...
[    5.185108] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4345/6 wl0: Feb  2 2021 11:51:27 version 7.45.189 (nexmon.org: 2.2.2-329-g5dbc-2) FWID 01-e1db26e2
...

The machine can successfully connect to the wifi router, ping any machine in the network, browse the internet, etc.

The output of the "iw dev" command is (I replaced my MAC address and name of my network):

phy#0
    Unnamed/non-netdev interface
        wdev 0x2
        addr MY_MAC_ADDRESS
        type P2P-device
        txpower 31.00 dBm
    Interface wlan0
        ifindex 4
        wdev 0x1
        addr MY_WIFI_ROUTER_ADDRESS
        ssid MY_NETWORK_SSID
        type managed
        channel 36 (5180 MHz), width: 80 MHz, center1: 5210 MHz
        txpower 31.00 dBm

Which shows the correct channel and bandwidth.

To generate traffic, I ping either the wifi router or another machine in the network with:

$ sudo ping -i 0.5 IP_OF_TARGET_MACHINE

or:

iperf3 -i 0.5 -u -t 0 -c IP_OF_TARGET_MACHINE

Both of the commands above work

CSI collection:

The computer that will collect CSI packets, let's call it "PI_RX" here for simplicity, has the following specs:

After each reboot, run the following command to monitor the network:

$ ifconfig wlan0 up
$ makecsiparams -c 36/80 -C 1 -N 1 -m mac_of_PI_TX -b 0x88
$ nexutil -Iwlan0 -s500 -b -l34 -vTHE_STRING_FROM_MAKE_CSI_PARAM_ABOVE

I veriy that the wifi card was correctly set. The command:

$ nexutil -k

gives me the result:

chanspec: 0xe02a, 36/80

Enable monitoring mode with:

$ iw phy `iw dev wlan0 info | gawk '/wiphy/ {printf "phy" $2}'` interface add mon0 type monitor
$ ifconfig mon0 up

The output of the "iw dev" command is (I replaced my MAC address and name of my network):

phy#0
    Interface mon0
        ifindex 4
        wdev 0x2
        addr dc:a6:32:40:52:ae
        type monitor
        channel 36 (5180 MHz), width: 80 MHz, center1: 5210 MHz
        txpower 31.00 dBm
    Interface wlan0
        ifindex 3
        wdev 0x1
        addr dc:a6:32:40:52:ae
        type managed
        channel 36 (5180 MHz), width: 80 MHz, center1: 5210 MHz
        txpower 31.00 dBm

I attempted to Collect CSI traffic with the following command:

$ tcpdump -i wlan0 dst port 5500

Description of the problem:

There are some inconsistent behaviour when trying to Collect CSI. I have read the other issues raised in the "nexmon" and "nexmon_csi" repositories but could not solve my problems.

Problem 1: After I start the "tcpdump" command in the PI_RX machine, I have to run (in the PI_TX machine):

Problem 2: The packages I receive in the PI_RX machine are in the format:

TIME_STAMP IP 10.10.10.10.5500 > 255.255.255.255.5500: UDP, length 1042

Is this correct?, I would have expected to see the IP address of either the PI_TX or target machine. Several of the packages have empty contents.

Problem 3: Several packages are lost. The number of lost packages is not consistent, even when I attempt to run the experiments under similar conditions. An example of the output of tcpdump is:

1453 packets captured
1774 packets received by filter
313 packets dropped by kernel
59 packets dropped by interface

I am also surprised about the differences between packets captured and packets received. I only had the wifi router, the PI_RX and PI_TX machines connected in an empty room that resembles a Faraday cage. I attempted to stop any service, daemons or other software that is not needed in the Rasberry Pi. Maybe there are other kernel and network traffic generated?

Problem 4: After collecting CSI for a few rounds, if I attempt to run any command that checks the state the wlan0 interface (such as "iw dev" or "nexutil -k",...), the commands take a long time to finish and the wifi configuration get's broken. For example, "iw dev" results in (note that I replaced my MAC address intentionally):

phy#0
    Interface mon0
        ifindex 4
        wdev 0x2
        addr MY_MAC_ADDRESS
        type monitor
    Interface wlan0
        ifindex 3
        wdev 0x1
        addr MY_MAC_ADDRESS
        type managed

The command "nexutil -k" results in

chanspec: 0x6863, 85/160

"dmesg" has some logs similar to:

...
[ 6019, 996760] ieee80211 phy0: brcmf_cfg80211_get_channel: chanspec failed (-110)
...
[ 6022,556790] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
...

The output of forensics is:

```shell
$ cat /sys/kernel/debug/ieee80211/$(iw dev wlan0 info | gawk '/wiphy/ {printf "phy" $2}')/forensics

dongle trap info: type 0x4 @ epc 0x001a9832
  cpsr 0x0000019f spsr 0x000001bf sp 0x0025fd70
  lr   0x001a9821 pc   0x001a9832 offset 0x25fd18
  r0   0x0024d64c r1   0x00000000 r2 0x00008b3c r3 0x0025ccdc
  r4   0x00235b14 r5   0x00229d24 r6 0x00000000 r7 0x00000040

000000.205 wl0: wlc_channels_commit: no valid channel for "#n" nbands 2 bandlocked 0
000000.206 wl0: Broadcom BCM4345 802.11 Wireless Controller 7.45.189 (r714228 CY)
000000.206 TCAM: 256 used: 237 exceed:0
000000.207 reclaim section 1: Returned 118076 bytes to the heap
000000.207 reclaim section 4: Returned 44 bytes to the heap
000000.207 sdpcmd_dpc: Enable
000000.233 wl0: wlc_iovar_op: txbf BCME -23 (Unsupported)
000000.235 wl0: unable to find iovar "rsdb_mode"
000000.235 wl0: wlc_iovar_op: rsdb_mode BCME -23 (Unsupported)
000000.236 wl0: unable to find iovar "tdls_enable"
000000.236 wl0: wlc_iovar_op: tdls_enable BCME -23 (Unsupported)
004681.992 wl0: unable to find iovar "toe_ol"
004681.992 wl0: wlc_iovar_op: toe_ol BCME -23 (Unsupported)
004681.993 wl0: wl_open
004682.004 wl0: wlc_phy_set_regtbl_on_femctrl: FIXME bt_coex
004682.021 wl0: unable to find iovar "nd_hostip_clear"
004682.021 wl0: wlc_iovar_op: nd_hostip_clear BCME -23 (Unsupported)
004715.721 wl0: wlc_phy_set_regtbl_on_femctrl: FIXME bt_coex
004746.685 wl0: unable to find iovar "toe_ol"
004746.685 wl0: wlc_iovar_op: toe_ol BCME -23 (Unsupported)
004924.788 unexpected new csi, clearing old
005113.136 
FWID 01-e1db26e2
flags 1
005113.136 
TRAP 4(25fd18): pc 1a9832, lr 1a9821, sp 25fd70, cpsr 19f, spsr 1bf
005113.136   dfsr 409, dfar 27f9cc
005113.136   r0 24d64c, r1 0, r2 8b3c, r3 25ccdc, r4 235b14, r5 229d24, r6 0
005113.136   r7 40, r8 0, r9 0, r10 0, r11 c, r12 235c30
005113.136 
   sp+0 00000025 00242b78 ffffe800 00000001
005113.136   sp+10 0025d560 000040d7 002363a8 0000000f

005113.136 sp+14 000040d7
005113.136 sp+44 001cd627
005113.136 sp+54 001a34f7
005113.136 sp+64 001a6f05
005113.136 sp+a4 001a72bd
005113.136 sp+cc 001ab4f5
005113.136 sp+f0 0000ffff
005113.136 sp+12c 00025b4b
005113.136 sp+144 001b6b07
005113.136 sp+184 001c1c47
005113.136 sp+1bc 001c87f1
005113.136 sp+1e4 001a6939
005113.136 sp+1f4 001a6975
005113.136 sp+204 0019a4d9
005113.136 sp+20c 0019aba9
005113.136 sp+21c 00199c47

This problems occurs very often. The only way to solve it is to reboot the machine.

Thank you for all your help

maurovm commented 3 years ago

Adding to the description of my problem. From time to time, CSI collection could not even start. When executing "tcpdump", I get the following kernel error (shown by dmesg):

[  156.362831] device wlan0 entered promiscuous mode
[  158.872825] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[  158.876305] ieee80211 phy0: _brcmf_set_multicast_list: Setting mcast_list failed, -110
[  161.432816] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[  161.436293] ieee80211 phy0: _brcmf_set_multicast_list: Setting allmulti failed, -110
[  163.992809] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[  163.996285] ieee80211 phy0: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
[  166.552800] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[  169.112819] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout

Any ideas why it might happened?

maurovm commented 3 years ago

Other times, I can collect CSI for a few minutes. Afterwards, it crashes with the following kernel error (as reported by dmesg):

[  149.307285] device wlan0 entered promiscuous mode
[  188.962074] ieee80211 phy0: brcmf_fw_crashed: Firmware has halted or crashed
[  264.287049] device wlan0 left promiscuous mode
[  266.797301] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[  266.800779] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
jlinktu commented 3 years ago

Problem 1: Not sure about this one. Btw. there is no need to use the nexmon firmware on PI_TX if you are just connecting to an AP and use ping or iperf to generate traffic.

Problem 2: This is correct. Please have a second look at the readme.

Problem 3: Traffic generated by ping or iperf does not necessarily translate to constant Wi-Fi traffic. Have a look at the tcpdump manual for explanations of those numbers and options to control them.

Problem 4: This might be related to AMSDUs, please try this fix and report back.

maurovm commented 3 years ago

Dear Jakob,

Thanks for the suggestions, they helped a lot. Please see below my answers:

Problem 1: Not sure about this one. Btw. there is no need to use the nexmon firmware on PI_TX if you are just connecting to an AP and use ping or iperf to generate traffic.

The problem is that my MacBook does not tag frames with 0x88. Therefore, if I do:

$ makecsiparams -c 36/80 -C 1 -N 1 -m MAC_ADDRESS

I get traffic from the laptop. However, I need to add "-b 0x88" if I use a Raspberry Pi to generate traffic.

Problem solved.

Problem 2: This is correct. Please have a second look at the readme.

Thank you, problem solved.

Problem 3: Traffic generated by ping or iperf does not necessarily translate to constant Wi-Fi traffic. Have a look at the tcpdump manual for explanations of those numbers and options to control them.

Got it, problem solved.

Problem 4: This might be related to AMSDUs, please try this fix and report back.

The fix you pointed out (by mzakharo) worked really well in my setup. I am now able to collect CSI data more consistently. I started/stoped CSI collection several times and left it for 30 minutes without any issues so far. Dropped packages are far less than what I had before too, which is wonderful. However, I noticed a kernel error reporting an incorrect file "brcmfmac43455-sdio.raspberrypi,4-model-b.txt":

[    4.784266] brcmfmac: F1 signature read @0x18000000=0x15264345
[    4.797892] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43455-sdio for chip BCM4345/6
[    4.798461] usbcore: registered new interface driver brcmfmac
[    4.826801] brcmfmac mmc1:0001:1: Direct firmware load for brcm/brcmfmac43455-sdio.raspberrypi,4-model-b.txt failed with error -2
[    5.213790] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43455-sdio for chip BCM4345/6
[    5.224916] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4345/6 wl0: Feb  3 2021 11:06:42 version 7.45.189 (nexmon.org/csi: 55dc-2)
[    5.270728] ieee80211 phy0: brcmf_construct_chaninfo: Ignoring unexpected firmware channel 106

The above error does not seem to affect data collection, but I am not sure. Do you have some pointers on how to hunt for what might be the causes?

jlinktu commented 3 years ago

That is the brcmfmac driver complaining about a missing nvram config file named brcmfmac43455-sdio.raspberrypi,4-model-b.txt in /lib/firmware/brcm/. As you already figured it might be fine to just ignore it. If you want to fix it, you can try renaming /lib/firmware/brcm/brcmfmac43455-sdio.txt to /lib/firmware/brcm/brcmfmac43455-sdio.raspberrypi,4-model-b.txt.

maurovm commented 3 years ago

The file "/lib/firmware/brcm/brcmfmac43455-sdio.txt" looks to have important NVRAM flags. Reading the error logs and differences when I do and don't rename the file, it seems when it fails to find "brcmfmac43455-sdio.raspberrypi,4-model-b.txt" it retries with "brcmfmac43455-sdio.txt". Although I cannot verify this.

We are preparing to carry a lengthy research study. Do you have any pointers on how to begin debugging this?. One of my students will work on getting prepared the hardware to record data and I would like him to spend some time verifying this problem.

jlinktu commented 3 years ago

It seems you can find the correct brcmfmac43455-sdio.raspberrypi,4-model-b.txt file here: https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/brcm.

If no nvram file is loaded the firmware has fallback values that it uses instead.

maurovm commented 3 years ago

Thanks Jakob for all your help, this really helps. I am now closing this issue