raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/

[brcmfmac] connman/iwd unreliable rescan of SSIDs #3898

Open gearhead opened 4 years ago

gearhead commented 4 years ago

Is this the right place for my bug report? This repository contains the Linux kernel used on the Raspberry Pi. If you believe that the issue you are seeing is kernel-related, this is the right place. If not, we have other repositories for the GPU firmware at github.com/raspberrypi/firmware and Raspberry Pi userland applications at github.com/raspberrypi/userland. If you have problems with the Raspbian distribution packages, report them in the github.com/RPi-Distro/repo. If you simply have a question, then the Raspberry Pi forums are the best place to ask it.

This kind of covers both the kernel and firmware, but is mostly focused on the driver part.

Describe the bug

When using connman and iwd on my RPi, the Pi only sometimes reconnects after the SSID disappears and then reappears (for example, after a router reboot). The behaviour is inconsistent.

To reproduce

1. Build the kernel with the iwd flags set: https://iwd.wiki.kernel.org/gettingstarted
2. Install connman and iwd, and uninstall wpa_supplicant!!
3. Create a config file in /var/lib/connman for your local SSID: https://www.mankier.com/5/connman-service.config
4. Connect to the SSID.
5. Reboot the router, or turn the radio on the router off and then on, and see if the RPi reconnects.

Expected behaviour

After the radio comes back online, it should connect pretty quickly.

Actual behaviour

Intermittent behaviour. Sometimes it re-scans and connects, sometimes it just sits and never re-scans. If it re-scans, it will reconnect.

System

Currently Arch Linux running kernel 5.4.70. I am in the process of setting up a separate card with Raspi OS Lite and building the kernel with the proper configs set. I will test there as well, as I believe that would be the preferred test platform.

Logs

Nothing shows in the log. It just doesn't always rescan. From discussion with the connman and iwd developers, this may be an issue with the brcmfmac driver.

Additional context

There is a lot of discussion of the brcmfmac firmware, and of different versions having different capabilities. I have hacked the latest Broadcom firmware onto my Pis from the Cypress update from 20200625:

Originally:
brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4345/6 wl0: Feb 27 2018 03:15:32 version 7.45.154 (r684107 CY) FWID 01-4fbe0b04

Updated to:
brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4345/6 wl0: Sep 18 2020 02:27:58 version 7.45.221 (3a6d3a0 CY) FWID 01-bbd9282b

It will actually rescan once, but then it stalls and will not rescan again. I wonder whether both the firmware and the driver from Cypress are needed for this to function correctly. The Cypress update package has the firmware and the driver. I have seen many discussions online about updating the firmware, but none about updating the driver. I spent a few hours trying to get the driver to build, either as a module by itself or with the full kernel package, and it fails every time. I am certain that I am not patching properly with all the files and configs it needs.

https://community.cypress.com/docs/DOC-21490

The zip file contains the firmware and also a 5.4.18 module backport. When I compare the files in the brcmfmac directory, there are a lot of differences from what is in the 5.4.70 kernel as well as the current upstream kernel. It is as if none of this has been incorporated into the kernel tree at all. I have attempted a patch with the files in drivers/net/wireless/broadcom/brcm80211/brcmfmac and other files from that tree, and the build gets this far and stops:

$ make M=drivers/net/wireless/broadcom/brcm80211
  CC [M]  drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.o
In file included from drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c:12:
./include/net/cfg80211.h: In function ‘wiphy_net’:
./include/net/cfg80211.h:4693:9: error: implicit declaration of function ‘possible_read_pnet’ [-Werror=implicit-function-declaration]
 4693 |  return possible_read_pnet(&wiphy->_net);
      |         ^~~~~~
./include/net/cfg80211.h:4693:9: warning: returning ‘int’ from a function with return type ‘struct net *’ makes pointer from integer without a cast [-Wint-conversion]
 4693 |  return possible_read_pnet(&wiphy->_net);
      |         ^~~~~~~~
./include/net/cfg80211.h: In function ‘wiphy_net_set’:
./include/net/cfg80211.h:4698:2: error: implicit declaration of function ‘possible_write_pnet’ [-Werror=implicit-function-declaration]
 4698 |  possible_write_pnet(&wiphy->_net, net);
      |  ^~~~~~~
cc1: some warnings being treated as errors
make[2]: [scripts/Makefile.build:266: drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.o] Error 1
make[1]: [scripts/Makefile.build:500: drivers/net/wireless/broadcom/brcm80211/brcmfmac] Error 2
make: *** [Makefile:1732: drivers/net/wireless/broadcom/brcm80211] Error 2

With a bit of help, I may get it to build and can then see whether it behaves any differently.

gearhead commented 4 years ago

Just tested connman/iwd connectivity and response on a RaspiOS 64 bit image. Works flawlessly with iwd 0.14 and connman 1.36. Still, the brcmfmac driver is not 'up to date' with the current cypress source code and I still want to get that built. Any ideas on how to get the build to go any further?

tvanriper commented 1 year ago

I realize this is a couple of years late, but I hope you haven't given up hope on this. I'd like to see WiFi stabilized for the Pi.

To fix those 'implicit declaration of function' errors, I'd check whether ./include/net/cfg80211.h includes 'net_namespace.h'. That seems to be the header that provides the declaration for possible_write_pnet(), and having it visible should clear up the 'implicit-function-declaration' problem.

(Source: https://kernel.googlesource.com/pub/scm/linux/kernel/git/iwlwifi/backport-iwlwifi/+/release/LinuxCore12/backport-include/net/net_namespace.h)

If you can't find that header, you could perhaps copy the one I linked above and use it. But, to be clear, I'm completely guessing about this... it's a fairly educated guess, but it's still a guess. If the header is already included, something might be preventing the contents of the header from reaching this bit of code (maybe an ifndef or ifdef filtering it out, or something similar).

If you're trying to make this work with a kernel >= 5.x, you could likely get away with putting the following two lines at the top of cfg80211.h:

#define possible_write_pnet(pnet, net) write_pnet(pnet, net)
#define possible_read_pnet(pnet) read_pnet(pnet)
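To show what those two macros would do, here is a userspace sketch that compiles on its own. It is only an illustration under assumptions: struct net, possible_net_t, struct wiphy, read_pnet() and write_pnet() are simplified stand-ins written for this example (the real ones come from the kernel headers and handle more cases), and the #ifndef guards are an addition so the shim would not clash with a definition coming from elsewhere.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for kernel types (assumptions for
 * illustration only; the real definitions live in the kernel headers). */
struct net {
	int id;
};

typedef struct {
	struct net *net;
} possible_net_t;

/* Mock of the native helpers that the macros forward to. */
static inline void write_pnet(possible_net_t *pnet, struct net *net)
{
	pnet->net = net;
}

static inline struct net *read_pnet(const possible_net_t *pnet)
{
	return pnet->net;
}

/* Compat shim: map the backport-style names onto the native helpers,
 * but only if nothing else has already defined them. */
#ifndef possible_write_pnet
#define possible_write_pnet(pnet, net) write_pnet(pnet, net)
#endif
#ifndef possible_read_pnet
#define possible_read_pnet(pnet) read_pnet(pnet)
#endif

/* Trimmed-down mock of struct wiphy holding only the field used here. */
struct wiphy {
	possible_net_t _net;
};

/* Same shape as the two functions named in the build errors. */
static inline struct net *wiphy_net(struct wiphy *wiphy)
{
	return possible_read_pnet(&wiphy->_net);
}

static inline void wiphy_net_set(struct wiphy *wiphy, struct net *net)
{
	possible_write_pnet(&wiphy->_net, net);
}
```

With the macros in place, both calls resolve to functions that have a visible declaration, so wiphy_net() returns a struct net * directly and no cast is involved.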

But, again, completely guessing. I'd try these myself if I had the environment already set up, as I'd really like to see this working.

Maybe, for the 'makes pointer from integer without a cast' warning, you could replace:

return possible_read_pnet(&wiphy->_net);

with:

return (int) possible_read_pnet(&wiphy->_net);

But, I'd want to know why it's returning an int there instead of a pointer, to be certain this isn't going to lead to other problems. If the int is used primarily to determine if the function worked (returns 0, or returns anything), then it would make sense.
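On the "why an int" question: the int most likely comes from the compiler, not from the function. When GCC sees a call to a function with no declaration in scope, it assumes the function returns int, so this warning is probably a follow-on from the implicit-declaration error and should go away once the declaration is visible. An (int) cast would not bring the pointer back, and on a 64-bit kernel it could cut the address short. A small userspace sketch of that truncation (the helper name is hypothetical, and unsigned int is used instead of int to keep the conversion well defined):

```c
#include <stdint.h>

/* Returns 1 if an address value is unchanged after being squeezed
 * through an int-sized value, 0 if bits were lost along the way. */
static int survives_int_cast(uintptr_t addr)
{
	unsigned int narrowed = (unsigned int)addr;

	return (uintptr_t)narrowed == addr;
}
```

Small values survive the round trip on any platform, but on a typical 64-bit build, where pointers are wider than int, an address with bits set above the low 32 does not.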