Closed simonvanderveldt closed 5 years ago
One observation is that there are several drivers one can use for the chipset used in the usb nub (Realtek 8192cu).
rtl8192cu
: The upstream kernel driver, mac80211 based. See https://wireless.wiki.kernel.org/en/users/drivers/rtl819x for more info.8192cu
: A driver from realtek themselves containing basically a modified copy of kernel infrastructure code from a long time ago which seems to have gone without updates from realtek for quiet a while. This is one has been added to the Raspberry Linux repo and is used by default on Raspberry Pis. It is being updated by contributors to the Raspberry Linux repo.
All of the three drivers listed here are included in their kernel config but the rtl8192cu
and the rtl8xxx
drivers are blacklisted via /etc/modprobe.d/blacklist-rtl8192cu.conf
and /etc/modprobe.d/blacklist-rtl8xxxu.conf
.
Note that this driver is enabled using the same kernel CONFIG_ flag af the above rtl8192cu
driver :xrtl8xxxu
: Not really sure about this one, I think this is eventually(?) supposed to replace a lot of drivers for realtek devices but it doesn't really seem to be used a lot/ready yet https://wireless.wiki.kernel.org/en/users/drivers/rtl819x
For reference: In the readme of the repo that's currently the main place where realtek's 8192cu
driver is mainted (https://github.com/pvaret/rtl8192cu-fixes) the suggestion is to the rtl8xxxu
driver over the 8192cu
driver.I'd normally choose to use the upstream kernel driver (rtl8192cu
) but there has been some older/ongoing discussion on the Raspberry Pi kernel about which to use with some people reporting issues with the rtl8192cu
driver. See https://github.com/raspberrypi/linux/pull/1468 and https://github.com/raspberrypi/linux/pull/1525 for examples.
I don't really care about the LED, but the issues mentioned here would at the very least need some decent testing to check if they have been fixed in the meantime https://github.com/raspberrypi/linux/pull/1525#issuecomment-225876599
It has received some commits since that report https://github.com/raspberrypi/linux/commits/rpi-4.19.y/drivers/net/wireless/realtek/rtlwifi/rtl8192cu
Since we haven't actively changed anything ourselves we're currently using the 8192cu
driver.
One thing I've observed is that in cases where the wifi didn't work after a restart/boot was that for currently unknown reasons the wireless connection gets soft blocked by rfkill
we@norns:~$ rfkill list
2: phy2: Wireless LAN
Soft blocked: yes
Hard blocked: no
Unblocking rfkill unblock all
and enabling the device using NetworkManager nmcli radio wifi on
makes it work again.
Also of interest is the systemd-rfkill.service
systemd service which saves the rfkill state on state change and restores it on boot. Not sure if this fails to restore it in the failure cases or if something else is broken.
I also tried rfkill.default_state=1
as kernel command line option, this should default all radios to unblocked (see https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt) but that didn't seem to do anything. Have to investigate that a bit more.
Just dropping some notes here, will investigate more tomorrow:
One thing that seems of is the states as written by systemd-rfkill.service
, this might or might now at least partially explain why different USB ports sometimes give different results
we@norns:~$ cat /var/lib/systemd/rfkill/platform-3f980000.usb-usb-0\:1.1\:1.0\:wlan
0
we@norns:~$ cat /var/lib/systemd/rfkill/platform-3f980000.usb-usb-0\:1.2\:1.0\:wlan
0
we@norns:~$ cat /var/lib/systemd/rfkill/platform-3f980000.usb-usb-0\:1.3\:1.0\:wlan
1
we@norns:~$ cat /var/lib/systemd/rfkill/platform-3f980000.usb-usb-0\:1.4\:1.0\:wlan
1
These files are used by the systemd-rfkill
service to restore the rfkill state.
1.2 is the USB port next to the UART on my prototype PCB, 1.3 the middle one and 1.4 the USB port next to the audio jacks.
Interestingly enough in all 3 ports it's marked as soft blocked after startup.
On another layer/item of the whole networking stack NetworkManager also has a state file and that also has wireless disabled
we@norns:~$ cat /var/lib/NetworkManager/NetworkManager.state
[main]
NetworkingEnabled=true
WirelessEnabled=false
WWANEnabled=true
All of this is after a clean/first boot using the 190303 (2.0 beta) image.
OK, some low-level details just to understand the whole stack:
rfkill
starts off with all radios unblocked by default because the rfkill.default_state
module parameter defaults to 1
in the code. 1
means unblocked
rfkill.default_state= 0 "airplane mode". All wifi, bluetooth, wimax, gps, fm, etc. communication is blocked by default. 1 Unblocked.
On init this value is then used to initialize all rfkill blocks to unblocked.
Once systemd starts it will also start systemd-rfkill
(via socket activiation using systemd-rfkill.socket
). As mentioned before this restores the rfkill state of individual devices from /var/lib/systemd/rfkill
.
If the last state before shutdown was blocked it will be restored to blocked, if it was unblocked it will be restored to unblocked.
NetworkManager pretty much ignores the softblock and purely acts based on it's desired state for the wifi. The state that's used on boot is the state as defined in /var/lib/NetworkManager/NetworkManager.state
, with the WirelessEnabled
field being the relevant one. It's state (true
for on, false
for off) will make NetworkManager change the rfkill state as well. Example log with WirelessEnabled=false
:
we@norns:~$ journalctl | grep -i kill
Mar 08 23:14:54 norns systemd[1]: Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
Mar 08 23:14:55 norns systemd[1]: Starting Load/Save RF Kill Switch Status...
Mar 08 23:14:55 norns systemd[1]: Started Load/Save RF Kill Switch Status.
Mar 08 23:14:56 norns NetworkManager[324]: <info> [1552061696.5589] rfkill0: found WiFi radio killswitch (at /sys/devices/platform/soc/3f980000.usb/usb1/1-1/1-1.4/1-1.4:1.0/ieee80211/phy0/rfkill0) (driver rtl8192cu)
Mar 08 23:14:57 norns NetworkManager[324]: <info> [1552061697.1539] manager: WiFi enabled by radio killswitch; disabled by state file
Mar 08 23:14:57 norns NetworkManager[324]: <info> [1552061697.1542] manager: WWAN enabled by radio killswitch; enabled by state file
Mar 08 23:14:57 norns NetworkManager[324]: <info> [1552061697.3180] manager: WiFi now disabled by radio killswitch
So we start with wifi being enabled by rfkill but NetworkManager then turns it off/soft blocks it because /var/lib/NetworkManager/NetworkManager.state
says WirelessEnabled=false
.
This desired state can be changed using for example nmcli radio wifi on
. This will unblock the rfkill soft block and set WirelessEnabled=true
in /var/lib/NetworkManager/NetworkManager.state
.
So I've been trying the wifi connection using the (current default) 8192cu
driver through a multitude of reboots and shutdowns and the wifi connection has been working 100% of the time for me.
The only thing I changed vs the stock 190303 image was setting WirelessEnabled=true
in /var/lib/NetworkManager/NetworkManager.state
whilst NetworkManager wasn't running (otherwise it will overwrite the state file again).
After that I rebooted and didn't touch NetworkManager anymore.
This action can probably be replaced by running nmcli radio wifi on
instead.
I'll try several cycles of disconnecting and connecting as well as disabling wifi next.
fantastic.
what would be the best way to update the image with nmcli radio wifi on
?
what would be the best way to update the image with
nmcli radio wifi on
?
Once I know for sure it's working 100% of the time I'll make a PR (if necessary) to make the required changes.
Just to be clear, the change being discussed here w.r.t keeping the radios enabled is unrelated to whether the Realtek nub works in HOTSPOT/AP mode correct?
Just to be clear, the change being discussed here w.r.t keeping the radios enabled is unrelated to whether the Realtek nub works in HOTSPOT/AP mode correct?
Correct, this is just about regulier client (station) wifi usage, not hotspot (AP).
So I can 100% reliably get a connection going by starting from the latest image (190314, although it also worked fine on the older ones) running nmcli radio wifi on
and then connecting via nmtui. Then on shutdown via
sleepwifi gets turned off and when I start again wifi is still off, I go to
system > wifi`, it's takes a bit of time and then wifi is up and running again.
This seems to cover at least the very basic use case of being able to connect to wifi, which on my device with my router (you never know with wifi stuff) works reliably from an os perspective.
I haven't tried this whole workflow via the screen/matron, will do that tomorrow.
Now, unfortunately I can also reliably break the wifi driver by simply running nmcli radio wifi off
, this results in the following stack trace
Mar 16 08:10:18 norns avahi-daemon[287]: Interface wlan0.IPv6 no longer relevant for mDNS.
Mar 16 08:10:18 norns kernel: ------------[ cut here ]------------
Mar 16 08:10:18 norns kernel: WARNING: CPU: 2 PID: 61 at net/wireless/sme.c:981 __cfg80211_disconnected+0x1bc/0x1d4 [cfg80211]
Mar 16 08:10:18 norns kernel: Modules linked in: 8192cu cfg80211 rfkill evdev rotary_encoder bcm2835_gpiomem uio_pdrv_genirq uio gpio_keys fixed ip_tables x_tables ipv6
Mar 16 08:10:18 norns kernel: CPU: 2 PID: 61 Comm: kworker/u8:1 Not tainted 4.14.52-13-g5b46619 #1
Mar 16 08:10:18 norns kernel: Hardware name: BCM2835
Mar 16 08:10:18 norns kernel: Workqueue: cfg80211 cfg80211_event_work [cfg80211]
Mar 16 08:10:18 norns kernel: [<8010ee7c>] (unwind_backtrace) from [<8010b6e4>] (show_stack+0x10/0x14)
Mar 16 08:10:18 norns kernel: [<8010b6e4>] (show_stack) from [<8066c298>] (dump_stack+0x88/0xa8)
Mar 16 08:10:18 norns kernel: [<8066c298>] (dump_stack) from [<80118dd4>] (__warn+0xe8/0x100)
Mar 16 08:10:18 norns kernel: [<80118dd4>] (__warn) from [<80118e9c>] (warn_slowpath_null+0x20/0x28)
Mar 16 08:10:18 norns kernel: [<80118e9c>] (warn_slowpath_null) from [<7f1b4560>] (__cfg80211_disconnected+0x1bc/0x1d4 [cfg80211])
Mar 16 08:10:18 norns kernel: [<7f1b4560>] (__cfg80211_disconnected [cfg80211]) from [<7f1922ac>] (cfg80211_process_wdev_events+0x134/0x13c [cfg80211])
Mar 16 08:10:18 norns kernel: [<7f1922ac>] (cfg80211_process_wdev_events [cfg80211]) from [<7f1922e0>] (cfg80211_process_rdev_events+0x2c/0x60 [cfg80211])
Mar 16 08:10:18 norns kernel: [<7f1922e0>] (cfg80211_process_rdev_events [cfg80211]) from [<7f18d164>] (cfg80211_event_work+0x14/0x1c [cfg80211])
Mar 16 08:10:18 norns kernel: [<7f18d164>] (cfg80211_event_work [cfg80211]) from [<8012f7e0>] (process_one_work+0x1d8/0x414)
Mar 16 08:10:18 norns kernel: [<8012f7e0>] (process_one_work) from [<8013052c>] (worker_thread+0x4c/0x598)
Mar 16 08:10:18 norns kernel: [<8013052c>] (worker_thread) from [<801352e4>] (kthread+0x14c/0x154)
Mar 16 08:10:18 norns kernel: [<801352e4>] (kthread) from [<80107bc8>] (ret_from_fork+0x14/0x2c)
Mar 16 08:10:18 norns kernel: ---[ end trace f30ffe7b67b6b1e0 ]---
@artfwo has tested the same thing and get the same behavior. After this the wifi nub/driver is no longer functional, no connection can be made. The main message related to this is from wpa_supplicant:
norns wpa_supplicant[376]: wlan0: Association request to the driver failed
I'm going to experiment with some combinations of actions to see if we can turn off wifi without issues and maybe try a trace of the wifi driver to see if I can figure out what's going wrong. At the moment I'm not sure we can solve this. If not we can either simply not have the option to turn the wifi off or we/I could try a different driver. Btw after a restart everything is back to working properly again as described above.
Next update: A simple nmcli connection down
is already a enough to get into a non-functional state. There's no kernel stack trace but the result is the same, no longer able to connect with the same message
wpa_supplicant[375]: wlan0: Association request to the driver failed
Initially I thought it was some incompatibility of the 8192cu
driver with NetworkManager but I can't even get it to work when using wpa_supplicant
directly. I.e.
# disable the NetworkManager service
# wpa_supplicant service
# configure `/etc/wpa_supplicant/wpa_supplicant.conf` with SSID and credentials
# Now connect using wpa_supplicant
sudo wpa_supplicant -B -i wlan0 -f /var/log/wpa_supplicant.log -c /etc/wpa_supplicant/wpa_supplicant.conf
# Get a lease
sudo dhclient -v wlan0
# Now we have a working connecting
# Kill wpa_supplicant
killall wpa_supplicant
# Try to connect again
sudo wpa_supplicant -B -i wlan0 -f /var/log/wpa_supplicant.log -c
# And we get in the logs
cat /var/log/wpa_supplicant.log
...
wlan0: Association request to the driver failed
...
I think I'm going to call this part of the investigation a day and conclude that the 8192cu
driver is a bunch of crap. We can keep using it but then we should remove the option to turn off the wifi from the menu.
I'll do a quick try of the other two drivers to see if they work better.
shame about the driver.
for our purposes (and general user experience) i'm concerned primarily with getting it turned on.
i agree that the OFF option can be removed.
reiterating here that if people don't want WIFI, remove the nub!
Quick update, I tried out the rtl8xxxu
driver (which needs CONFIG_RTL8XXXU_UNTESTED=y
enabled in the kernel config because it's untested).
Unfortunately it doesn't seem to work. The firmware is there and gets loaded but it then throws an error that it failed to write some block (coming from this write_error
function)
Mar 17 23:09:25 norns mtp-probe[231]: checking bus 1, device 3: "/sys/devices/platform/soc/3f980000.usb/usb1/1-1/1-1.2"
Mar 17 23:09:25 norns mtp-probe[231]: bus: 1, device: 3 was not an MTP device
Mar 17 23:09:25 norns systemd[1]: Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
Mar 17 23:09:25 norns kernel: usb 1-1.2: Vendor: Realtek
Mar 17 23:09:25 norns kernel: usb 1-1.2: Product: 802.11n WLAN Adapter
Mar 17 23:09:25 norns kernel: usb 1-1.2: rtl8192cu_parse_efuse: dumping efuse (0x80 bytes):
Mar 17 23:09:25 norns kernel: usb 1-1.2: 00: 29 81 00 74 cd 00 00 00
Mar 17 23:09:25 norns kernel: usb 1-1.2: 08: ff 00 da 0b 76 81 03 41
Mar 17 23:09:25 norns kernel: usb 1-1.2: 10: 32 00 85 62 9e ad 40 a5
Mar 17 23:09:25 norns kernel: usb 1-1.2: 18: ef 0e e7 7e 0a 03 52 65
Mar 17 23:09:25 norns kernel: usb 1-1.2: 20: 61 6c 74 65 6b 00 16 03
Mar 17 23:09:25 norns kernel: usb 1-1.2: 28: 38 30 32 2e 31 31 6e 20
Mar 17 23:09:25 norns kernel: usb 1-1.2: 30: 57 4c 41 4e 20 41 64 61
Mar 17 23:09:25 norns kernel: usb 1-1.2: 38: 70 74 65 72 00 00 00 00
Mar 17 23:09:25 norns kernel: usb 1-1.2: 40: 00 00 00 00 00 00 00 00
Mar 17 23:09:25 norns kernel: usb 1-1.2: 48: 00 00 00 00 00 00 00 00
Mar 17 23:09:25 norns kernel: usb 1-1.2: 50: 01 00 00 00 00 00 00 00
Mar 17 23:09:25 norns kernel: usb 1-1.2: 58: 06 00 22 22 22 00 00 00
Mar 17 23:09:25 norns kernel: usb 1-1.2: 60: 22 22 22 00 00 00 00 00
Mar 17 23:09:25 norns kernel: usb 1-1.2: 68: 00 00 00 00 00 00 00 00
Mar 17 23:09:25 norns kernel: usb 1-1.2: 70: 00 00 00 00 00 0a 00 00
Mar 17 23:09:25 norns kernel: usb 1-1.2: 78: 10 00 00 10 36 00 81 00
Mar 17 23:09:25 norns kernel: usb 1-1.2: RTL8188CU rev A (TSMC) 1T1R, TX queues 2, WiFi=1, BT=0, GPS=0, HI PA=0
Mar 17 23:09:25 norns kernel: usb 1-1.2: RTL8188CU MAC: 40:a5:ef:0e:e7:7e
Mar 17 23:09:25 norns kernel: usb 1-1.2: rtl8xxxu: Loading firmware rtlwifi/rtl8192cufw_TMSC.bin
Mar 17 23:09:25 norns kernel: usb 1-1.2: Firmware revision 80.0 (signature 0x88c1)
Mar 17 23:09:25 norns kernel: usb 1-1.2: rtl8xxxu_writeN: Failed to write block at addr: 1280 size: 0080
...
Mar 17 23:09:26 norns kernel: rtl8xxxu: probe of 1-1.2:1.0 failed with error -11
Mar 17 23:09:26 norns kernel: usbcore: registered new interface driver rtl8xxxu
After this whilst the driver is loaded no device is actually registered
I don't think we can expect any changes with a newer kernel either since the changes between the current 4.19 raspberry kernel and ours are pretty limited https://github.com/raspberrypi/linux/commits/d602de8e7e7fc25fb3a2112ce4285962f15aa549/drivers/net/wireless/realtek/rtl8xxxu, 4 commits that mainly fix typos.
So I guess the rtl8xxxu driver is out.
OK, some good news. The rtl8192cu
driver seems to work pretty well from the network management perspective. Just went through a cycle of enabling/disabling the wifi 10 times and it worked perfectly every time.
Also did a quick hotspot test and that also worked first time + I could easily switch back and forth between client/station and hotspot mode :)
we@norns:~$ nmcli con
NAME UUID TYPE DEVICE
<my SSID> 59d5b2f9-0c60-4340-aa4a-07ab24f55c41 802-11-wireless wlan0
Hotspot 71ad3595-2448-4cf3-8d55-a4263c28ab3e 802-11-wireless --
we@norns:~$ nmcli con down 59d5b2f9-0c60-4340-aa4a-07ab24f55c41
Connection '<my SSID>' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/16)
we@norns:~$ nmcli con up 71ad3595-2448-4cf3-8d55-a4263c28ab3e
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/17)
we@norns:~$ nmcli con
NAME UUID TYPE DEVICE
Hotspot 71ad3595-2448-4cf3-8d55-a4263c28ab3e 802-11-wireless wlan0
<my SSID> 59d5b2f9-0c60-4340-aa4a-07ab24f55c41 802-11-wireless --
[edit] I also went through a plug/unplug cycle back and forth through all USB ports whilst norns was connected and in all cases the connection recovered within seconds of plugging in the nub.
Now there are still these old reports of issues with this driver https://github.com/raspberrypi/linux/pull/1525#issuecomment-225876599, not sure if they are still relevant and also not 100% sure how to determine if they are.
Do you have a recipe/kernel build I can test w/ the rtl8192cu
driver? If so I can probably test it over the next day or so and confirm your findings.
Do you have a recipe/kernel build I can test w/ the
rtl8192cu
driver? If so I can probably test it over the next day or so and confirm you findings.
@ngwese If you want to try this: As long as you're on any of the 2.0 beta images/updates you should be good to test this, no need for a different kernel since the driver is already included in the current kernel. It's just blacklisted at the moment :)
So there's one change required, edit /etc/modprobe.d/blacklist-rtl8192cu.conf
and replace blacklist rtl8192cu
with blacklist 8192cu
and then shutdown and start norns again (also make sure the norns wifi nub is in a USB port of course :wink: )
It should now be using this different driver for the wifi nub, check that rtl8192cu
is loaded via lsmod
:
we@norns:~$ lsmod
Module Size Used by
arc4 16384 2
rtl8192cu 86016 0
rtl_usb 20480 1 rtl8192cu
rtl8192c_common 61440 1 rtl8192cu
rtlwifi 90112 3 rtl_usb,rtl8192c_common,rtl8192cu
mac80211 421888 3 rtl_usb,rtlwifi,rtl8192cu
cfg80211 274432 2 mac80211,rtlwifi
rfkill 24576 2 cfg80211
evdev 24576 4
rotary_encoder 16384 0
bcm2835_gpiomem 16384 0
gpio_keys 20480 0
uio_pdrv_genirq 16384 0
uio 16384 1 uio_pdrv_genirq
fixed 16384 0
ip_tables 24576 0
x_tables 28672 1 ip_tables
ipv6 401408 24
Now you should be able to connect either via system > wifi
or using nmtui
via SSH/serial console.
It could be you first not to bring up the radio in NetworkManager via nmcli radio wifi on
.
Initially I'm just looking for feedback on normal usage where norns connects to your accesspoint/router. If that works well I'd like to check hotspot as well.
Just ran a 4 hour test using the rtl8192cu
driver doing a simple ping (not the best test ever, but alas) and no packet loss at all :)
Quiet a bit of difference in the response times, but not sure if that's related to the wifi nub or wifi in general.
--- www.google.nl ping statistics ---
14634 packets transmitted, 14626 received, 0% packet loss, time 14653627ms
rtt min/avg/max/mdev = 9.015/17.005/643.364/21.181 ms
I think we can close this for now. If issues pop up with the new driver we can create a new issue and determine where to go from there.
This issue is about the problems with the included wifi nub, purely on an OS level. The integration with matron/the menu is out of scope. Same for hotspot related issues.
We've been having quiet a lot of reports from users and experiences ourselves with the wifi being unreliable. We should figure out why that is and try to find a solution to make it work reliably.
Symptoms seem a bit random, for example the network connection works fine (
ping google.com
no drops), reboot, works fine, reboot again, works fine, reboot another time, it doesn't work anymore.