Open leezu opened 1 year ago
Please post the full output of sudo lsusb -v.
This issue reproduces on a fresh boot with only the CF-953AX adapter connected:
[89918.453283] mt7921u 2-1:1.3: Message 00020003 (seq 1) timeout
[89918.453343] wlxe0e1a934a6a9: failed to remove key (0, 2c:71:ff:8e:bd:7f) from hardware (-110)
[89918.741304] mt7921u 2-1:1.3: timed out waiting for pending tx
[89918.758984] ------------[ cut here ]------------
[89918.759003] WARNING: CPU: 0 PID: 7829 at kernel/kthread.c:659 kthread_park+0xb4/0xd0
[89918.759037] Modules linked in: ctr aes_arm64 aes_generic ccm xt_MASQUERADE iptable_nat xt_mark nft_chain_nat ip6table_nat nf_nat tun mt7921u mt7921_common mt76_connac_lib mt76_usb btusb mt76 btrtl mac80211 btintel btbcm bluetooth ecdh_generic vc4 ecc libarc4 libaes snd_soc_hdmi_codec drm_display_helper brcmfmac cec drm_cma_helper brcmutil bcm2835_codec(C) rpivid_hevc(C) drm_kms_helper bcm2835_isp(C) v3d cfg80211 v4l2_mem2mem bcm2835_v4l2(C) snd_soc_core gpu_sched bcm2835_mmal_vchiq(C) drm_shmem_helper videobuf2_dma_contig videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev snd_bcm2835(C) snd_compress raspberrypi_hwmon snd_pcm_dmaengine rfkill snd_pcm vc_sm_cma(C) snd_timer mc snd syscopyarea sysfillrect sysimgblt uio_pdrv_genirq fb_sys_fops nvmem_rmem uio ip6t_REJECT nf_reject_ipv6 xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4 xt_comment xt_multiport nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat
[89918.759451] nf_tables nfnetlink drm fuse drm_panel_orientation_quirks backlight ip_tables x_tables ipv6
[89918.759502] CPU: 0 PID: 7829 Comm: kworker/u8:2 Tainted: G C 6.0.0-rc7-v8+ #4
[89918.759518] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[89918.759528] Workqueue: mt76 mt7921_mac_reset_work [mt7921_common]
[89918.759579] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[89918.759592] pc : kthread_park+0xb4/0xd0
[89918.759604] lr : mt76u_stop_tx+0x278/0x330 [mt76_usb]
[89918.759632] sp : ffffffc008f63c50
[89918.759639] x29: ffffffc008f63c50 x28: 0000000000000000 x27: ffffff8049818848
[89918.759662] x26: 0000000000000000 x25: ffffff8042e26c80 x24: ffffff8049812068
[89918.759684] x23: ffffff8049814820 x22: ffffff8049816020 x21: ffffff8049812048
[89918.759705] x20: ffffff8048bf5880 x19: ffffff8043cf5d00 x18: 0000000000000000
[89918.759725] x17: 0000000000000001 x16: ffffffd2e20b4e40 x15: 000efb46b4eb6234
[89918.759745] x14: 001163b91bdc1648 x13: 00000000000003dd x12: 00000000fa83b2da
[89918.759766] x11: 00000000000003dd x10: 0000000000001a90 x9 : ffffffd2d63fe8a8
[89918.759786] x8 : ffffff8040b8f7f0 x7 : 0000000000000001 x6 : ffffffd2e350b0c0
[89918.759806] x5 : ffffffd2e33a9000 x4 : ffffffd2e33a90b0 x3 : 0000000000002800
[89918.759825] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000004
[89918.759845] Call trace:
[89918.759851] kthread_park+0xb4/0xd0
[89918.759863] mt76u_stop_tx+0x278/0x330 [mt76_usb]
[89918.759887] mt7921u_mac_reset+0x88/0x2d8 [mt7921u]
[89918.759906] mt7921_mac_reset_work+0xac/0x1a0 [mt7921_common]
[89918.759938] process_one_work+0x1dc/0x450
[89918.759953] worker_thread+0x154/0x450
[89918.759965] kthread+0x104/0x110
[89918.759975] ret_from_fork+0x10/0x20
[89918.759990] ---[ end trace 0000000000000000 ]---
[89918.898590] mt7921u 2-1:1.3: HW/SW Version: 0x8a108a10, Build Time: 20220908210919a
[89918.910901] mt7921u 2-1:1.3: WM Firmware Version: ____010000, Build Time: 20220908211021
[89927.669357] mt7921u 2-1:1.3: Message 00020003 (seq 10) timeout
[89927.669421] wlxe0e1a934a6a9: failed to remove key (0, 8c:fd:f0:42:20:cf) from hardware (-110)
[89929.993368] mt7921u 2-1:1.3: timed out waiting for pending tx
[89930.133974] mt7921u 2-1:1.3: HW/SW Version: 0x8a108a10, Build Time: 20220908210919a
[89930.146118] mt7921u 2-1:1.3: WM Firmware Version: ____010000, Build Time: 20220908211021
[89935.349382] mt7921u 2-1:1.3: Message 00020001 (seq 12) timeout
A suggestion: maybe ping morrownr here on GitHub; he has extensive experience with some of the mt7921-based devices.
Your listing has bMaxBurst=0 for all of the endpoints in any of the configurations / alternate settings. So the burst OUT bug, and the corresponding fix, doesn't apply.
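For anyone who wants to run the same check on their own adapter, here is a minimal sketch that pulls every bMaxBurst value out of saved `lsusb -v` output. The function names and the SAMPLE excerpt are illustrative assumptions, not taken from the listing posted in this thread.

```python
import re

# Illustrative excerpt only (not the reporter's actual listing): two
# SuperSpeed endpoint companion descriptors as printed by `lsusb -v`.
SAMPLE = """\
      Endpoint Descriptor:
        bEndpointAddress     0x84  EP 4 IN
        bMaxBurst               0
      Endpoint Descriptor:
        bEndpointAddress     0x08  EP 8 OUT
        bMaxBurst               0
"""

def max_burst_values(lsusb_text):
    """Return every bMaxBurst value found in `lsusb -v` output, in order."""
    matches = re.findall(r"^\s*bMaxBurst\s+(\d+)", lsusb_text, flags=re.MULTILINE)
    return [int(value) for value in matches]

def burst_enabled(lsusb_text):
    """True if any endpoint advertises a non-zero bMaxBurst."""
    return any(value > 0 for value in max_burst_values(lsusb_text))
```

To use it, save the output with `sudo lsusb -v > lsusb.txt` and pass the file contents to `burst_enabled()`. A result of False matches the situation described above, where no endpoint uses bursts.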
If the crash happens only under heavy load, then it's possible the wifi adapter momentarily exceeds the 1.2A downstream port current limit. Does it still happen if plugged into a self-powered USB3.0 hub?
I have the same crash even when the adapter is plugged into a self-powered USB3.0 hub, on two different Pi 4s running the latest 6.1.21-v8+ kernel and 64-bit Raspbian. It happens whether the adapter is plugged into a USB3 or a USB2 port, and it doesn't require the adapter to be in AP mode.
Same thing happens on an Orange Pi 5 Plus with OpenWrt running in a Proxmox VM:
[ 332.940838] mt7921u 4-1:1.0: Message 00020003 (seq 7) timeout
[ 332.943895] phy0-ap0: failed to set key (1, ff:ff:ff:ff:ff:ff) to hardware (-110)
[ 335.980509] mt7921u 4-1:1.0: Message 00020003 (seq 8) timeout
[ 335.991275] phy0-ap0: failed to set key (4, ff:ff:ff:ff:ff:ff) to hardware (-110)
[ 339.235851] mt7921u 4-1:1.0: vendor request req:63 off:d02c failed:-110
[ 342.470445] mt7921u 4-1:1.0: vendor request req:63 off:d054 failed:-110
[ 345.672157] mt7921u 4-1:1.0: vendor request req:63 off:d058 failed:-110
[ 348.890817] mt7921u 4-1:1.0: vendor request req:63 off:53b8 failed:-110
[ 352.121498] mt7921u 4-1:1.0: vendor request req:63 off:53c4 failed:-110
[ 355.362003] mt7921u 4-1:1.0: vendor request req:66 off:53c4 failed:-110
[ 358.610738] mt7921u 4-1:1.0: vendor request req:63 off:d698 failed:-110
[ 361.825873] mt7921u 4-1:1.0: vendor request req:63 off:d520 failed:-110
[ 365.031865] mt7921u 4-1:1.0: vendor request req:63 off:d518 failed:-110
[ 368.274826] mt7921u 4-1:1.0: vendor request req:63 off:d688 failed:-110
[ 371.482260] mt7921u 4-1:1.0: vendor request req:63 off:d690 failed:-110
[ 374.698523] mt7921u 4-1:1.0: vendor request req:63 off:d558 failed:-110
[ 377.916391] mt7921u 4-1:1.0: vendor request req:63 off:d564 failed:-110
[ 381.142789] mt7921u 4-1:1.0: vendor request req:63 off:d568 failed:-110
[ 384.365935] mt7921u 4-1:1.0: vendor request req:63 off:d7a8 failed:-110
[ 387.582269] mt7921u 4-1:1.0: vendor request req:63 off:a150 failed:-110
[ 390.820761] mt7921u 4-1:1.0: vendor request req:63 off:a158 failed:-110
[ 394.036455] mt7921u 4-1:1.0: vendor request req:63 off:d780 failed:-110
[ 397.265631] mt7921u 4-1:1.0: vendor request req:63 off:d770 failed:-110
[ 400.508192] mt7921u 4-1:1.0: vendor request req:63 off:d774 failed:-110
[ 403.721153] mt7921u 4-1:1.0: vendor request req:63 off:d55c failed:-110
[ 406.961058] mt7921u 4-1:1.0: vendor request req:63 off:10e0 failed:-110
[ 410.185741] mt7921u 4-1:1.0: vendor request req:63 off:10e4 failed:-110
[ 413.441741] mt7921u 4-1:1.0: vendor request req:63 off:10e8 failed:-110
[ 416.689423] mt7921u 4-1:1.0: vendor request req:63 off:10ec failed:-110
[ 419.912375] mt7921u 4-1:1.0: vendor request req:63 off:10f0 failed:-110
[ 423.172345] mt7921u 4-1:1.0: vendor request req:63 off:10f4 failed:-110
[ 426.412783] mt7921u 4-1:1.0: vendor request req:63 off:10f8 failed:-110
[ 429.631078] mt7921u 4-1:1.0: vendor request req:63 off:10fc failed:-110
[ 432.862377] mt7921u 4-1:1.0: vendor request req:63 off:d7dc failed:-110
[ 436.082556] mt7921u 4-1:1.0: vendor request req:63 off:d7ec failed:-110
[ 439.292609] mt7921u 4-1:1.0: vendor request req:63 off:d7e0 failed:-110
[ 442.543617] mt7921u 4-1:1.0: vendor request req:63 off:d7f0 failed:-110
[ 445.773236] mt7921u 4-1:1.0: vendor request req:63 off:d7e4 failed:-110
[ 448.990723] mt7921u 4-1:1.0: vendor request req:63 off:d7f4 failed:-110
[ 452.214152] mt7921u 4-1:1.0: vendor request req:63 off:d7e8 failed:-110
[ 455.453004] mt7921u 4-1:1.0: vendor request req:63 off:d7f8 failed:-110
[ 458.531282] mt7921u 4-1:1.0: Message 00020002 (seq 9) timeout
[ 461.571373] mt7921u 4-1:1.0: Message 00020002 (seq 10) timeout
[ 461.784163] mt7921u 4-1:1.0: HW/SW Version: 0x8a108a10, Build Time: 20230331110902a
[ 461.784163]
[ 461.811386] mt7921u 4-1:1.0: WM Firmware Version: ____010000, Build Time: 20230331110939
[ 464.565793] IPv6: ADDRCONF(NETDEV_CHANGE): phy0-ap0: link becomes ready
[ 464.569712] br-lan: port 2(phy0-ap0) entered blocking state
[ 464.576405] br-lan: port 2(phy0-ap0) entered forwarding state
root@OpenWrt:/# uname -a
Linux OpenWrt 6.1.35 #0 SMP Fri Jun 23 21:07:17 2023 aarch64 GNU/Linux
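The -110 in these logs is -ETIMEDOUT on Linux, so every failure shown is a timeout of some kind. To compare captures between setups, a rough sketch like the following can tally the three failure patterns seen in this thread. The category names and the SAMPLE_LOG excerpt are my own choices for illustration; the regular expressions are written against the log lines quoted here and may need adjusting for other kernel versions.

```python
import re
from collections import Counter

# Patterns modelled on the dmesg lines quoted in this thread.
# The category names (dictionary keys) are hypothetical labels.
PATTERNS = {
    "mcu_message_timeout": re.compile(r"Message [0-9a-f]{8} \(seq \d+\) timeout"),
    "vendor_request_timeout": re.compile(r"vendor request req:\w+ off:\w+ failed:-110"),
    "key_op_timeout": re.compile(r"failed to (?:set|remove) key \([^)]*\) (?:to|from) hardware \(-110\)"),
}

def count_timeouts(log_text):
    """Count occurrences of each failure pattern in a dmesg capture."""
    counts = Counter()
    for name, pattern in PATTERNS.items():
        counts[name] = len(pattern.findall(log_text))
    return counts

# Three lines copied from the log above, one per category.
SAMPLE_LOG = (
    "[ 332.940838] mt7921u 4-1:1.0: Message 00020003 (seq 7) timeout\n"
    "[ 332.943895] phy0-ap0: failed to set key (1, ff:ff:ff:ff:ff:ff) to hardware (-110)\n"
    "[ 339.235851] mt7921u 4-1:1.0: vendor request req:63 off:d02c failed:-110\n"
)
```

Because the patterns use `findall` on the whole text, the function also works on captures where line breaks were lost in copy and paste.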
Describe the bug
With USB WiFi devices, including the CF-953AX (mt7921au), random hangs can occur when used with the RPi. The hangs occur under heavy load, for example when operating the device as an access point and running a speedtest while downloading many small files in parallel on a client [1]. This suggests further unmitigated firmware bugs in the RPi USB host controller (VL805). Disabling USB scatter-gather via
/sys/module/mt76_usb/parameters/disable_usb_sg
appears to reduce the frequency of the issue, but the issue can still be reproduced with scatter-gather disabled. The issue has also been discussed at https://github.com/openwrt/mt76/issues/405 (for mt7612u devices) and https://github.com/morrownr/USB-WiFi/issues/107#issuecomment-1260976813 (for mt7921au devices).
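When attaching logs, it helps to record which scatter-gather setting was active. Below is a small sketch that reads the parameter file named above. It assumes the file contains Y or N, as the kernel prints for boolean module parameters; the function name and the None fallback are my own choices.

```python
from pathlib import Path

# Path taken from the issue text above.
PARAM = Path("/sys/module/mt76_usb/parameters/disable_usb_sg")

def usb_sg_disabled(param_path=PARAM):
    """Return True if scatter-gather is disabled, False if it is enabled,
    or None if the parameter cannot be read (e.g. mt76_usb not loaded)."""
    try:
        value = Path(param_path).read_text().strip()
    except OSError:
        return None
    if value in ("Y", "1"):
        return True
    if value in ("N", "0"):
        return False
    return None  # unexpected content; do not guess
```

Calling `usb_sg_disabled()` with no arguments reads the live setting; passing a different path is mainly useful for testing.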
Steps to reproduce the behaviour
ping -i 0.2 google.com
on the client.
(sudo systemctl restart hostapd@wlxe0e1a934a6a9)
Optional: Disable USB scatter-gather for the MediaTek USB adapters. (The error can still be reproduced, but appears to occur less frequently.)
Device(s)
Raspberry Pi 4 Mod. B
System
Raspberry Pi reference 2021-05-07 Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, dcfd74d7d1fa293065ac6d565711e9ff891fe2b8, stage2
Firmware: Aug 26 2022 14:03:16 Copyright (c) 2012 Broadcom version 102f1e848393c2112206fadffaaf86db04e98326 (clean) (release) (start)
Kernel: Linux raspberrypi 6.0.0-rc7-v8+ #4 SMP PREEMPT Wed Sep 28 02:12:59 UTC 2022 aarch64 GNU/Linux built from rpi-6.0.y https://github.com/raspberrypi/linux/commit/3fb5ca89c844eb9fda924a5e76ffab9e8c068675 which includes the https://github.com/raspberrypi/linux/pull/5173 USB fixes. (This issue also happened with earlier kernels that did not include the #5173 fixes)
Logs
Additional context
At https://github.com/raspberrypi/linux/pull/5173#discussion_r974429018 @dobo90 suggested that the issue may be addressed by https://github.com/raspberrypi/linux/commit/3157603c3925a30c6b26dc8f7a1f2c23f7b7bb55. That commit forces max_burst = 0 if a USB_CLASS_MASS_STORAGE device is present on a hub (which should include the root_hub, and thus all USB devices plugged into the RPi4?). I'm able to reproduce firmware crashes on the CF-953AX on 6.0.0-rc7-v8+, which includes the current fix, both when a USB_CLASS_MASS_STORAGE device is present during boot and remains present, and when no USB_CLASS_MASS_STORAGE device is present during boot or at any time.
Boot without a USB_CLASS_MASS_STORAGE device (in which case the "fix" is disabled):
and with a USB_CLASS_MASS_STORAGE device present (in which case the "fix" is enabled):
Those are both with /sys/module/mt76_usb/parameters/disable_usb_sg=N.
[1]: Access Point Configuration