aircrack-ng / rtl8812au

RTL8812AU/21AU and RTL8814AU driver with monitor mode and frame injection
GNU General Public License v2.0

Kernel module freeze & crash #875

Open quantatic opened 3 years ago

quantatic commented 3 years ago

I'm installing this driver via https://aur.archlinux.org/packages/rtl88xxau-aircrack-dkms-git, which appears to be installing the v5.6.4.2 branch, commit b8167e66b4ac046b3b76c2c40008d84528e91594.

When changing certain settings on the network interface (for instance, sudo airmon-ng start wlp5s0f3u3 reproduces the problem consistently), I get a variety of kernel errors in dmesg, many of which result in a full kernel lock-up (system calls and any other kernel-related API calls hang permanently and uninterruptibly). I have tried two different Wi-Fi USB devices (both with the same chipset), and both consistently produce the same error.

This issue is consistently reproducible on Arch Linux 5.12.14, and consistently results in a full kernel lock-up that requires a hard reboot to resolve. I've also tested using the LTS kernel: Linux archlinux 5.10.47-1-lts #1 SMP Wed, 30 Jun 2021 13:52:19 +0000 x86_64 GNU/Linux. There I also get a plethora of kernel errors, though these do not seem to be fatal (at least not immediately), unlike the errors on the 5.12.14 kernel. I've attached one such error below.

This seems like it may be related to #498, though one comment in that thread claims that that issue was fixed.

Any advice on places to start looking into the issue would be much appreciated. I'm happy to help track this issue down and resolve it, though I can't say I have too much kernel driver development experience :smiley:

[ 1077.525944] ------------[ cut here ]------------
[ 1077.525948] kernel BUG at mm/slub.c:305!
[ 1077.525957] invalid opcode: 0000 [#3] SMP NOPTI
[ 1077.525959] CPU: 6 PID: 4436 Comm: iw Tainted: P      D    OE     5.10.47-1-lts #1
[ 1077.525960] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Phantom Gaming 4, BIOS P3.60 12/01/2020
[ 1077.525965] RIP: 0010:__slab_free+0x213/0x430
[ 1077.525967] Code: 44 24 20 e8 0f fc ff ff 44 8b 44 24 20 85 c0 0f 85 3f fe ff ff eb b2 41 f7 46 08 00 0d 21 00 0f 85 1f ff ff ff e9 11 ff ff ff <0f> 0b 80 4c 24 5b 80 45 31 c9 e9 7e fe ff ff f3 90 49 8b 04 24 a8
[ 1077.525968] RSP: 0018:ffffb899c562f8d0 EFLAGS: 00010246
[ 1077.525970] RAX: ffff8c004ad75b00 RBX: 000000008020001a RCX: ffff8c004ad75a00
[ 1077.525971] RDX: ffff8c004ad75a00 RSI: ffffdc88c42b5d00 RDI: ffff8c0040043600
[ 1077.525971] RBP: ffffb899c562f980 R08: 0000000000000001 R09: ffffffffaca1f588
[ 1077.525972] R10: 0000000000000274 R11: 0000000000000272 R12: ffffdc88c42b5d00
[ 1077.525973] R13: ffff8c004ad75a00 R14: ffff8c0040043600 R15: ffff8c004ad75a00
[ 1077.525974] FS:  00007f4526cb6b80(0000) GS:ffff8c0f2eb80000(0000) knlGS:0000000000000000
[ 1077.525975] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1077.525975] CR2: 000056159e259cb0 CR3: 000000012da5a000 CR4: 0000000000750ee0
[ 1077.525976] PKRU: 55555554
[ 1077.525977] Call Trace:
[ 1077.525983]  ? schedule+0x46/0xb0
[ 1077.525985]  ? pcpu_free_area+0x21c/0x370
[ 1077.525987]  kfree+0x438/0x480
[ 1077.525991]  ? free_netdev+0x18/0x140
[ 1077.525992]  free_netdev+0x18/0x140
[ 1077.525994]  netdev_run_todo+0x2ee/0x330
[ 1077.526016]  ? rtw_set_rtnl_lock_holder+0xd/0x30 [88XXau]
[ 1077.526029]  nl80211_post_doit+0x62/0x70 [cfg80211]
[ 1077.526033]  genl_family_rcv_msg_doit+0x106/0x150
[ 1077.526035]  genl_rcv_msg+0xdc/0x1e0
[ 1077.526045]  ? nl80211_parse_wowlan_tcp+0x460/0x460 [cfg80211]
[ 1077.526047]  ? genl_get_cmd+0xd0/0xd0
[ 1077.526048]  netlink_rcv_skb+0x50/0xf0
[ 1077.526050]  genl_rcv+0x24/0x40
[ 1077.526051]  netlink_unicast+0x201/0x2d0
[ 1077.526052]  netlink_sendmsg+0x23a/0x470
[ 1077.526054]  ? _copy_from_user+0x3c/0x80
[ 1077.526056]  sock_sendmsg+0x5e/0x60
[ 1077.526058]  ____sys_sendmsg+0x22c/0x270
[ 1077.526059]  ? import_iovec+0x17/0x20
[ 1077.526060]  ? sendmsg_copy_msghdr+0x79/0xa0
[ 1077.526063]  ? mntput_no_expire+0x47/0x270
[ 1077.526064]  ___sys_sendmsg+0x81/0xc0
[ 1077.526066]  ? __mod_memcg_lruvec_state+0x21/0xe0
[ 1077.526067]  ? kmem_cache_free+0x274/0x410
[ 1077.526069]  ? __sk_destruct+0x148/0x200
[ 1077.526070]  ? __mod_memcg_lruvec_state+0x21/0xe0
[ 1077.526071]  ? kmem_cache_free+0x274/0x410
[ 1077.526073]  ? __dentry_kill+0x138/0x180
[ 1077.526075]  __sys_sendmsg+0x59/0xa0
[ 1077.526077]  do_syscall_64+0x33/0x40
[ 1077.526079]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1077.526081] RIP: 0033:0x7f4526dd8cc7
[ 1077.526082] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[ 1077.526083] RSP: 002b:00007ffe618f1d38 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 1077.526085] RAX: ffffffffffffffda RBX: 000055fc9db2b390 RCX: 00007f4526dd8cc7
[ 1077.526085] RDX: 0000000000000000 RSI: 00007ffe618f1d70 RDI: 0000000000000003
[ 1077.526086] RBP: 000055fc9db308c0 R08: 000055fc9db2b2a0 R09: 0000000000000000
[ 1077.526087] R10: 000055fc9dae0f80 R11: 0000000000000246 R12: 000055fc9db30780
[ 1077.526087] R13: 00007ffe618f1d70 R14: 000055fc9db307d0 R15: 000055fc9db308c0
[ 1077.526089] Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter bridge stp llc overlay nct6775 hwmon_vid nls_iso8859_1 nvidia_drm(POE) vfat nvidia_modeset(POE) 88XXau(OE) fat uvcvideo videobuf2_vmalloc snd_usb_audio videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usbmidi_lib videodev snd_rawmidi uas snd_seq_device cfg80211 nvidia(POE) mousedev usb_storage mc joydev rfkill igb i2c_algo_bit dca ucsi_ccg typec_ucsi typec wmi_bmof snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation soundwire_cadence snd_hda_codec snd_hda_core edac_mce_amd snd_hwdep kvm_amd soundwire_bus ccp rng_core snd_soc_core kvm snd_compress ac97_bus snd_pcm_dmaengine snd_pcm usbhid drm_kms_helper snd_timer snd irqbypass cec crct10dif_pclmul soundcore crc32_pclmul ghash_clmulni_intel
[ 1077.526125]  syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops sp5100_tco i2c_nvidia_gpu crypto_simd i2c_piix4 cryptd k10temp glue_helper rapl wmi pcspkr mac_hid pinctrl_amd acpi_cpufreq drm fuse agpgart bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 crc32c_intel xhci_pci xhci_pci_renesas
[ 1077.526141] ---[ end trace 04aa50e480130b87 ]---
[ 1077.526143] RIP: 0010:__slab_free+0x213/0x430
[ 1077.526144] Code: 44 24 20 e8 0f fc ff ff 44 8b 44 24 20 85 c0 0f 85 3f fe ff ff eb b2 41 f7 46 08 00 0d 21 00 0f 85 1f ff ff ff e9 11 ff ff ff <0f> 0b 80 4c 24 5b 80 45 31 c9 e9 7e fe ff ff f3 90 49 8b 04 24 a8
[ 1077.526145] RSP: 0018:ffffb899c515f8d0 EFLAGS: 00010246
[ 1077.526145] RAX: ffff8c004e3b4500 RBX: 0000000080200016 RCX: ffff8c004e3b4400
[ 1077.526146] RDX: ffff8c004e3b4400 RSI: ffffdc88c438ed00 RDI: ffff8c0040043600
[ 1077.526147] RBP: ffffb899c515f980 R08: 0000000000000001 R09: ffffffffaca1f588
[ 1077.526148] R10: 0000000000000131 R11: 0000000000000130 R12: ffffdc88c438ed00
[ 1077.526148] R13: ffff8c004e3b4400 R14: ffff8c0040043600 R15: ffff8c004e3b4400
[ 1077.526149] FS:  00007f4526cb6b80(0000) GS:ffff8c0f2eb80000(0000) knlGS:0000000000000000
[ 1077.526150] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1077.526151] CR2: 000056159e259cb0 CR3: 000000012da5a000 CR4: 0000000000750ee0
[ 1077.526152] PKRU: 55555554

quantatic commented 3 years ago

Here's the relevant dmesg output on 5.12.14, produced with sudo airmon-ng start wlp5s0f3u3.

[  245.349032] INFO: task iw:2671 blocked for more than 122 seconds.
[  245.349034]       Tainted: P           OE     5.12.14-arch1-1 #1
[  245.349034] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  245.349035] task:iw              state:D stack:    0 pid: 2671 ppid:  2670 flags:0x00000000
[  245.349037] Call Trace:
[  245.349038]  __schedule+0x2ff/0x8b0
[  245.349040]  schedule+0x5b/0xc0
[  245.349042]  schedule_preempt_disabled+0x11/0x20
[  245.349044]  __mutex_lock.constprop.0+0x2f8/0x4e0
[  245.349048]  cfg80211_netdev_notifier_call+0x104/0x4f0 [cfg80211]
[  245.349069]  raw_notifier_call_chain+0x44/0x60
[  245.349072]  register_netdevice+0x4ee/0x5f0
[  245.349075]  cfg80211_rtw_set_default_mgmt_key+0x1f92/0x3f60 [88XXau]
[  245.349107]  nl80211_new_interface+0x1b5/0x4b0 [cfg80211]
[  245.349129]  genl_family_rcv_msg_doit+0xfd/0x160
[  245.349132]  genl_rcv_msg+0xeb/0x1e0
[  245.349134]  ? nl80211_get_interface+0x90/0x90 [cfg80211]
[  245.349151]  ? genl_get_cmd+0xd0/0xd0
[  245.349153]  netlink_rcv_skb+0x5b/0x100
[  245.349155]  genl_rcv+0x24/0x40
[  245.349157]  netlink_unicast+0x23e/0x350
[  245.349159]  netlink_sendmsg+0x23a/0x470
[  245.349162]  sock_sendmsg+0x5e/0x60
[  245.349164]  ____sys_sendmsg+0x258/0x2a0
[  245.349166]  ___sys_sendmsg+0xa3/0xf0
[  245.349169]  __sys_sendmsg+0x81/0xd0
[  245.349171]  do_syscall_64+0x33/0x40
[  245.349173]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  245.349175] RIP: 0033:0x7f215203fcc7
[  245.349176] RSP: 002b:00007fff4b050988 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  245.349177] RAX: ffffffffffffffda RBX: 0000560d86945390 RCX: 00007f215203fcc7
[  245.349178] RDX: 0000000000000000 RSI: 00007fff4b0509c0 RDI: 0000000000000003
[  245.349178] RBP: 0000560d8694a8c0 R08: 0000560d869452a0 R09: 00007fff4b050a2c
[  245.349179] R10: 00007fff4b050ca8 R11: 0000000000000246 R12: 0000560d8694a780
[  245.349180] R13: 00007fff4b0509c0 R14: 0000560d8694a7d0 R15: 0000560d8694a8c0
[  245.349182] INFO: task systemd-udevd:2672 blocked for more than 122 seconds.
[  245.349184]       Tainted: P           OE     5.12.14-arch1-1 #1
[  245.349185] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  245.349185] task:systemd-udevd   state:D stack:    0 pid: 2672 ppid:   507 flags:0x00004220
[  245.349187] Call Trace:
[  245.349188]  __schedule+0x2ff/0x8b0
[  245.349191]  schedule+0x5b/0xc0
[  245.349193]  schedule_preempt_disabled+0x11/0x20
[  245.349195]  __mutex_lock.constprop.0+0x2f8/0x4e0
[  245.349196]  ? netdev_name_node_lookup_rcu+0x67/0x80
[  245.349198]  dev_ioctl+0x182/0x4f0
[  245.349201]  sock_do_ioctl+0xee/0x190
[  245.349204]  sock_ioctl+0x278/0x360
[  245.349207]  __x64_sys_ioctl+0x82/0xb0
[  245.349210]  do_syscall_64+0x33/0x40
[  245.349212]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  245.349213] RIP: 0033:0x7f7a5224259b
[  245.349214] RSP: 002b:00007ffc0ebbf4b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  245.349215] RAX: ffffffffffffffda RBX: 0000562e978e8698 RCX: 00007f7a5224259b
[  245.349216] RDX: 00007ffc0ebbf4c0 RSI: 0000000000008946 RDI: 0000000000000006
[  245.349217] RBP: 00007ffc0ebbf668 R08: 0000000000000010 R09: 0000000000000001
[  245.349218] R10: 00007f7a519168e8 R11: 0000000000000246 R12: 0000562e978f5c80
[  245.349218] R13: 00007ffc0ebbf4c0 R14: 00007ffc0ebbf750 R15: 0000562e97ad9a90

quantatic commented 3 years ago

I did a bit of digging, and it seems this issue may be related to the changes made in https://github.com/torvalds/linux/commit/2fe8ef106238b274c505c480ecf00d8765abf0d8.

noizy-sthlm commented 3 years ago

I have the same problem when using v5.6.4.2_35491.20191025 on Ubuntu 20.04.3 LTS. If anyone has a solution, please comment.

ptpt52 commented 1 year ago

same issue.

ptpt52 commented 1 year ago

same issue

ntzb commented 5 months ago

With the clue from @quantatic, I found that a similar issue (a hang) in a different Realtek driver, rtw8852cu, can be solved by replacing register_netdevice with cfg80211_register_netdevice and unregister_netdevice with cfg80211_unregister_netdevice: https://github.com/ntzb/rtw8852cu/commit/891e3db8f525a6fb2d65e3ad928fd4a046e8d40a. Maybe it is relevant in this case as well.
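
To illustrate why that swap might matter: in the 5.12.14 trace above, iw is blocked in __mutex_lock inside cfg80211_netdev_notifier_call, which was reached from register_netdevice while handling nl80211_new_interface. One plausible reading (an assumption, not something confirmed in this thread) is that the nl80211 handler already holds the mutex that the notifier then tries to take again. The userspace sketch below only models that locking pattern with an error-checking pthread mutex; it is not kernel code, every model_ name is hypothetical, and the "registered" flag is a simplification of how the cfg80211 helper is assumed to tell the notifier to skip its own locking.

```c
#define _XOPEN_SOURCE 700   /* expose PTHREAD_MUTEX_ERRORCHECK on strict compilers */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins; none of these are kernel types or functions. */
struct model_wdev { int registered; };
static pthread_mutex_t model_wiphy_mtx;
static int model_ready;

/* Error-checking mutex: relocking by the owner returns EDEADLK,
 * where a kernel mutex would simply block forever. */
static void model_init(void)
{
    pthread_mutexattr_t attr;
    int rc;

    if (model_ready)
        return;
    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&model_wiphy_mtx, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
    (void)rc;
    model_ready = 1;
}

/* Models the netdev notifier: for a device nobody has claimed yet,
 * it takes the wiphy mutex itself before registering the wdev. */
static int model_notifier(struct model_wdev *w)
{
    if (!w->registered) {
        int rc = pthread_mutex_lock(&model_wiphy_mtx);
        if (rc != 0)
            return rc;      /* EDEADLK: the caller already owns the lock */
        w->registered = 1;
        pthread_mutex_unlock(&model_wiphy_mtx);
    }
    return 0;
}

/* Models plain register_netdevice(): always runs the notifier. */
static int model_register_netdevice(struct model_wdev *w)
{
    return model_notifier(w);
}

/* Models the cfg80211-aware helper: claims the wdev first, so the
 * notifier does not try to take the mutex a second time. */
static int model_cfg80211_register_netdevice(struct model_wdev *w)
{
    w->registered = 1;
    return model_register_netdevice(w);
}

/* Models an nl80211 handler that runs with the wiphy mutex held. */
static int model_new_interface(int use_cfg80211_helper)
{
    struct model_wdev w = { 0 };
    int rc;

    model_init();
    rc = pthread_mutex_lock(&model_wiphy_mtx);
    if (rc != 0)
        return rc;
    rc = use_cfg80211_helper ? model_cfg80211_register_netdevice(&w)
                             : model_register_netdevice(&w);
    pthread_mutex_unlock(&model_wiphy_mtx);
    return rc;
}
```

In this model, model_new_interface(0) returns EDEADLK (the point where the real kernel would hang), while model_new_interface(1) returns 0. On older glibc versions the file may need to be built with -pthread. Whether the same substitution is sufficient for the 88XXau driver would still need to be checked against its source.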