Closed Sak1s closed 1 year ago
Have also seen the same kernel OOPS as below:
[ 76.278345] Unable to handle kernel paging request at virtual address ffff80001136700c
[ 76.278352] Mem abort info:
[ 76.278354] ESR = 0x96000021
[ 76.278358] EC = 0x25: DABT (current EL), IL = 32 bits
[ 76.278363] SET = 0, FnV = 0
[ 76.278366] EA = 0, S1PTW = 0
[ 76.278369] Data abort info:
[ 76.278371] ISV = 0, ISS = 0x00000021
[ 76.278373] CM = 0, WnR = 0
[ 76.278377] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000008f63000
[ 76.278383] [ffff80001136700c] pgd=000000000efff003, p4d=000000000efff003, pud=000000000effe003, pmd=00000000016fb003, pte=0068000002b76703
[ 76.278405] Internal error: Oops: 96000021 [#1] PREEMPT_RT SMP
[ 76.278412] Modules linked in: ipv6 usb_f_uac2 u_audio libcomposite 8723ds(O) hci_uart btrtl xhci_plat_hcd btbcm meson_gxbb_wdt watchdog xhci_pci xhci_hcd meson_rng rng_core
[ 76.278464] CPU: 2 PID: 25 Comm: ksoftirqd/2 Tainted: G O 5.10.17-rt32-yocto-preempt-rt-rode #1
[ 76.278479] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--)
[ 76.278486] pc : rt_spin_lock+0x1c/0x60
[ 76.278511] lr : update_sta_info+0x80/0x144 [8723ds]
[ 76.278868] sp : ffff8000110a39a0
[ 76.278871] x29: ffff8000110a39a0 x28: ffff800010df9000
[ 76.278878] x27: 000000000007a120 x26: 00007d0000006d60
[ 76.278885] x25: ffff800011366ff4 x24: ffff8000113253b8
[ 76.278892] x23: ffff8000112270e8 x22: ffff00000191ec34
[ 76.278899] x21: ffff800011227030 x20: ffff800011227000
[ 76.278905] x19: ffff800011366ff4 x18: ffff800010e13a48
[ 76.278912] x17: 0000000000000000 x16: 0000000000000000
[ 76.278918] x15: ffff0000053f2f40 x14: ffffffffffffffff
[ 76.278925] x13: fffffffffffcfd07 x12: ffff0000053f2ac0
[ 76.278932] x11: 0000000000000000 x10: ffff800010e31370
[ 76.278939] x9 : 00000000fffffffe x8 : 0000000000000000
[ 76.278945] x7 : 0000000000000000 x6 : ffff800011367df4
[ 76.278952] x5 : ffff80001136700c x4 : 0000000000000500
[ 76.278958] x3 : ffff7ffffe150000 x2 : ffff0000053f2ac0
[ 76.278965] x1 : 0000000000000000 x0 : ffff800011366ff4
[ 76.278974] Call trace:
[ 76.278978] rt_spin_lock+0x1c/0x60
[ 76.278993] update_sta_info+0x80/0x144 [8723ds]
[ 76.279149] rtw_joinbss_event_prehandle+0x39c/0x618 [8723ds]
[ 76.279303] report_join_res+0xb8/0x114 [8723ds]
[ 76.279456] OnAssocRsp+0x268/0x288 [8723ds]
[ 76.279608] _mgt_dispatcher+0x80/0xd0 [8723ds]
[ 76.279759] mgt_dispatcher+0x1a8/0x25c [8723ds]
[ 76.279912] validate_recv_mgnt_frame+0x7c/0x154 [8723ds]
[ 76.280063] validate_recv_frame+0x24c/0x3b0 [8723ds]
[ 76.280216] recv_func_prehandle+0x44/0x8c [8723ds]
[ 76.280369] recv_func+0x34/0x16c [8723ds]
[ 76.280522] rtw_recv_entry+0x20/0x5c [8723ds]
[ 76.280674] rtl8723ds_recv_tasklet+0x2d0/0x374 [8723ds]
[ 76.280826] tasklet_action_common.isra.0+0x140/0x150
[ 76.280844] tasklet_action+0x28/0x38
[ 76.280850] _stext+0x108/0x200
[ 76.280857] run_ksoftirqd+0x50/0xb8
[ 76.280863] smpboot_thread_fn+0x2bc/0x2f8
[ 76.280873] kthread+0x178/0x1a0
[ 76.280883] ret_from_fork+0x10/0x30
[ 76.280898] Code: 910003fd d2800001 d5384102 f98000b1 (c85ffca3)
[ 76.280913] ---[ end trace 0000000000000002 ]---
Note that I do not have this device, and I only provide the Realtek driver. I will, however, try to help.
Does the error also happen with the regular, non-RT kernel?
I have a bit of a problem with your traceback. It shows the error in rt_spin_lock called from update_sta_info, which is called from rtw_joinbss_event_prehandle. Unfortunately, rtw_joinbss_event_prehandle does not call update_sta_info. It is possible that there has been memory corruption. I ran most of the code through smatch, which will catch errors such as under-sized arrays, etc. I fixed and pushed a few minor fixes. None of them should matter.
Hi @lwfinger. Thanks for the response! I believe from the original issue submitter that it did not affect normal or PREEMPT kernels, only the fully real-time PREEMPT_RT kernel, which I am also running. I can try a normal kernel build as well, given some time.
I'll give the minor fixes you pushed a go. Odd that the stack trace shows that call jump...
Complete side question: does your rtw88 driver support 8821CS yet?
First of all, it is NOT my driver. I am taking the kernel code from Realtek, porting it into a stand-alone external driver, and modifying it to build with older kernels. To my knowledge, rtw88 does not support any SDIO versions. There is code for some USB versions, but nothing in the kernel.
Apologies, slip of the tongue, grateful for the support!
Reliance on supplier drivers for the 8821cs seems like the only way forward for now.
Will continue to try to debug this issue on the 8723 for RT.
Tracing this using printk's:
[ 45.290054] RTW: bssrate_len = 12
[ 45.294072] RTW: OnAuthClient
[ 45.305809] RTW: OnAssocRsp
[ 45.305855] RTW: report_join_res(7)
[ 45.305866] RTW: rtw_joinbss_update_network
[ 45.305881] RTW: +rtw_update_ht_cap()
[ 45.305902] RTW: rtw_alloc_macid(wlan0) if1, hwaddr:86:83:c2:b1:c3:7c macid:0
[ 45.305913] RTW: rtw_joinbss_update_stainfo
[ 45.305916] RTW: update_sta_info
Looks like an issue with the critical-section entry using spinlocks at the bottom of the update_sta_info function, which is called from rtw_joinbss_update_stainfo. It appears to continue, with the other spinlocks still in use, if I comment out these critical-section entry/exit commands.
Thanks for the flow information. It might be dangerous to eliminate that locking, but I think we might replace that _enter_critical_bh(), which calls spin_lock_bh(), with _enter_critical_ex(), which calls spin_lock_irqsave(). I pushed this change.
No problem! I applied the change but unfortunately got the same kernel OOPS. I will double check that there wasn't an error in my build system.
I did separately have some success with dropping in raw_spinlock_t instead of spinlock_t (RT PREEMPT kernels do not automatically translate spinlock_t to raw_spinlock_t). I will need to figure out the implications, but the attached patch file fixes all the kernel OOPS issues for me and allows Wi-Fi association and DHCP startup.
Could you check the .config used to generate your kernel? In particular, does it have a line CONFIG_PREEMPT_RT=y? If so, I know how to modify the code to work with regular, or RT kernels.
CONFIG_HAVE_PREEMPT_LAZY=y
CONFIG_PREEMPT_LAZY=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_RT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_PREEMPTION=y
Thanks. I just pushed a change that will let the code determine which form of spinlock to use depending on whether CONFIG_PREEMPT_RT is defined. Thus one version of the source should work on both kinds of systems.
Awesome, thanks, glad we could help each other! Have you pushed your change to the remote? I can't see any more commits on GitHub.
Yes, the code was pushed. The latest commit message should be:
commit 07152ef10d8491ac04d8d29eb5df459a87b34e81 (HEAD -> v5.2.2.4, origin/v5.2.2.4)
Author: Larry Finger <Larry.Finger@lwfinger.net>
Date: Tue Mar 23 19:32:46 2021 -0500
rtl8188eu: Use CONFIG_PREEMPT_RT to select raw spin lock form for real-time kernels
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Oh that is where we are confused! This PREEMPT_RT question is for rtl8723ds - I can port across from 8188eu if needed.
Now I got the right driver. Sorry about that.
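The compile-time selection discussed in this exchange (using the raw spin lock form only when CONFIG_PREEMPT_RT is defined) could be sketched roughly as below. This is not the driver's actual code: the lock types and lock functions are user-space stand-ins so the snippet builds outside the kernel, and `rtw_lock_t`, `rtw_lock`, `rtw_unlock`, `RTW_LOCK_KIND` and `rtw_demo_critical_section` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <string.h>

/* User-space stand-ins for the kernel types; the real ones come from
 * <linux/spinlock.h> and behave very differently. */
typedef struct { int held; } spinlock_t;
typedef struct { int held; } raw_spinlock_t;

/* Stand-in lock operations that only track the held flag. */
void spin_lock(spinlock_t *l)           { assert(!l->held); l->held = 1; }
void spin_unlock(spinlock_t *l)         { assert(l->held);  l->held = 0; }
void raw_spin_lock(raw_spinlock_t *l)   { assert(!l->held); l->held = 1; }
void raw_spin_unlock(raw_spinlock_t *l) { assert(l->held);  l->held = 0; }

/* Build with -DCONFIG_PREEMPT_RT to mimic a kernel configured with
 * CONFIG_PREEMPT_RT=y. All rtw_* names below are hypothetical. */
#ifdef CONFIG_PREEMPT_RT
typedef raw_spinlock_t rtw_lock_t;
#define rtw_lock(l)    raw_spin_lock(l)
#define rtw_unlock(l)  raw_spin_unlock(l)
#define RTW_LOCK_KIND  "raw_spinlock_t"
#else
typedef spinlock_t rtw_lock_t;
#define rtw_lock(l)    spin_lock(l)
#define rtw_unlock(l)  spin_unlock(l)
#define RTW_LOCK_KIND  "spinlock_t"
#endif

/* Callers use only the rtw_* wrappers, so the same source builds for
 * both kernel flavours. Returns the updated counter value. */
int rtw_demo_critical_section(rtw_lock_t *lock, int *counter)
{
    assert(lock != NULL && counter != NULL);
    rtw_lock(lock);
    ++*counter;
    rtw_unlock(lock);
    return *counter;
}
```

The point of the wrapper layer is that call sites never name a concrete lock type, so only the `#ifdef` block needs to know about the kernel configuration.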
I saw failures with the same stack trace on Linux v6.1-rc7-rt5, but there the exception logging makes it clearer that the problem is an alignment exception. #30 fixes the problem for me (also on an aarch64 platform).
Good. That patch is merged.
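The alignment exception mentioned above can also be read out of the ESR value printed in the 5.10 traces. The sketch below decodes `ESR = 0x96000021` using the AArch64 ESR_ELx layout for data aborts (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, with WnR in ISS bit 6 and the fault status code in ISS bits 5:0) and checks the alignment of the faulting address. The struct and function names are hypothetical helpers, and the conclusion that the fault comes from a 64-bit atomic access to a lock that is only 4-byte aligned is an inference from the logs, not something the thread states outright.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define EC_DABT_CUR_EL  0x25u  /* data abort taken from the current EL */
#define DFSC_ALIGNMENT  0x21u  /* data fault status code: alignment fault */

/* Decoded fields of an ESR_ELx value for a data abort (hypothetical helper type). */
struct esr_fields {
    unsigned ec;    /* exception class, ESR[31:26] */
    unsigned il;    /* instruction length bit, ESR[25] */
    uint32_t iss;   /* instruction specific syndrome, ESR[24:0] */
    unsigned wnr;   /* write-not-read, ISS[6] */
    unsigned dfsc;  /* data fault status code, ISS[5:0] */
};

struct esr_fields esr_decode(uint32_t esr)
{
    struct esr_fields f;
    f.ec   = (esr >> 26) & 0x3fu;
    f.il   = (esr >> 25) & 0x1u;
    f.iss  = esr & 0x1ffffffu;
    f.wnr  = (f.iss >> 6) & 0x1u;
    f.dfsc = f.iss & 0x3fu;
    return f;
}

/* Returns 1 if addr is a multiple of size; size must be a power of two. */
int is_aligned(uint64_t addr, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}

/* Prints a short summary for an ESR value and its faulting address. */
void esr_print(uint32_t esr, uint64_t far)
{
    struct esr_fields f = esr_decode(esr);
    printf("EC=0x%02x IL=%u ISS=0x%07" PRIx32 " WnR=%u DFSC=0x%02x%s\n",
           f.ec, f.il, f.iss, f.wnr, f.dfsc,
           f.dfsc == DFSC_ALIGNMENT ? " (alignment fault)" : "");
    printf("FAR=0x%016" PRIx64 " 8-byte aligned: %s\n",
           far, is_aligned(far, 8) ? "yes" : "no");
}
```

Calling `esr_print(0x96000021u, 0xffff80001136700cULL)` with the values from the first trace classifies the abort as an alignment fault on a read, at an address that is 4-byte but not 8-byte aligned. The second trace reports the same ESR, and its faulting address ffff8000129bf01c has the same alignment.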
Hello everyone
I tried your driver successfully with Preemptible Kernel 5.9 with no problems. Everything works fine.
If I build with "Fully Preemptible Kernel (Real-Time)", I get a kernel panic after the Wi-Fi association.
Has anyone had experience using this driver with an RT kernel?
[ 12.329412] RTW: wlan0- hw port(0) mac_addr =00:e0:4c:c4:61:95
[ 14.260075] RTW: rtw_set_802_11_connect(wlan0) fw_state=0x00000008
[ 14.718594] RTW: start auth
[ 14.723620] RTW: auth success, start assoc
[ 14.731559] Unable to handle kernel paging request at virtual address ffff8000129bf01c
[ 14.732995] Mem abort info:
[ 14.733260] ESR = 0x96000021
[ 14.733547] EC = 0x25: DABT (current EL), IL = 32 bits
[ 14.734036] SET = 0, FnV = 0
[ 14.734322] EA = 0, S1PTW = 0
[ 14.734617] Data abort info:
[ 14.734883] ISV = 0, ISS = 0x00000021
[ 14.735235] CM = 0, WnR = 0
[ 14.735517] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000001680000
[ 14.736130] [ffff8000129bf01c] pgd=000000001ffff003, p4d=000000001ffff003, pud=000000001fffe003, pmd=000000000395c003, pte=0068000006e86703
[ 14.737340] Internal error: Oops: 96000021 [#1] PREEMPT_RT SMP
[ 14.737885] Modules linked in: realtek 8723ds dwmac_rk stmmac_platform snd_soc_simple_card snd_soc_pcm5102a snd_soc_rk3308 cfg80211 snd_soc_rockchip_i2s_tdm stmmac snd_soc_simple_card_utils pcs_xpcs rfkill snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore cpufreq_dt ip_tables x_tables autofs4
[ 14.740483] CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 5.10.1-rt19 #2
[ 14.741096] Hardware name: Radxa ROCK Pi S (DT)
[ 14.741519] Workqueue: events sdio_irq_work
[ 14.741939] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[ 14.742495] pc : rt_spin_unlock+0xa8/0x130
[ 14.742892] lr : update_sta_info+0x80/0x144 [8723ds]
[ 14.744129] sp : ffff800011a1b490
[ 14.744438] x29: ffff800011a1b490 x28: ffff8000115ca000
[ 14.744944] x27: 000000000007a120 x26: ffff8000129bf004
[ 14.745441] x25: ffff800011dc2da0 x24: ffff800011dc10e8
[ 14.745943] x23: ffff800012955b28 x22: ffff000006cac434
[ 14.746443] x21: ffff800011dc1030 x20: ffff800011dc1000
[ 14.746943] x19: ffff8000129bf004 x18: ffffffffffffffff
[ 14.747443] x17: 0000000000000007 x16: 0000000000000001
[ 14.747942] x15: 000002bcf731a530 x14: 0000000000000000
[ 14.748440] x13: ffff000000000000 x12: ffffffffffffffff
[ 14.748940] x11: 0000000000000001 x10: 0000000000000000
[ 14.749438] x9 : 0000000000000000 x8 : 0000000000000000
[ 14.749938] x7 : 0000000000000000 x6 : ffff8000129bfe04
[ 14.750435] x5 : ffff8000129bf01c x4 : ffff80000eadd000
[ 14.750935] x3 : ffff8000129bf004 x2 : ffff000002374b00
[ 14.751435] x1 : 0000000000000000 x0 : ffff8000129bf004
[ 14.751937] Call trace:
[ 14.752173] rt_spin_unlock+0xa8/0x130
[ 14.752546] update_sta_info+0x80/0x144 [8723ds]
[ 14.753669] rtw_joinbss_event_prehandle+0x36c/0x620 [8723ds]
[ 14.754883] report_join_res+0xbc/0x11c [8723ds]
[ 14.755993] OnAssocRsp+0x114/0x2b0 [8723ds]
[ 14.757069] _mgt_dispatcher+0x88/0xe4 [8723ds]
[ 14.758169] mgt_dispatcher+0x134/0x260 [8723ds]
[ 14.759282] validate_recv_mgnt_frame+0x7c/0x154 [8723ds]
[ 14.760454] validate_recv_frame+0x36c/0x3f8 [8723ds]
[ 14.761595] recv_func_prehandle+0x44/0x8c [8723ds]
[ 14.762726] recv_func+0x34/0x19c [8723ds]
[ 14.763795] rtw_recv_entry+0x20/0x5c [8723ds]
[ 14.764898] rtl8723ds_recv_tasklet+0x2d4/0x380 [8723ds]
[ 14.766085] tasklet_action_common.isra.23+0x138/0x168
[ 14.766583] tasklet_action+0x38/0x48
[ 14.766940] efi_header_end+0x138/0x3e8
[ 14.767314] __local_bh_enable_ip+0x1a0/0x1b8
[ 14.767732] dw_mci_request+0x98/0x120
[ 14.768106] __mmc_start_request+0x7c/0x200
[ 14.768513] mmc_start_request+0x94/0xc0
[ 14.768894] mmc_wait_for_req+0x74/0xf8
[ 14.769264] mmc_wait_for_cmd+0x6c/0xa0
[ 14.769637] mmc_io_rw_direct_host+0x90/0x140
[ 14.770057] mmc_io_rw_direct+0x14/0x20
[ 14.770428] sdio_readb+0x48/0xa0
[ 14.770755] _sd_cmd52_read+0x7c/0x11c [8723ds]
[ 14.771880] sd_cmd52_read+0x6c/0xc0 [8723ds]
[ 14.772978] SdioLocalCmd52Read1Byte+0x40/0x68 [8723ds]
[ 14.774142] ReadInterrupt8723DSdio+0x6c/0xb8 [8723ds]
[ 14.775293] sd_int_dpc+0x298/0x50c [8723ds]
[ 14.776381] sd_int_hdl+0x84/0xb0 [8723ds]
[ 14.777454] sd_sync_int_hdl+0x30/0x78 [8723ds]
[ 14.778562] process_sdio_pending_irqs+0x60/0x1f0
[ 14.779024] sdio_irq_work+0x50/0x80
[ 14.779375] process_one_work+0x1ec/0x4f8
[ 14.779772] worker_thread+0x44/0x478
[ 14.780134] kthread+0x168/0x188
[ 14.780456] ret_from_fork+0x10/0x34
[ 14.780827] Code: c8047ca2 35ffff84 17ffff35 f98000b1 (c85ffca0)
[ 14.781400] ---[ end trace 0000000000000002 ]---
[ 14.782324] Kernel panic - not syncing:
[ 14.782679] Oops: Fatal exception in interrupt
[ 14.783163] SMP: stopping secondary CPUs
[ 14.783955] Kernel Offset: disabled
[ 14.784276] CPU features: 0x0040002,20002000
[ 14.784676] Memory Limit: none
[ 14.784970] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---