greearb / ath10k-ct

Stand-alone ath10k driver based on Candela Technologies Linux kernel.

Stacktrace from OpenWRT 5.4.35 with ath10k-ct-smallbuffers build #128

Closed Fail-Safe closed 4 years ago

Fail-Safe commented 4 years ago

Not sure if this is related to firmware or not. If this looks like a driver issue I will take it up with the OpenWRT forums.

Description of the problem (how to configure, how to reproduce, how often it happens)
I am seeing an occasional stack trace in my kernel logs. My Netgear R7800 recovers, but WiFi connectivity drops for several seconds (perhaps 3-4).

Software (OS, Firmware version, kernel, driver, etc)
OS: OpenWrt SNAPSHOT r13097-1c008b61bd / LuCI Master git-20.114.19431-d0518a1
Kernel: 5.4.35
Firmware version: firmware-5-ct-htt-mgt-community.bin 04-24-20 release
Driver: ath10k-ct-smallbuffers driver + patch as proposed by Dave Taht here

Hardware (NIC chipset, platform, etc)
Netgear Nighthawk X4S R7800, QCA9984, arm_cortex-a15_neon-vfpv4

Logs (dmesg, maybe supplicant and/or hostap)

[ 6222.345161] ------------[ cut here ]------------
[ 6222.345244] WARNING: CPU: 0 PID: 0 at target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/ath10k-ct-smallbuffers/ath10k-ct-2020-03-25-3d173a47/ath10k-5.4/txrx.c:134 ath10k_txrx_tx_unref+0x574/0x738 [ath10k_core]
[ 6222.348862] Invalid VHT rate, nss: 3  hw_rate: 15 ratecode: 255
[ 6222.368746] Modules linked in: iptable_nat ath10k_pci ath10k_core ath xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE xt_FLOWOFFLOAD nf_nat nf_flow_table_hw nf_flow_table nf_conntrack_rtcache nf_conntrack mac80211 iptable_mangle iptable_filter ipt_REJECT ip_tables cfg80211 xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG x_tables nf_reject_ipv4 nf_log_ipv4 nf_log_common nf_defrag_ipv6 nf_defrag_ipv4 crc_ccitt compat ledtrig_usbport ledtrig_heartbeat msdos vfat fat hfsplus nls_utf8 nls_iso8859_1 nls_cp437 usb_storage leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom ohci_platform ohci_hcd phy_qcom_dwc3 ahci fsl_mph_dr_of ehci_platform ehci_fsl sd_mod ahci_platform libahci_platform libahci libata scsi_mod ehci_hcd gpio_button_hotplug ext4 mbcache jbd2 exfat(C) crc32c_generic
[ 6222.424724] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C        5.4.35 #0
[ 6222.446928] Hardware name: Generic DT based system
[ 6222.454230] [<c030f954>] (unwind_backtrace) from [<c030b96c>] (show_stack+0x14/0x20)
[ 6222.459093] [<c030b96c>] (show_stack) from [<c08a7e60>] (dump_stack+0x94/0xa8)
[ 6222.466994] [<c08a7e60>] (dump_stack) from [<c031e7c0>] (__warn+0xb4/0xd0)
[ 6222.474018] [<c031e7c0>] (__warn) from [<c031e85c>] (warn_slowpath_fmt+0x80/0x90)
[ 6222.480896] [<c031e85c>] (warn_slowpath_fmt) from [<bf427718>] (ath10k_txrx_tx_unref+0x574/0x738 [ath10k_core])
[ 6222.488496] [<bf427718>] (ath10k_txrx_tx_unref [ath10k_core]) from [<bf4217d4>] (ath10k_htt_t2h_msg_handler+0xe08/0x11dc [ath10k_core])
[ 6222.498355] [<bf4217d4>] (ath10k_htt_t2h_msg_handler [ath10k_core]) from [<bf479768>] (ath10k_pci_htt_rx_cb+0x178/0x230 [ath10k_pci])
[ 6222.510512] [<bf479768>] (ath10k_pci_htt_rx_cb [ath10k_pci]) from [<bf441164>] (ath10k_ce_per_engine_service+0x9c/0x10c [ath10k_core])
[ 6222.522674] [<bf441164>] (ath10k_ce_per_engine_service [ath10k_core]) from [<bf441254>] (ath10k_ce_per_engine_service_any+0x80/0xd8 [ath10k_core])
[ 6222.534637] [<bf441254>] (ath10k_ce_per_engine_service_any [ath10k_core]) from [<bf47b104>] (ath10k_pci_napi_poll+0x54/0x15c [ath10k_pci])
[ 6222.547730] [<bf47b104>] (ath10k_pci_napi_poll [ath10k_pci]) from [<c07668f0>] (net_rx_action+0x118/0x374)
[ 6222.560134] [<c07668f0>] (net_rx_action) from [<c0302298>] (__do_softirq+0x130/0x2d4)
[ 6222.569767] [<c0302298>] (__do_softirq) from [<c0322ba4>] (irq_exit+0xbc/0xe0)
[ 6222.577663] [<c0322ba4>] (irq_exit) from [<c036cea8>] (__handle_domain_irq+0x6c/0xd0)
[ 6222.584789] [<c036cea8>] (__handle_domain_irq) from [<c05ba0dc>] (gic_handle_irq+0x5c/0xb8)
[ 6222.592683] [<c05ba0dc>] (gic_handle_irq) from [<c0301a8c>] (__irq_svc+0x6c/0x90)
[ 6222.600838] Exception stack(0xc0c01ee0 to 0xc0c01f28)
[ 6222.608483] 1ee0: 00000000 000005a8 1ce53000 dd991a00 dcc29400 00000000 dd990df0 000005a8
[ 6222.613521] 1f00: 000005a8 00000000 c0abe780 c09cf680 00000015 c0c01f30 c0709f50 c0709f54
[ 6222.621673] 1f20: a0000013 ffffffff
[ 6222.629831] [<c0301a8c>] (__irq_svc) from [<c0709f54>] (cpuidle_enter_state+0x94/0x498)
[ 6222.633133] [<c0709f54>] (cpuidle_enter_state) from [<c070a39c>] (cpuidle_enter+0x30/0x4c)
[ 6222.641120] [<c070a39c>] (cpuidle_enter) from [<c034a5d4>] (do_idle+0x1d8/0x240)
[ 6222.649453] [<c034a5d4>] (do_idle) from [<c034a8e4>] (cpu_startup_entry+0x1c/0x20)
[ 6222.657008] [<c034a8e4>] (cpu_startup_entry) from [<c0b00e5c>] (start_kernel+0x4dc/0x4e8)
[ 6222.664447] ---[ end trace 7c57890e0a9f0d77 ]---
[ 6222.673000] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 1, skipped old beacon
[ 6222.677355] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 2, skipped old beacon
[ 6222.684425] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon
[ 6222.692648] ath10k_pci 0001:01:00.0: SWBA overrun on vdev 0, skipped old beacon
[ 6222.699002] ath10k_pci 0001:01:00.0: SWBA overrun on vdev 1, skipped old beacon
[ 6222.706315] ath10k_pci 0001:01:00.0: SWBA overrun on vdev 2, skipped old beacon
[ 6222.713591] ath10k_pci 0001:01:00.0: SWBA overrun on vdev 0, skipped old beacon
[61594.521094] ------------[ cut here ]------------
[61594.521200] WARNING: CPU: 0 PID: 0 at backports-5.4.27-1/net/mac80211/sta_info.c:1938 ieee80211_sta_update_pending_airtime+0x200/0x204 [mac80211]
[61594.524847] STA 44:61:32:82:ec:e6 AC 2 txq pending airtime underflow: 4294966724, 572
[61594.524849] Modules linked in: iptable_nat ath10k_pci ath10k_core ath xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE xt_FLOWOFFLOAD nf_nat nf_flow_table_hw nf_flow_table nf_conntrack_rtcache nf_conntrack mac80211 iptable_mangle iptable_filter ipt_REJECT ip_tables cfg80211 xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG x_tables nf_reject_ipv4 nf_log_ipv4 nf_log_common nf_defrag_ipv6 nf_defrag_ipv4 crc_ccitt compat ledtrig_usbport ledtrig_heartbeat msdos vfat fat hfsplus nls_utf8 nls_iso8859_1 nls_cp437 usb_storage leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom ohci_platform ohci_hcd phy_qcom_dwc3 ahci fsl_mph_dr_of ehci_platform ehci_fsl sd_mod ahci_platform libahci_platform libahci libata scsi_mod ehci_hcd gpio_button_hotplug ext4 mbcache jbd2 exfat(C) crc32c_generic
[61594.595772] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        WC        5.4.35 #0
[61594.617914] Hardware name: Generic DT based system
[61594.625216] [<c030f954>] (unwind_backtrace) from [<c030b96c>] (show_stack+0x14/0x20)
[61594.630079] [<c030b96c>] (show_stack) from [<c08a7e60>] (dump_stack+0x94/0xa8)
[61594.637979] [<c08a7e60>] (dump_stack) from [<c031e7c0>] (__warn+0xb4/0xd0)
[61594.645003] [<c031e7c0>] (__warn) from [<c031e85c>] (warn_slowpath_fmt+0x80/0x90)
[61594.651904] [<c031e85c>] (warn_slowpath_fmt) from [<bf2f7754>] (ieee80211_sta_update_pending_airtime+0x200/0x204 [mac80211])
[61594.659505] [<bf2f7754>] (ieee80211_sta_update_pending_airtime [mac80211]) from [<bf2f1fc0>] (ieee80211_report_low_ack+0x22c/0x4e4 [mac80211])
[61594.670770] [<bf2f1fc0>] (ieee80211_report_low_ack [mac80211]) from [<bf2f228c>] (ieee80211_free_txskb+0x14/0x2c [mac80211])
[61594.683371] [<bf2f228c>] (ieee80211_free_txskb [mac80211]) from [<bf4277ac>] (ath10k_txrx_tx_unref+0x608/0x738 [ath10k_core])
[61594.694702] [<bf4277ac>] (ath10k_txrx_tx_unref [ath10k_core]) from [<bf4217d4>] (ath10k_htt_t2h_msg_handler+0xe08/0x11dc [ath10k_core])
[61594.705886] [<bf4217d4>] (ath10k_htt_t2h_msg_handler [ath10k_core]) from [<bf479768>] (ath10k_pci_htt_rx_cb+0x178/0x230 [ath10k_pci])
[61594.717860] [<bf479768>] (ath10k_pci_htt_rx_cb [ath10k_pci]) from [<bf441164>] (ath10k_ce_per_engine_service+0x9c/0x10c [ath10k_core])
[61594.730029] [<bf441164>] (ath10k_ce_per_engine_service [ath10k_core]) from [<bf441254>] (ath10k_ce_per_engine_service_any+0x80/0xd8 [ath10k_core])
[61594.741993] [<bf441254>] (ath10k_ce_per_engine_service_any [ath10k_core]) from [<bf47b104>] (ath10k_pci_napi_poll+0x54/0x15c [ath10k_pci])
[61594.755086] [<bf47b104>] (ath10k_pci_napi_poll [ath10k_pci]) from [<c07668f0>] (net_rx_action+0x118/0x374)
[61594.767491] [<c07668f0>] (net_rx_action) from [<c0302298>] (__do_softirq+0x130/0x2d4)
[61594.777122] [<c0302298>] (__do_softirq) from [<c0322ba4>] (irq_exit+0xbc/0xe0)
[61594.785020] [<c0322ba4>] (irq_exit) from [<c036cea8>] (__handle_domain_irq+0x6c/0xd0)
[61594.792145] [<c036cea8>] (__handle_domain_irq) from [<c05ba0dc>] (gic_handle_irq+0x5c/0xb8)
[61594.800038] [<c05ba0dc>] (gic_handle_irq) from [<c0301a8c>] (__irq_svc+0x6c/0x90)
[61594.808193] Exception stack(0xc0c01ee0 to 0xc0c01f28)
[61594.815839] 1ee0: 00000000 00003805 1ce53000 dd991a00 dcc29400 00000000 dd990df0 00003805
[61594.820878] 1f00: 00003805 00000000 178bc0a0 178745c0 00000015 c0c01f30 c0709f50 c0709f54
[61594.829030] 1f20: 20000013 ffffffff
[61594.837187] [<c0301a8c>] (__irq_svc) from [<c0709f54>] (cpuidle_enter_state+0x94/0x498)
[61594.840491] [<c0709f54>] (cpuidle_enter_state) from [<c070a39c>] (cpuidle_enter+0x30/0x4c)
[61594.848476] [<c070a39c>] (cpuidle_enter) from [<c034a5d4>] (do_idle+0x1d8/0x240)
[61594.856808] [<c034a5d4>] (do_idle) from [<c034a8e4>] (cpu_startup_entry+0x1c/0x20)
[61594.864366] [<c034a8e4>] (cpu_startup_entry) from [<c0b00e5c>] (start_kernel+0x4dc/0x4e8)
[61594.871802] ---[ end trace 7c57890e0a9f0d78 ]---
[61594.880363] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon
[61594.884774] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 1, skipped old beacon
[61594.891712] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 2, skipped old beacon
[61594.899103] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon
[61594.906783] ath10k_pci 0001:01:00.0: SWBA overrun on vdev 1, skipped old beacon
[61594.913675] ath10k_pci 0001:01:00.0: SWBA overrun on vdev 2, skipped old beacon
[61594.920877] ath10k_pci 0001:01:00.0: SWBA overrun on vdev 0, skipped old beacon
[61594.928230] ath10k_pci 0001:01:00.0: SWBA overrun on vdev 1, skipped old beacon
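
One possible reading of the numbers in the second warning, offered as an assumption rather than a confirmed diagnosis: if the first value is a signed 32-bit pending-airtime counter printed as unsigned, then 4294966724 equals 2^32 - 572, which is what a counter of 0 looks like after 572 is subtracted from it. The function names below are hypothetical and are not mac80211 code.

```c
#include <stdint.h>

/* Hypothetical helper: subtract a completed frame's airtime from the
 * pending total. Assumes the pending counter is a signed 32-bit value. */
static int32_t pending_after_completion(int32_t pending, uint16_t tx_airtime)
{
    return pending - (int32_t)tx_airtime;
}

/* Hypothetical helper: show a signed 32-bit value the way an unsigned
 * format specifier would. The conversion wraps modulo 2^32. */
static uint32_t as_unsigned32(int32_t value)
{
    return (uint32_t)value;
}
```

Under that assumption, `as_unsigned32(pending_after_completion(0, 572))` gives 4294966724, matching the pair of values in the log line.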

ynezz commented 4 years ago

Possibly a duplicate of #117

greearb commented 4 years ago

This is a duplicate of several other reports. For one reason or another, we get a bad tx/rx rate from firmware. It is not a real problem; I just have not found time to tweak the driver to catch the -1 rate (0xff) and force it to zero instead. Maybe with a single one-time, one-line warning, since this is probably a minor firmware bug. Patches welcome...
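
A minimal sketch of the idea described above, not the actual ath10k-ct change: the 8-bit rate code 0xff (the "ratecode: 255" in the first warning) has the same bit pattern as -1 in a signed byte, and a small helper can map it to zero before the value is used further. The names `INVALID_RATE_CODE`, `sanitize_rate_code` and `rate_code_as_signed` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical name for the all-ones byte the firmware sometimes reports. */
#define INVALID_RATE_CODE 0xffu

/* Return 0 for the invalid code; pass every other value through unchanged. */
static uint8_t sanitize_rate_code(uint8_t rate_code)
{
    return rate_code == INVALID_RATE_CODE ? 0 : rate_code;
}

/* Reinterpret the byte as signed. On the usual two's-complement compilers
 * 0xff becomes -1, which is why the report calls it the "-1 rate". */
static int rate_code_as_signed(uint8_t rate_code)
{
    return (int8_t)rate_code;
}
```

Valid codes are left alone, so only the single bad value is rewritten.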

ynezz commented 4 years ago

@greearb so something like this?

diff --git a/ath10k-5.4/wmi.c b/ath10k-5.4/wmi.c
index 1aad6dec8eb3..4197025e9917 100644
--- a/ath10k-5.4/wmi.c
+++ b/ath10k-5.4/wmi.c
@@ -26,6 +26,8 @@
 #define ATH10K_WMI_BARRIER_TIMEOUT_HZ (3 * HZ)
 #define ATH10K_WMI_DFS_CONF_TIMEOUT_HZ (HZ / 6)

+#define ATH10K_WMI_TX_BEACON_INVALID_RATE_CODE 0xff
+
 const char* cck_speed_by_idx[] = {"1Mbps", "2Mbps", "5.5Mbps", "11Mbps" };

 /* MAIN WMI cmd track */
@@ -6221,6 +6223,10 @@ static void ath10k_wmi_event_beacon_tx(struct ath10k *ar, struct sk_buff *skb)
           status == 0 ? "OK" : (status == 1 ? "XRETRY" : (status == 2 ? "DROP" : "UNKNOWN")),
           ev->mpdus_tried, ev->mpdus_failed, ev->tx_rate_code, ev->tx_rate_flags, ev->tsFlags);

+   /* workaround for a possible firmware bug */
+   if (ev->tx_rate_code == ATH10K_WMI_TX_BEACON_INVALID_RATE_CODE)
+       ev->tx_rate_code = 0;
+
    arvif = ath10k_get_arvif(ar, vdev_id);
    if (!arvif) {
        ath10k_warn(ar, "wmi-event-beacon-tx, could not find vdev for id: %u\n",

greearb commented 4 years ago

There may be several places where the invalid rate code goes up the stack, but yes, in general that is what I was hoping for. Maybe add a static 'done_once' boolean and print one warning when that case is hit, so that you can look in the logs to see whether you have indeed caught it?
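
The 'done_once' suggestion could look roughly like the following user-space sketch. It is an illustration under assumptions, not the change that was eventually merged: `fixup_rate_code`, `INVALID_RATE_CODE` and the `warnings_printed` counter are hypothetical names, and `fprintf` stands in for the driver's own logging.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical name for the invalid rate code reported by firmware. */
#define INVALID_RATE_CODE 0xffu

/* Demonstration-only counter so the one-time behaviour can be observed. */
static unsigned int warnings_printed;

/* Replace the invalid rate code with 0 and warn the first time only. */
static uint8_t fixup_rate_code(uint8_t rate_code)
{
    static bool done_once; /* zero-initialised, keeps its value across calls */

    if (rate_code != INVALID_RATE_CODE)
        return rate_code;

    if (!done_once) {
        done_once = true;
        fprintf(stderr, "invalid rate code 0x%02x from firmware, using 0\n",
                (unsigned int)rate_code);
        warnings_printed++;
    }
    return 0;
}
```

Calling `fixup_rate_code(0xff)` repeatedly returns 0 each time but logs a single line, which keeps the log readable while still showing that the case was hit. Inside the kernel, the existing `pr_warn_once()` or `WARN_ONCE()` macros provide the same one-time behaviour without a hand-written flag.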

greearb commented 4 years ago

The fix has been pushed to the latest OpenWrt.