openwrt / openwrt


FS#3459 - 802.11s mesh kernel trace at net/core/flow_dissector.c:958 __skb_flow_dissect+0x2c0/0x149c #8328

Open openwrt-bot opened 3 years ago

openwrt-bot commented 3 years ago

mcpratt:

To reproduce the problem:

  1. use build from master branch at kernel version 5.4.72

  2. set wireless interface to 802.11s and establish a connection to another mesh point

The link is still good after this, but the impact on performance is unknown.

[ 44.133721] ------------[ cut here ]------------
[ 44.138481] WARNING: CPU: 0 PID: 0 at net/core/flow_dissector.c:958 __skb_flow_dissect+0x2c0/0x149c
[ 44.147605] Modules linked in: ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw ath xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE xt_FLOWOFFLOAD pppox ppp_generic nf_nat nf_flow_table_hw nf_flow_table nf_conntrack_rtcache nf_conntrack mac80211 ipt_REJECT cfg80211 xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG slhc nf_reject_ipv4 nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_filter ip_tables crc_ccitt compat nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 gpio_button_hotplug
[ 44.201117] CPU: 0 PID: 0 Comm: swapper Not tainted 5.4.72 #0
[ 44.206914] Stack : 80650000 805ee988 00000000 00000000 805edb58 81c0bc3c 806240fc 80623ce3
[ 44.215337]         80590b9c 00000000 807832d8 81788d44 80d0f418 00000001 81c0bbf0 2470403f
[ 44.223761]         00000000 00000000 807b0000 000000c4 61696e74 00000000 2e342e37 32202330
[ 44.232191]         000000c4 71e00000 00000000 0003119d 00000000 00000009 00000000 8039f3b0
[ 44.240618]         00000009 81788d44 80d0f418 80620000 00000000 802feaa4 00000000 80780000
[ 44.249040]         ...
[ 44.251515] Call Trace:
[ 44.254014] [<80069934>] show_stack+0x30/0x100
[ 44.258510] [<80082564>] __warn+0xc0/0x10c
[ 44.262647] [<8008260c>] warn_slowpath_fmt+0x5c/0xac
[ 44.267666] [<8039f3b0>] __skb_flow_dissect+0x2c0/0x149c
[ 44.273027] [<803a0870>] __skb_get_hash+0x7c/0x284
[ 44.278026] [<81731f08>] ieee80211_reserve_tid+0x4f0/0x1188 [mac80211]
[ 44.284682] ---[ end trace 4da33b10a9de4dac ]---

openwrt-bot commented 3 years ago

msvamp:

I can confirm this issue on OpenWrt r15172-af07c6de9c (snapshot) running on netgear-r6220

[ 31.722472] ------------[ cut here ]------------
[ 31.731744] WARNING: CPU: 0 PID: 792 at net/core/flow_dissector.c:958 __skb_flow_dissect+0x2d4/0x16c4
[ 31.750167] Modules linked in: pppoe ppp_async iptable_nat xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE xt_FLOWOFFLOAD xt_CT pppox ppp_generic nf_nat nf_flow_table_hw nf_flow_table nf_conntrack_rtcache nf_conntrack mt76x2e mt76x2_common mt76x02_lib mt7603e mt76 mac80211 ipt_REJECT cfg80211 xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG slhc nf_reject_ipv4 nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_filter ip_tables crc_ccitt compat ledtrig_usbport nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 leds_gpio xhci_plat_hcd xhci_pci xhci_mtk xhci_hcd gpio_button_hotplug usbcore nls_base usb_common
[ 31.876229] CPU: 0 PID: 792 Comm: kworker/u5:1 Not tainted 5.4.81 #0
[ 31.888888] Workqueue: napi_workq napi_workfn
[ 31.897545] Stack : 8065bacc 86ca7a4c 806e0000 80720000 86ca5580 8066d684 8041b858 00000009
[ 31.914160]         00000000 86d4b618 00000000 8007d81c 00000000 00000001 86ca7a08 d517992c
[ 31.930772]         00000000 00000000 00000000 00000000 716b726f 00000147 6f775f69 6e666b72
[ 31.947384]         00000000 00000001 00000000 000d9038 00000000 80740000 00000000 8041b858
[ 31.963996]         00000009 00000000 86d4b618 00000000 00000000 803525e0 00000000 80880000
[ 31.980608]         ...
[ 31.985459] Call Trace:
[ 31.990332] [<8000b72c>] show_stack+0x30/0x100
[ 31.999182] [<805adc60>] dump_stack+0xa4/0xdc
[ 32.007857] [<8002c014>] __warn+0xc0/0x10c
[ 32.016000] [<8002c0bc>] warn_slowpath_fmt+0x5c/0xac
[ 32.025895] [<8041b858>] __skb_flow_dissect+0x2d4/0x16c4
[ 32.036454] [<8041cef8>] __skb_get_hash+0x7c/0x258
[ 32.046117] [<86f32f84>] ieee80211_reserve_tid+0xf14/0x1408 [mac80211]
[ 32.059541] ---[ end trace c6cce9f22f1fe70f ]---

The mesh link works normally despite this.

root@OpenWrt:~# uname -a
Linux OpenWrt 5.4.81 #0 SMP Tue Dec 8 22:45:10 2020 mips GNU/Linux

openwrt-bot commented 3 years ago

jonozzz:

Still seen on:

Linux version 5.4.113 (user@4d196b72ce70) (gcc version 8.4.0 (OpenWrt GCC 8.4.0 r13314-5ff4b0d024)) #0 SMP Wed Apr 21 09:31:10 2021

https://github.com/torvalds/linux/blob/master/net/core/flow_dissector.c#L984

Seems to be just a one-time warning when net is NULL.

https://github.com/torvalds/linux/commit/9b52e3f267a6835efd50ed9002d530666d16a411
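
To make the "one-time warning when net is NULL" behaviour concrete, here is a minimal user-space sketch of how such a lookup could behave. It is not the kernel code: the struct layouts, `resolve_net()`, `warn_on_once()` and `warn_count` are hypothetical stand-ins invented for illustration, and the lookup order (explicit net, then the skb's device, then its socket) is an assumption based on the discussion in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for kernel types; NOT the real definitions. */
struct net { int id; };
struct net_device { struct net *nd_net; };
struct sock { struct net *sk_net; };
struct sk_buff { struct net_device *dev; struct sock *sk; };

/* Counts emitted warnings so the one-shot behaviour can be observed. */
static int warn_count;

/* Rough analogue of a warn-once macro: report only the first occurrence. */
static bool warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Hypothetical namespace lookup: use the explicit net if given,
 * otherwise fall back to the skb's device, then to its socket.
 * If nothing is found, warn once and return NULL.
 */
static struct net *resolve_net(const struct sk_buff *skb, struct net *net)
{
	if (skb && !net) {
		if (skb->dev)
			net = skb->dev->nd_net;
		else if (skb->sk)
			net = skb->sk->sk_net;
	}
	warn_on_once(net == NULL, "no net namespace found for skb");
	return net;
}
```

In this model, a buffer with neither a device nor a socket yields NULL on every call, but the warning is printed only the first time, which matches the single trace per boot reported above.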

openwrt-bot commented 3 years ago

jonozzz:

From the author:

Most likely because the ath10k driver (presumably the one that is doing mesh) is using dev_alloc_skb:

$ grep -ri dev_alloc_skb drivers/net/wireless/ath/ath10k
drivers/net/wireless/ath/ath10k/htc.c: skb = dev_alloc_skb(ATH10K_HTC_CONTROL_BUFFER_SIZE);
drivers/net/wireless/ath/ath10k/htc.c: bundle_skb = dev_alloc_skb(bundles_left);
drivers/net/wireless/ath/ath10k/htc.c: bundle_skb = dev_alloc_skb(bundles_left);
drivers/net/wireless/ath/ath10k/htc.c: skb = dev_alloc_skb(size + sizeof(struct ath10k_htc_hdr));
drivers/net/wireless/ath/ath10k/sdio.c: pkt->skb = dev_alloc_skb(full_len);
drivers/net/wireless/ath/ath10k/sdio.c: skb = dev_alloc_skb(sizeof(*regs));
drivers/net/wireless/ath/ath10k/htt_rx.c: skb = dev_alloc_skb(HTT_RX_BUF_SIZE + HTT_RX_DESC_ALIGN);
drivers/net/wireless/ath/ath10k/snoc.c: skb = dev_alloc_skb(pipe->buf_sz);
drivers/net/wireless/ath/ath10k/pci.c: skb = dev_alloc_skb(pipe->buf_sz);
drivers/net/wireless/ath/ath10k/usb.c: urb_context->skb = dev_alloc_skb(ATH10K_USB_RX_BUFFER_SIZE);

dev_alloc_skb doesn't properly set skb's net device (and net namespace):

static inline struct sk_buff *netdev_alloc_skb(struct net_device *dev, unsigned int length)
...

/* legacy helper around netdev_alloc_skb() */
static inline struct sk_buff *dev_alloc_skb(unsigned int length)
{
	return netdev_alloc_skb(NULL, length);
}

The flow dissector expects this net device so that it can find the net namespace associated with the skb. There shouldn't really be any issues unless you want to use the BPF flow dissector, so it's safe to ignore this warning.

If you'd like to fix it properly, you have to replace these dev_alloc_skb calls with netdev_alloc_skb and pass the proper net_device.
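
The difference between the two allocators can be illustrated with a small user-space mock. This is only a sketch under stated assumptions: `mock_netdev_alloc_skb()`, `mock_dev_alloc_skb()`, `mock_free_skb()` and the trimmed-down structs are hypothetical names made up here, not kernel APIs, and a real driver change would need the driver's actual net_device pointer at each call site.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins; the real kernel structures are far larger. */
struct net_device { const char *name; };
struct sk_buff {
	struct net_device *dev; /* device the buffer belongs to, or NULL */
	unsigned int size;      /* bytes reserved for packet data */
	unsigned char *data;
};

/* Mock of the device-aware allocator: records the owning device. */
static struct sk_buff *mock_netdev_alloc_skb(struct net_device *dev,
					     unsigned int length)
{
	struct sk_buff *skb = calloc(1, sizeof(*skb));

	if (!skb)
		return NULL;
	skb->data = malloc(length ? length : 1);
	if (!skb->data) {
		free(skb);
		return NULL;
	}
	skb->dev = dev;
	skb->size = length;
	return skb;
}

/* Mock of the legacy helper: same allocation, but no device attached. */
static struct sk_buff *mock_dev_alloc_skb(unsigned int length)
{
	return mock_netdev_alloc_skb(NULL, length);
}

/* Release a buffer created by either mock allocator. */
static void mock_free_skb(struct sk_buff *skb)
{
	if (skb) {
		free(skb->data);
		free(skb);
	}
}
```

A buffer from `mock_dev_alloc_skb(64)` has `dev == NULL`, so a later namespace lookup through the device has nothing to follow, while `mock_netdev_alloc_skb(&wlan, 64)` keeps the association.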