openwrt / mt76

mac80211 driver for MediaTek MT76x0e, MT76x2e, MT7603, MT7615, MT7628 and MT7688

mt76 crash sometimes #359

Closed ptpt52 closed 4 years ago

ptpt52 commented 4 years ago

This issue has persisted for over 6 months.

crash log sample 1:

<6>[   39.912086] pppoe-wan: renamed from ppp0
<6>[   46.716375] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
<6>[   46.729458] br-lan: port 2(wlan0) entered blocking state
<6>[   46.740136] br-lan: port 2(wlan0) entered forwarding state
<6>[ 5859.455636] mtk_soc_eth 1e100000.ethernet eth0: port 2 link up
<1>[22758.788827] CPU 1 Unable to handle kernel paging request at virtual address 07406000, epc == 80131830, ra == 801316e4
<4>[22758.810090] Oops[#1]:
<4>[22758.814650] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.169 #0
<4>[22758.826784] task: 87c2c240 task.stack: 87c54000
<4>[22758.835789] $ 0   : 00000000 00000001 00000000 81147630
<4>[22758.846200] $ 4   : 805d21fc 00000001 00000001 07406000
<4>[22758.856610] $ 8   : 000394c7 000394c6 00000000 00000001
<4>[22758.867017] $12   : 00000002 00000000 a39add2c 71d70000
<4>[22758.877425] $16   : 87c02a00 01090220 80770000 85200000
<4>[22758.887833] $20   : 00000018 00000800 867fcc00 01080020
<4>[22758.898242] $24   : a60fa8c0 00000000                  
<4>[22758.908651] $28   : 87c54000 87c0b9c8 00000000 801316e4
<4>[22758.919063] Hi    : 00002665
<4>[22758.924784] Lo    : 94af5487
<4>[22758.930526] epc   : 80131830 __kmalloc_track_caller+0x20c/0x290
<4>[22758.942304] ra    : 801316e4 __kmalloc_track_caller+0xc0/0x290
<4>[22758.953901] Status: 11007c03  KERNEL EXL IE 
<4>[22758.962233] Cause : 40800008 (ExcCode 02)
<4>[22758.970200] BadVA : 07406000
<4>[22758.975920] PrId  : 0001992f (MIPS 1004Kc)
<4>[22758.984058] Modules linked in: qcserial pppoe ppp_async option l2tp_ppp cdc_mbim usb_wwan sierra_net sierra rndis_host qmi_wwan pptp pppox ppp_mppe ppp_generic nf_nat_pptp nf_conntrack_pptp mt76x2e mt76x2_common mt76x02_lib mt7603e mt76 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE huawei_cdc_ncm cfg80211 cdc_ncm cdc_ether xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_socket xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_iprange xt_ipp2p xt_iface xt_hl xt_helper xt_hashlimit xt_esp xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_addrtype xt_TPROXY xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_IPMARK xt_HL xt_DSCP xt_CT xt_CLASSIFY wireguard usbserial usbnet usblp ts_fsm ts_bm slhc r8152 nft_set_rbtree
<4>[22759.124611]  nft_set_hash nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject_bridge nft_reject nft_redir nft_quota nft_objref nft_numgen nft_meta_bridge nft_meta nft_log nft_limit nft_hash nft_fwd_netdev nft_exthdr nft_dup_netdev nft_ct nft_counter nft_chain_route_ipv6 nft_chain_route_ipv4 nf_tables_netdev nf_tables_ipv6 nf_tables_ipv4 nf_tables_inet nf_tables_bridge nf_tables nf_socket_ipv6 nf_socket_ipv4 nf_reject_ipv4 nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_rtsp nf_nat_redirect nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4 nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_log_ipv4 nf_dup_netdev nf_defrag_ipv4 nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sip nf_conntrack_rtsp nf_conntrack_rtcache nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_irc
<4>[22759.266261]  nf_conntrack_h323 nf_conntrack_ftp nf_conntrack_broadcast ts_kmp nf_conntrack_amanda macvlan iptable_raw iptable_mangle iptable_filter ipt_ah ipt_ECN ipheth ip_tables crc_itu_t crc_ccitt compat_xtables compat cdc_wdm br_netfilter natflow natcap fuse sch_cake tcp_bbr sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_tcindex cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred ledtrig_usbport xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6t_NPT ip6t_MASQUERADE
<4>[22759.409326]  nf_nat_masquerade_ipv6 nf_nat nf_conntrack ip6t_rt ip6t_mh ip6t_ipv6header ip6t_hbh ip6t_frag ip6t_eui64 ip6t_ah nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 msdos ip6_gre ip_gre gre ifb sit l2tp_netlink l2tp_core udp_tunnel ip6_udp_tunnel ip6_tunnel tunnel6 tunnel4 ip_tunnel tun vfat fat autofs4 nls_utf8 nls_iso8859_1 nls_cp437 sha256_generic sha1_generic seqiv jitterentropy_rng drbg hmac ghash_generic gf128mul gcm ecb ctr cmac arc4 uas usb_storage leds_gpio xhci_plat_hcd xhci_pci xhci_mtk xhci_hcd ohci_platform ohci_hcd softdog ehci_platform sd_mod scsi_mod ehci_hcd gpio_button_hotplug ext4 mbcache jbd2 exfat usbcore nls_base usb_common mii crc32c_generic
<4>[22759.538114] Process swapper/1 (pid: 0, threadinfo=87c54000, task=87c2c240, tls=00000000)
<4>[22759.554203] Stack : 00010000 80770000 00000001 805d0000 805d29d4 87ccb6c0 00000000 01080020
<4>[22759.570853]         80370780 8036cf60 805d0000 80287a6c 807427b4 00000084 805c5ccc 87ccb6c0
<4>[22759.587501]         87ccb6c0 00000000 00000740 00000018 00000000 80370780 805c5c60 805c5ccc
<4>[22759.604148]         805c5c60 8007de2c 805d21fc 867fcc00 87ccb6c0 867b7a92 87ccb6c0 00000018
<4>[22759.620798]         00000000 867fcc00 00000000 86cb2944 85378c80 86dbbadc 86b7b000 8028784c
<4>[22759.637446]         ...
<4>[22759.642311] Call Trace:
<4>[22759.647181] [<80131830>] __kmalloc_track_caller+0x20c/0x290
<4>[22759.658277] [<8036cf60>] __kmalloc_reserve.isra.49+0x44/0xac
<4>[22759.669545] [<80370780>] pskb_expand_head+0x8c/0x328
<4>[22759.679580] [<86cb2944>] ieee80211_skb_resize+0x19c/0xde8 [mac80211]
<4>[22759.692322] [<86cb3274>] ieee80211_skb_resize+0xacc/0xde8 [mac80211]
<4>[22759.705016] Code: 00000000  8e020014  00e23821 <8ce20000> 10000012  cc400000  10400005  00000000  8e060010 
<4>[22759.724444] 
<4>[22759.727876] ---[ end trace ed050b1e5219c1b2 ]---

===================================

crash log sample 2:

<6>[   38.379689] device wlan0 entered promiscuous mode
<6>[   39.332245] IPv6: ADDRCONF(NETDEV_UP): wlan1-1: link is not ready
<6>[   39.373017] br-lan: port 3(wlan1-1) entered blocking state
<6>[   39.384018] br-lan: port 3(wlan1-1) entered disabled state
<6>[   39.395712] device wlan1-1 entered promiscuous mode
<6>[   41.015182] IPv6: ADDRCONF(NETDEV_UP): wlan1: link is not ready
<6>[   47.893226] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
<6>[   47.906608] br-lan: port 2(wlan0) entered blocking state
<6>[   47.917273] br-lan: port 2(wlan0) entered forwarding state
<6>[   51.254397] wlan1: authenticate with f0:b4:29:77:ff:b0
<6>[   51.934135] wlan1: send auth to f0:b4:29:77:ff:b0 (try 1/3)
<6>[   51.951705] wlan1: authenticated
<6>[   51.966985] wlan1: associate with f0:b4:29:77:ff:b0 (try 1/3)
<6>[   51.980634] wlan1: RX AssocResp from f0:b4:29:77:ff:b0 (capab=0x831 status=0 aid=2)
<6>[   51.996556] wlan1: associated
<6>[   52.189839] IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
<1>[78841.011004] CPU 1 Unable to handle kernel paging request at virtual address 07406084, epc == 86c6113c, ra == 86c6288c
<4>[78841.032159] Oops[#1]:
<4>[78841.036677] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.162 #0
<4>[78841.048793] task: 87c2c240 task.stack: 87c54000
<4>[78841.057794] $ 0   : 00000000 00000001 00000000 00000003
<4>[78841.068192] $ 4   : 07406000 84d00000 84cb7400 00000000
<4>[78841.078588] $ 8   : 00000000 00000000 84cb7400 00000000
<4>[78841.088983] $12   : 00000002 00000001 86c6ab84 00000008
<4>[78841.099380] $16   : 15c1d6d6 07409c4f 810fecaf 00000000
<4>[78841.109777] $20   : 000003ff 00000000 00000000 86702cd4
<4>[78841.120174] $24   : 00000000 832fc7f6                  
<4>[78841.130575] $28   : 87c54000 87c0bd50 00000001 86c6288c
<4>[78841.140976] Hi    : 001713f7
<4>[78841.146693] Lo    : ceed4800
<4>[78841.152586] epc   : 86c6113c minstrel_calc_rate_stats+0xaa0/0x3180 [mac80211]
<4>[78841.166807] ra    : 86c6288c minstrel_calc_rate_stats+0x21f0/0x3180 [mac80211]
<4>[78841.181163] Status: 11007c03  KERNEL EXL IE 
<4>[78841.189485] Cause : 40800008 (ExcCode 02)
<4>[78841.197448] BadVA : 07406084
<4>[78841.203162] PrId  : 0001992f (MIPS 1004Kc)
<4>[78841.211295] Modules linked in: qcserial pppoe ppp_async option l2tp_ppp cdc_mbim usb_wwan sierra_net sierra rndis_host qmi_wwan pptp pppox ppp_mppe ppp_generic nf_nat_pptp nf_conntrack_pptp mt76x2e mt76x2_common mt76x02_lib mt7603e mt76 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE huawei_cdc_ncm cfg80211 cdc_ncm cdc_ether xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_socket xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_iprange xt_ipp2p xt_iface xt_hl xt_helper xt_hashlimit xt_esp xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_addrtype xt_TPROXY xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_IPMARK xt_HL xt_DSCP xt_CT xt_CLASSIFY wireguard usbserial usbnet usblp ts_fsm ts_bm slhc r8152 nft_set_rbtree
<4>[78841.351720]  nft_set_hash nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject_bridge nft_reject nft_redir nft_quota nft_numgen nft_meta_bridge nft_meta nft_log nft_limit nft_fwd_netdev nft_exthdr nft_dup_netdev nft_ct nft_counter nft_chain_route_ipv6 nft_chain_route_ipv4 nf_tables_netdev nf_tables_ipv6 nf_tables_ipv4 nf_tables_inet nf_tables_bridge nf_tables nf_socket_ipv6 nf_socket_ipv4 nf_reject_ipv4 nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_rtsp nf_nat_redirect nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4 nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_log_ipv4 nf_dup_netdev nf_defrag_ipv4 nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sip nf_conntrack_rtsp nf_conntrack_rtcache nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_irc nf_conntrack_h323
<4>[78841.492926]  nf_conntrack_ftp nf_conntrack_broadcast ts_kmp nf_conntrack_amanda macvlan iptable_raw iptable_mangle iptable_filter ipt_ah ipt_ECN ipheth ip_tables crc_itu_t crc_ccitt compat_xtables compat cdc_wdm br_netfilter natflow natcap fuse sch_cake tcp_bbr sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_tcindex cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred ledtrig_usbport xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6t_NPT ip6t_MASQUERADE nf_nat_masquerade_ipv6
<4>[78841.636761]  nf_nat nf_conntrack ip6t_rt ip6t_mh ip6t_ipv6header ip6t_hbh ip6t_frag ip6t_eui64 ip6t_ah nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 msdos ip6_gre ip_gre gre ifb sit l2tp_netlink l2tp_core udp_tunnel ip6_udp_tunnel ip6_tunnel tunnel6 tunnel4 ip_tunnel tun vfat fat autofs4 nls_utf8 nls_iso8859_1 nls_cp437 sha256_generic sha1_generic seqiv jitterentropy_rng drbg hmac ghash_generic gf128mul gcm ecb ctr cmac arc4 uas usb_storage leds_gpio xhci_plat_hcd xhci_pci xhci_mtk xhci_hcd ohci_platform ohci_hcd softdog ehci_platform sd_mod scsi_mod ehci_hcd gpio_button_hotplug ext4 mbcache jbd2 exfat usbcore nls_base usb_common mii crc32c_generic
<4>[78841.761462] Process swapper/1 (pid: 0, threadinfo=87c54000, task=87c2c240, tls=00000000)
<4>[78841.777547] Stack : 87c0be23 84d00000 84cb7400 8669a100 00000000 84d00000 86b3364c 86b33810
<4>[78841.794179]         00000001 80286fb4 87c0be08 8669a080 00000000 86b33064 84d00000 87c0be08
<4>[78841.810808]         8669a080 86701e58 86b33064 86c1c558 00000000 00000000 00000000 00000000
<4>[78841.827436]         00000000 00000000 86b3364c 87c0be08 86700c00 86701e58 86b33000 86c03964
<4>[78841.844065]         00000701 86d5a004 86701e20 00000001 00000701 86d5a004 86701e20 00000001
<4>[78841.860693]         ...
<4>[78841.865563] Call Trace:
<4>[78841.870537] [<86c6113c>] minstrel_calc_rate_stats+0xaa0/0x3180 [mac80211]
<4>[78841.884077] Code: 14e0000b  24030003  8ca40000 <8c840084> 1483000d  2403001a  70432002  008e1021  90420002 
<4>[78841.903484] 
<4>[78841.906813] ---[ end trace 0bade8f9248ffe07 ]---

===================================
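
One thing the two samples have in common can be checked mechanically. The sketch below is only an illustration: the register values and `Code:` lines are hand-copied from the logs above, and the helper names (`faulting_word`, `decode_lw`, `fault_address`) are made up for this example. It takes the word in angle brackets on the `Code:` line (the faulting instruction), decodes it as a MIPS32 `lw`, and checks that base register plus offset reproduces BadVA. In both samples the base register holds 0x07406000.

```python
import re

# Values hand-copied from the two oops reports above (assumption: copied correctly).
SAMPLES = {
    "sample1": {
        "bad_va": 0x07406000,
        "regs": {7: 0x07406000},  # $7 from the "$ 4 :" row
        "code": "Code: 00000000  8e020014  00e23821 <8ce20000> 10000012",
    },
    "sample2": {
        "bad_va": 0x07406084,
        "regs": {4: 0x07406000},  # $4 from the "$ 4 :" row
        "code": "Code: 14e0000b  24030003  8ca40000 <8c840084> 1483000d",
    },
}


def faulting_word(code_line):
    """Return the instruction word marked <...> on an oops 'Code:' line."""
    match = re.search(r"<([0-9a-f]{8})>", code_line)
    if match is None:
        raise ValueError("no faulting instruction marked in Code: line")
    return int(match.group(1), 16)


def decode_lw(insn):
    """Decode a MIPS32 'lw rt, imm(rs)'; return (rs, rt, imm) or None."""
    if insn >> 26 != 0x23:  # opcode 0x23 is lw
        return None
    rs = (insn >> 21) & 0x1F
    rt = (insn >> 16) & 0x1F
    imm = insn & 0xFFFF
    if imm & 0x8000:  # the 16-bit immediate is sign-extended
        imm -= 0x10000
    return rs, rt, imm


def fault_address(sample):
    """Effective address of the faulting load: base register + offset."""
    decoded = decode_lw(faulting_word(sample["code"]))
    if decoded is None:
        raise ValueError("faulting instruction is not lw")
    rs, _rt, imm = decoded
    return (sample["regs"][rs] + imm) & 0xFFFFFFFF


if __name__ == "__main__":
    for name, sample in SAMPLES.items():
        addr = fault_address(sample)
        print(f"{name}: computed {addr:08x}, BadVA {sample['bad_va']:08x}")
```

This only shows that both crashes dereference the same invalid value from two different code paths (`__kmalloc_track_caller` and `minstrel_calc_rate_stats`). It does not say which code wrote that value.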
ptpt52 commented 4 years ago

@nbd168

I have followed every mt76 release for more than 6 months, and this problem still persists.

ptpt52 commented 4 years ago

@nbd168 Could this problem be caused by an skb being sent recursively, leading to a stack overflow?

nbd168 commented 4 years ago

I've been chasing the same bug for a very long time. My problem is that I can't reproduce it. Any information you have on this would be helpful for tracking it down. What makes you think it's a stack overflow on sending an skb?

nbd168 commented 4 years ago

Fixed in 8f8e9161b3550c9b53a70dec6ceeac0d194a221d