Closed Fail-Safe closed 1 month ago
@Fail-Safe I was experiencing something similar with multi-PSK; it seems allmulticast mode was also active.
Then I read this post by nxhack: https://forum.openwrt.org/t/gl-inet-flint-2-gl-mt6000-discussions/173524/1086?u=xize
I have now compiled my own build for the MT6000 on kernel 6.6 with this version of the calibration data, and my setup no longer seems to crash when I use my Ayaneo Geek 1S (Intel AX210). I also added my crash report to the issue you mentioned, #860. For me the crashes were mostly triggered by the AX210 device under heavy P2P UDP traffic (GTA Online); now I can play this game for over an hour and the driver does not seem to crash.
Can you check whether this fixes your issue as well? This is the relevant commit of my mt76 fork, showing which files I replaced (I noticed that replacing the other files, such as the EEPROMs, via /usr/firmware/mediatek at runtime gave me a soft brick). I'm very interested to see whether it also fixes this issue.
@xize Thank you so much for letting me know about this! I actually had grabbed those newer firmware files (for mt7986) as well and have been running with them on my kernel 6.6 build. For whatever reason, I hadn't thought to try re-enabling `multicast_to_unicast_all` with the newer firmware, though.
Trying it now... will report back once I give it some time to run. :)
Unfortunately I spoke too soon; mine still crashed, with another stack trace:
[ 26.388775] br-lan: port 10(phy0-ap0-aqnet) entered forwarding state
[ 26.398053] mt798x-wmac 18000000.wifi phy0-ap0-aqnet: left allmulticast mode
[ 26.405179] mt798x-wmac 18000000.wifi phy0-ap0-aqnet: left promiscuous mode
[ 26.412299] br-lan: port 10(phy0-ap0-aqnet) entered disabled state
[ 26.451733] br-lan: port 10(phy0-ap0-aqnet) entered blocking state
[ 26.457914] br-lan: port 10(phy0-ap0-aqnet) entered disabled state
[ 26.464131] mt798x-wmac 18000000.wifi phy0-ap0-aqnet: entered allmulticast mode
[ 26.471792] mt798x-wmac 18000000.wifi phy0-ap0-aqnet: entered promiscuous mode
[ 26.480151] br-lan: port 10(phy0-ap0-aqnet) entered blocking state
[ 26.486374] br-lan: port 10(phy0-ap0-aqnet) entered forwarding state
[ 26.686587] br-lan: port 6(phy0-ap0) entered blocking state
[ 26.692187] br-lan: port 6(phy0-ap0) entered forwarding state
[ 26.698430] br-lan: port 8(phy0-ap0-zigbee) entered blocking state
[ 26.704659] br-lan: port 8(phy0-ap0-zigbee) entered forwarding state
[ 31.324517] br-lan: port 11(vx0) entered blocking state
[ 31.329752] br-lan: port 11(vx0) entered disabled state
[ 31.335099] vx0: entered allmulticast mode
[ 31.339398] vx0: entered promiscuous mode
[ 31.345514] br-lan: port 11(vx0) entered blocking state
[ 31.350772] br-lan: port 11(vx0) entered forwarding state
[ 60.608218] br-lan.169: entered allmulticast mode
[ 60.613088] br-lan: entered allmulticast mode
[ 60.617678] eth1.300: entered allmulticast mode
[ 60.622288] mtk_soc_eth 15100000.ethernet eth1: entered allmulticast mode
[ 105.683150] br-lan: port 12(phy1-ap0-aya) entered blocking state
[ 105.689200] br-lan: port 12(phy1-ap0-aya) entered disabled state
[ 105.695243] mt798x-wmac 18000000.wifi phy1-ap0-aya: entered allmulticast mode
[ 105.702578] mt798x-wmac 18000000.wifi phy1-ap0-aya: entered promiscuous mode
[ 105.712447] mt798x-wmac 18000000.wifi phy1-ap0-aya: left allmulticast mode
[ 105.719365] mt798x-wmac 18000000.wifi phy1-ap0-aya: left promiscuous mode
[ 105.726189] br-lan: port 12(phy1-ap0-aya) entered disabled state
[ 105.780359] br-lan: port 12(phy1-ap0-aya) entered blocking state
[ 105.786363] br-lan: port 12(phy1-ap0-aya) entered disabled state
[ 105.792409] mt798x-wmac 18000000.wifi phy1-ap0-aya: entered allmulticast mode
[ 105.799693] mt798x-wmac 18000000.wifi phy1-ap0-aya: entered promiscuous mode
[ 105.995558] br-lan: port 7(phy1-ap0) entered blocking state
[ 106.001163] br-lan: port 7(phy1-ap0) entered forwarding state
[ 106.007126] br-lan: port 12(phy1-ap0-aya) entered blocking state
[ 106.013139] br-lan: port 12(phy1-ap0-aya) entered forwarding state
[12450.669959] mt798x-wmac 18000000.wifi: Message 000026ed (seq 9) timeout
[12471.127256] mt798x-wmac 18000000.wifi: Message 00005aed (seq 10) timeout
[12491.585889] mt798x-wmac 18000000.wifi: Message 000026ed (seq 11) timeout
[12491.592849] mt798x-wmac 18000000.wifi: Message 000025ed (seq 12) timeout
[12491.599607] ------------[ cut here ]------------
[12491.604206] WARNING: CPU: 0 PID: 18242 at ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[12491.613021] Modules linked in: pppoe ppp_async nft_fib_inet nf_flow_table_inet wireguard pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_compat nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack_netlink nf_conntrack mt7915e(O) mt76_connac_lib(O) mt76(O) mac80211(O) libchacha20poly1305 iptable_mangle iptable_filter ipt_REJECT ipt_ECN ip_tables chacha_neon cfg80211(O) xt_time xt_tcpudp xt_tcpmss xt_statistic xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_comment xt_TCPMSS xt_LOG xt_HL xt_DSCP xt_CLASSIFY x_tables slhc sch_cake poly1305_neon nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcurve25519_generic libcrc32c libchacha compat(O) crypto_safexcel sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact
[12491.613176] ip6_gre ip_gre gre ifb ip6_tunnel tunnel6 ip_tunnel vxlan udp_tunnel ip6_udp_tunnel sha512_arm64 sha1_ce sha1_generic seqiv md5 geniv des_generic libdes authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd gpio_button_hotplug(O) usbcore usb_common aquantia
[12491.728150] CPU: 0 PID: 18242 Comm: kworker/u8:2 Tainted: G O 6.6.27 #0
[12491.736131] Hardware name: GL.iNet GL-MT6000 (DT)
[12491.740818] Workqueue: phy1 ieee80211_ba_session_work [mac80211]
[12491.746829] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[12491.753770] pc : ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[12491.760385] lr : ___ieee80211_stop_tx_ba_session+0x1d4/0x2f4 [mac80211]
[12491.767001] sp : ffffffc08a823c80
[12491.770298] x29: ffffffc08a823c80 x28: 0000000000000001 x27: ffffff800d77e6c0
[12491.777415] x26: ffffff800a3c23b8 x25: ffffff80076008a0 x24: ffffff80076008a0
[12491.784530] x23: ffffffc07906bd10 x22: ffffff800a3c40e8 x21: 0000000000000001
[12491.791646] x20: ffffff800d77e6c0 x19: ffffff800a3c2000 x18: 000000000000017c
[12491.798761] x17: 0000000000000000 x16: 0000000000000078 x15: ffffffc080b5a128
[12491.805876] x14: 0000000000000474 x13: 000000000000017c x12: 00000000ffffffea
[12491.812992] x11: 0000000000000040 x10: ffffffc080b57470 x9 : ffffffc080b57468
[12491.820107] x8 : ffffff8000403dc0 x7 : 0000000000000000 x6 : 0000001aa2970ad3
[12491.827222] x5 : 0000000001000000 x4 : 0000000000000000 x3 : 0000000000000000
[12491.834337] x2 : 0000000000000001 x1 : 0000000000000002 x0 : 00000000ffffff92
[12491.841453] Call trace:
[12491.843885] ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[12491.850155] ieee80211_ba_session_work+0x418/0x444 [mac80211]
[12491.855904] process_one_work+0x154/0x2a0
[12491.859902] worker_thread+0x2a8/0x484
[12491.863636] kthread+0xdc/0xe8
[12491.866679] ret_from_fork+0x10/0x20
[12491.870241] ---[ end trace 0000000000000000 ]---
[12896.914708] mt798x-wmac 18000000.wifi: Message 000026ed (seq 4) timeout
[12917.372441] mt798x-wmac 18000000.wifi: Message 00005aed (seq 5) timeout
[12937.831918] mt798x-wmac 18000000.wifi: Message 000026ed (seq 6) timeout
[12937.838606] ------------[ cut here ]------------
[12937.843214] WARNING: CPU: 3 PID: 17543 at ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[12937.852030] Modules linked in: pppoe ppp_async nft_fib_inet nf_flow_table_inet wireguard pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_compat nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack_netlink nf_conntrack mt7915e(O) mt76_connac_lib(O) mt76(O) mac80211(O) libchacha20poly1305 iptable_mangle iptable_filter ipt_REJECT ipt_ECN ip_tables chacha_neon cfg80211(O) xt_time xt_tcpudp xt_tcpmss xt_statistic xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_comment xt_TCPMSS xt_LOG xt_HL xt_DSCP xt_CLASSIFY x_tables slhc sch_cake poly1305_neon nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcurve25519_generic libcrc32c libchacha compat(O) crypto_safexcel sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact
[12937.852192] ip6_gre ip_gre gre ifb ip6_tunnel tunnel6 ip_tunnel vxlan udp_tunnel ip6_udp_tunnel sha512_arm64 sha1_ce sha1_generic seqiv md5 geniv des_generic libdes authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd gpio_button_hotplug(O) usbcore usb_common aquantia
[12937.967165] CPU: 3 PID: 17543 Comm: kworker/u8:4 Tainted: G W O 6.6.27 #0
[12937.975147] Hardware name: GL.iNet GL-MT6000 (DT)
[12937.979834] Workqueue: phy1 ieee80211_ba_session_work [mac80211]
[12937.985852] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[12937.992793] pc : ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[12937.999409] lr : ___ieee80211_stop_tx_ba_session+0x1d4/0x2f4 [mac80211]
[12938.006024] sp : ffffffc08a7abc80
[12938.009323] x29: ffffffc08a7abc80 x28: 0000000000000001 x27: ffffff800d77e240
[12938.016439] x26: ffffff800aec83b8 x25: ffffff80076008a0 x24: ffffff80076008a0
[12938.023554] x23: ffffffc07906bd10 x22: ffffff800aece0e8 x21: 0000000000000001
[12938.030670] x20: ffffff800d77e240 x19: ffffff800aec8000 x18: ffffff800aece000
[12938.037786] x17: 0000000000000001 x16: 00000000000021c0 x15: ffffff80076008a6
[12938.044902] x14: 0000000000000028 x13: fffffffffffff778 x12: 0000000000000002
[12938.052017] x11: 0000000000000040 x10: ffffffc080b57470 x9 : ffffffc080b57468
[12938.059134] x8 : 0000000000000002 x7 : 000000000000b737 x6 : 0000001aa2970ad3
[12938.066249] x5 : 0000000001000000 x4 : 0000000000000000 x3 : 0000000000000000
[12938.073364] x2 : 0000000000000001 x1 : 0000000000000002 x0 : 00000000fffffff4
[12938.080481] Call trace:
[12938.082913] ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[12938.089185] ieee80211_ba_session_work+0x418/0x444 [mac80211]
[12938.094934] process_one_work+0x154/0x2a0
[12938.098933] worker_thread+0x2a8/0x484
[12938.102667] kthread+0xdc/0xe8
[12938.105708] ret_from_fork+0x10/0x20
[12938.109270] ---[ end trace 0000000000000000 ]---
I just started noticing the `000026ed` and `00005aed` timeouts again myself:
[ 743.168242] mt798x-wmac 18000000.wifi: Message 00005aed (seq 2) timeout
[ 1297.661339] mt798x-wmac 18000000.wifi: Message 00005aed (seq 13) timeout
[ 1842.429689] mt798x-wmac 18000000.wifi: Message 000026ed (seq 15) timeout
[ 1862.888429] mt798x-wmac 18000000.wifi: Message 00005aed (seq 1) timeout
[ 1883.346054] mt798x-wmac 18000000.wifi: Message 000800c4 (seq 2) timeout
[ 1903.804285] mt798x-wmac 18000000.wifi: Message 000026ed (seq 3) timeout
[ 1924.263044] mt798x-wmac 18000000.wifi: Message 00005aed (seq 4) timeout
[ 1944.720388] mt798x-wmac 18000000.wifi: Message 000026ed (seq 5) timeout
[ 1965.179029] mt798x-wmac 18000000.wifi: Message 00005aed (seq 6) timeout
[ 1985.637032] mt798x-wmac 18000000.wifi: Message 000026ed (seq 7) timeout
[ 2006.094603] mt798x-wmac 18000000.wifi: Message 00005aed (seq 8) timeout
[ 2026.553350] mt798x-wmac 18000000.wifi: Message 000026ed (seq 9) timeout
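To see which MCU messages time out most often, the log lines above can be tallied per message ID. The following is a minimal sketch, assuming the log has been saved as text; the function name `count_timeouts` is hypothetical, and the regular expression only matches the `Message <id> (seq <n>) timeout` format shown in this thread.

```python
import re
from collections import Counter

# Matches lines such as:
# [ 1842.429689] mt798x-wmac 18000000.wifi: Message 000026ed (seq 15) timeout
TIMEOUT_RE = re.compile(r"Message ([0-9a-fA-F]{8}) \(seq (\d+)\) timeout")


def count_timeouts(log_text: str) -> Counter:
    """Count how often each MCU message ID appears in a timeout line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


# A few lines copied from the log above.
sample = """\
[ 1842.429689] mt798x-wmac 18000000.wifi: Message 000026ed (seq 15) timeout
[ 1862.888429] mt798x-wmac 18000000.wifi: Message 00005aed (seq 1) timeout
[ 1883.346054] mt798x-wmac 18000000.wifi: Message 000800c4 (seq 2) timeout
[ 1903.804285] mt798x-wmac 18000000.wifi: Message 000026ed (seq 3) timeout
"""

for message_id, hits in count_timeouts(sample).most_common():
    print(message_id, hits)
```

Feeding it a full `dmesg` capture instead of the sample gives a quick view of whether `000026ed` and `00005aed` dominate, or whether other IDs such as `000800c4` show up as well.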
I can also confirm with 100% certainty that it is multicast. Yesterday I added `multicast_to_unicast='0'` to the br-lan bridge, and that fixed the strange crash my Ayaneo was causing when connected to my multi-PSK setup.
Now I'm also monitoring another device, the Mi Smart Clock. This device does not seem to cause a crash or a timeout in the OpenWrt logs, but its touch screen seems to start artifacting or become unresponsive after it has been up for a while. If it stays responsive with this change, my guesses point to maybe these things:
Any suggestions for commands I can try to check whether these are indeed invalid/corrupt multicast packets or flooding? That would surely be helpful.
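One rough way to look for multicast flooding is to compare the per-interface receive multicast counters from `/proc/net/dev` at two points in time. The sketch below assumes the standard Linux layout of that file (two header lines, with multicast as the eighth receive column); the function names and the sample numbers are made up for illustration, and it says nothing about whether packets are corrupt.

```python
def rx_multicast_counters(proc_net_dev: str) -> dict:
    """Extract the receive-side multicast counter per interface."""
    counters = {}
    for line in proc_net_dev.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, stats = line.split(":", 1)
        fields = stats.split()
        if len(fields) >= 8:
            counters[name.strip()] = int(fields[7])  # 8th receive column
    return counters


def multicast_delta(before: dict, after: dict) -> dict:
    """Multicast packets received between two snapshots, per interface."""
    return {name: after[name] - before[name] for name in after if name in before}


HEADER = (
    "Inter-|   Receive                                                |  Transmit\n"
    " face |bytes    packets errs drop fifo frame compressed multicast|"
    "bytes    packets errs drop fifo colls carrier compressed\n"
)
# Made-up snapshots, as if taken some seconds apart.
first = HEADER + "br-lan: 1000 10 0 0 0 0 0 5 2000 20 0 0 0 0 0 0\n"
second = HEADER + "br-lan: 9000 90 0 0 0 0 0 65 2500 25 0 0 0 0 0 0\n"

print(multicast_delta(rx_multicast_counters(first), rx_multicast_counters(second)))
```

A counter that climbs much faster on one interface than on the others would be a hint worth following up with a packet capture.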
@nbd168 Is this `multicast_to_unicast_all` issue correctly homed in this mt76 project? Or is there another project where I should create this issue to get proper visibility?
Thanks!
So I have tried applying this patch from @blocktrron to see if it changes anything here.
Interestingly, my stack trace shows a little bit more:
Does this perhaps give some more input on where it could crash?
Looks like mcu does not like messages 26 and 5A sent together. https://github.com/rany2/openwrt/commit/18cc739263004d4846991c9afbc6ba45c39293a1 should solve the issue.
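The "26" and "5A" here correspond to the second-lowest byte of the logged message values `000026ed` and `00005aed`. The sketch below splits a logged value into its fields. The bit layout (command ID in bits 7..0, extended ID in bits 15..8, flag bits 16..20) is an assumption based on the `__MCU_CMD_FIELD_*` definitions in `mt76_connac_mcu.h` and should be checked against your tree; `decode_mcu_message` is a hypothetical helper name.

```python
# Assumed field layout; verify against mt76_connac_mcu.h in your tree.
FIELD_ID_MASK = 0x000000FF       # bits 7..0: command ID
FIELD_EXT_ID_MASK = 0x0000FF00   # bits 15..8: extended command ID
FLAG_BITS = {16: "QUERY", 17: "UNI", 18: "CE", 19: "WA", 20: "WM"}


def decode_mcu_message(message_hex: str) -> dict:
    """Split a logged 'Message xxxxxxxx' value into its assumed fields."""
    value = int(message_hex, 16)
    return {
        "cmd_id": value & FIELD_ID_MASK,
        "ext_id": (value & FIELD_EXT_ID_MASK) >> 8,
        "flags": [name for bit, name in sorted(FLAG_BITS.items())
                  if value & (1 << bit)],
    }


for logged in ("000026ed", "00005aed", "000800c4"):
    fields = decode_mcu_message(logged)
    print(logged, hex(fields["cmd_id"]), hex(fields["ext_id"]), fields["flags"])
```

Under this assumed layout, the two messages that time out together share command ID `0xed` and differ only in the extended ID, while `000800c4` is a different command with one flag bit set.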
Looks like mcu does not like messages 26 and 5A sent together. rany2/openwrt@18cc739 should solve the issue.
@rany2
I checked this patch again after a dirclean. It still crashes with the patch, though less often than without it; I could be mistaken, given how randomly it happens. A new timeout message appeared, though: `mt798x-wmac 18000000.wifi: Message 000800c4 (seq 7) timeout`
New stack trace:
Please apply this patch too: https://pastebin.com/raw/cyn8YQ4R
@xize I had a look at your tree and you didn't apply my patch properly. I'm not sure why you did it like that: https://github.com/xize/openwrt-flint2-testing/commit/1c16923264f59d7dbe5be1d1fb5f609a6431cd52
The file @lukasz1992 linked to is already a patch file, so just download it into the mt76 patches folder in your tree. It should be applied against mt76, not your OpenWrt tree.
Please apply this patch too: https://pastebin.com/raw/cyn8YQ4R
@lukasz1992 && @rany2
I have applied this patch and encountered the following error after re-enabling `multicast_to_unicast_all`:
root@AP:~# cat /sys/fs/pstore/dmesg-ramoops-0
Oops#1 Part1
<6>[ 22.439909] mt798x-wmac 18000000.wifi phy0-ap2: entered promiscuous mode
<6>[ 22.448851] br-lan: port 10(phy0-ap2) entered blocking state
<6>[ 22.454538] br-lan: port 10(phy0-ap2) entered forwarding state
<6>[ 23.246394] br-lan: port 7(phy1-ap0) entered blocking state
<6>[ 23.251996] br-lan: port 7(phy1-ap0) entered forwarding state
<6>[ 23.371730] br-lan: port 11(phy1-ap1) entered blocking state
<6>[ 23.377404] br-lan: port 11(phy1-ap1) entered disabled state
<6>[ 23.383131] mt798x-wmac 18000000.wifi phy1-ap1: entered allmulticast mode
<6>[ 23.390190] mt798x-wmac 18000000.wifi phy1-ap1: entered promiscuous mode
<6>[ 23.400229] br-lan: port 11(phy1-ap1) entered blocking state
<6>[ 23.405884] br-lan: port 11(phy1-ap1) entered forwarding state
<6>[ 23.413659] mt798x-wmac 18000000.wifi phy1-ap1: left allmulticast mode
<6>[ 23.420233] mt798x-wmac 18000000.wifi phy1-ap1: left promiscuous mode
<6>[ 23.426771] br-lan: port 11(phy1-ap1) entered disabled state
<6>[ 23.501383] br-lan: port 11(phy1-ap1) entered blocking state
<6>[ 23.507043] br-lan: port 11(phy1-ap1) entered disabled state
<6>[ 23.512736] mt798x-wmac 18000000.wifi phy1-ap1: entered allmulticast mode
<6>[ 23.519686] mt798x-wmac 18000000.wifi phy1-ap1: entered promiscuous mode
<6>[ 23.526484] br-lan: port 11(phy1-ap1) entered blocking state
<6>[ 23.532137] br-lan: port 11(phy1-ap1) entered forwarding state
<3>[ 105.908186] mt798x-wmac 18000000.wifi phy0-ap2: failed (err=-2) to del object (id=3)
<3>[ 105.915931] mt798x-wmac 18000000.wifi phy1-ap1: failed (err=-2) to del object (id=3)
<6>[ 217.633996] br-lan: port 11(phy1-ap1) entered disabled state
<1>[ 217.702985] Unable to handle kernel paging request at virtual address 9ae1ed4c37e60181
<1>[ 217.710908] Mem abort info:
<1>[ 217.713727] ESR = 0x0000000096000004
<1>[ 217.717460] EC = 0x25: DABT (current EL), IL = 32 bits
<1>[ 217.722762] SET = 0, FnV = 0
<1>[ 217.725802] EA = 0, S1PTW = 0
<1>[ 217.728926] FSC = 0x04: level 0 translation fault
<1>[ 217.733808] Data abort info:
<1>[ 217.736754] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
<1>[ 217.742223] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
<1>[ 217.747267] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
<1>[ 217.752560] [9ae1ed4c37e60181] address between user and kernel address ranges
<0>[ 217.759684] Internal error: Oops: 0000000096000004 [#1] SMP
<7>[ 217.765238] Modules linked in: nft_fib_inet nf_flow_table_inet iptable_nat xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt7915e(O) mt76_connac_lib(O) mt76(O) mac80211(O) iptable_mangle iptable_filter ipt_REJECT ip_tables cfg80211(O) xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG x_tables tcp_bbr nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c compat(O) cls_flower act_vlan crypto_safexcel cls_bpf act_bpf sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact sha512_arm64 sha1_ce sha1_generic seqiv md5 geniv des_generic libdes authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd
<7>[ 217.765408] gpio_button_hotplug(O) usbcore usb_common aquantia
<7>[ 217.860492] CPU: 3 PID: 1665 Comm: hostapd Tainted: G O 6.6.28 #0
<7>[ 217.867954] Hardware name: GL.iNet GL-MT6000 (DT)
<7>[ 217.872640] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
<7>[ 217.879580] pc : mtk_wed_setup_tc_block_cb+0x4/0x38
<7>[ 217.884449] lr : tc_setup_cb_reoffload+0x30/0x134
<7>[ 217.889140] sp : ffffffc0814c3360
<7>[ 217.892438] x29: ffffffc0814c3360 x28: ffffffc080b48000 x27: 0000000000000000
<7>[ 217.899554] x26: ffffff800616b000 x25: 0000000000000000 x24: 0000000000000000
<7>[ 217.906669] x23: ffffff8006454840 x22: ffffff800616b000 x21: ffffff800a6fb3ec
<7>[ 217.913784] x20: 0000000000000000 x19: ffffff800cabb280 x18: 0000000000000028
<7>[ 217.920900] x17: 0000000000000000 x16: 0000000000001978 x15: 0000000000000a30
<7>[ 217.928014] x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000002
<7>[ 217.935130] x11: 0000000000000303 x10: 00000000000008a0 x9 : ffffffc0814c35f0
<7>[ 217.942246] x8 : 0000000000000001 x7 : ffffff800a6fb3ec x6 : ffffff8006454840
<7>[ 217.949361] x5 : ffffffc0814c3408 x4 : 9ae1ed4c37e60091 x3 : ffffff8001350e40
<7>[ 217.956477] x2 : ffffff8006454840 x1 : ffffffc0814c3408 x0 : 0000000000000005
<7>[ 217.963593] Call trace:
<7>[ 217.966025] mtk_wed_setup_tc_block_cb+0x4/0x38
<7>[ 217.970539] 0xffffffc078e054ac
<7>[ 217.973700] tcf_block_playback_offloads+0x70/0x1e8
<7>[ 217.978562] tcf_block_unbind+0x6c/0xc8
<7>[ 217.982384] tcf_block_setup+0x38/0x1e4
<7>[ 217.986205] tcf_block_offload_cmd.isra.0+0xdc/0x128
<7>[ 217.991152] tcf_block_offload_unbind+0x50/0x8c
<7>[ 217.995667] __tcf_block_put+0x88/0x17c
<7>[ 217.999488] tcf_block_put_ext+0x4c/0x60
<7>[ 218.003395] 0xffffffc078de49ac
<7>[ 218.006532] __qdisc_destroy+0x40/0xa0
<7>[ 218.010266] qdisc_put+0x54/0x6c
<7>[ 218.013479] dev_shutdown+0x90/0x108
<7>[ 218.017038] unregister_netdevice_many_notify+0x1cc/0x788
<7>[ 218.022422] unregister_netdevice_queue+0xa4/0xb0
<7>[ 218.027111] cfg80211_shutdown_all_interfaces+0x32c/0x37c [cfg80211]
<7>[ 218.033465] cfg80211_unregister_wdev+0x10/0x18 [cfg80211]
<7>[ 218.038946] ieee80211_if_remove+0x6c/0x110 [mac80211]
<7>[ 218.044102] ieee80211_channel_switch_disconnect+0x1cfc/0x1d08 [mac80211]
<7>[ 218.050891] cfg80211_remove_virtual_intf+0x5c/0x68 [cfg80211]
<7>[ 218.056720] cfg80211_check_station_change+0x31ac/0x32c4 [cfg80211]
<7>[ 218.062981] genl_family_rcv_msg_doit+0xa8/0x108
<7>[ 218.067584] genl_rcv_msg+0x1b0/0x244
<7>[ 218.071231] netlink_rcv_skb+0x54/0x11c
<7>[ 218.075051] genl_rcv+0x34/0x48
<7>[ 218.078178] netlink_unicast+0x1e0/0x2c8
<7>[ 218.082085] netlink_sendmsg+0x198/0x3c4
<7>[ 218.085992] ____sys_sendmsg+0x1bc/0x26c
<7>[ 218.089905] ___sys_sendmsg+0x78/0xb8
<7>[ 218.093552] __sys_sendmsg+0x44/0x98
<7>[ 218.097111] __arm64_sys_sendmsg+0x20/0x28
<7>[ 218.101192] invoke_syscall.constprop.0+0x4c/0xe0
<7>[ 218.105882] do_el0_svc+0x3c/0xbc
<7>[ 218.109182] el0_svc+0x18/0x4c
<7>[ 218.112225] el0t_64_sync_handler+0x118/0x124
<7>[ 218.116567] el0t_64_sync+0x150/0x154
<0>[ 218.120218] Code: b9401fe0 a8c27bfd d65f03c0 a9401043 (f9407882)
<4>[ 218.126291] ---[ end trace 0000000000000000 ]---
root@AP:~# cat /sys/fs/pstore/dmesg-ramoops-1
Panic#2 Part1
<6>[ 23.251996] br-lan: port 7(phy1-ap0) entered forwarding state
<6>[ 23.371730] br-lan: port 11(phy1-ap1) entered blocking state
<6>[ 23.377404] br-lan: port 11(phy1-ap1) entered disabled state
<6>[ 23.383131] mt798x-wmac 18000000.wifi phy1-ap1: entered allmulticast mode
<6>[ 23.390190] mt798x-wmac 18000000.wifi phy1-ap1: entered promiscuous mode
<6>[ 23.400229] br-lan: port 11(phy1-ap1) entered blocking state
<6>[ 23.405884] br-lan: port 11(phy1-ap1) entered forwarding state
<6>[ 23.413659] mt798x-wmac 18000000.wifi phy1-ap1: left allmulticast mode
<6>[ 23.420233] mt798x-wmac 18000000.wifi phy1-ap1: left promiscuous mode
<6>[ 23.426771] br-lan: port 11(phy1-ap1) entered disabled state
<6>[ 23.501383] br-lan: port 11(phy1-ap1) entered blocking state
<6>[ 23.507043] br-lan: port 11(phy1-ap1) entered disabled state
<6>[ 23.512736] mt798x-wmac 18000000.wifi phy1-ap1: entered allmulticast mode
<6>[ 23.519686] mt798x-wmac 18000000.wifi phy1-ap1: entered promiscuous mode
<6>[ 23.526484] br-lan: port 11(phy1-ap1) entered blocking state
<6>[ 23.532137] br-lan: port 11(phy1-ap1) entered forwarding state
<3>[ 105.908186] mt798x-wmac 18000000.wifi phy0-ap2: failed (err=-2) to del object (id=3)
<3>[ 105.915931] mt798x-wmac 18000000.wifi phy1-ap1: failed (err=-2) to del object (id=3)
<6>[ 217.633996] br-lan: port 11(phy1-ap1) entered disabled state
<1>[ 217.702985] Unable to handle kernel paging request at virtual address 9ae1ed4c37e60181
<1>[ 217.710908] Mem abort info:
<1>[ 217.713727] ESR = 0x0000000096000004
<1>[ 217.717460] EC = 0x25: DABT (current EL), IL = 32 bits
<1>[ 217.722762] SET = 0, FnV = 0
<1>[ 217.725802] EA = 0, S1PTW = 0
<1>[ 217.728926] FSC = 0x04: level 0 translation fault
<1>[ 217.733808] Data abort info:
<1>[ 217.736754] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
<1>[ 217.742223] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
<1>[ 217.747267] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
<1>[ 217.752560] [9ae1ed4c37e60181] address between user and kernel address ranges
<0>[ 217.759684] Internal error: Oops: 0000000096000004 [#1] SMP
<7>[ 217.765238] Modules linked in: nft_fib_inet nf_flow_table_inet iptable_nat xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt7915e(O) mt76_connac_lib(O) mt76(O) mac80211(O) iptable_mangle iptable_filter ipt_REJECT ip_tables cfg80211(O) xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG x_tables tcp_bbr nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c compat(O) cls_flower act_vlan crypto_safexcel cls_bpf act_bpf sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact sha512_arm64 sha1_ce sha1_generic seqiv md5 geniv des_generic libdes authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd
<7>[ 217.765408] gpio_button_hotplug(O) usbcore usb_common aquantia
<7>[ 217.860492] CPU: 3 PID: 1665 Comm: hostapd Tainted: G O 6.6.28 #0
<7>[ 217.867954] Hardware name: GL.iNet GL-MT6000 (DT)
<7>[ 217.872640] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
<7>[ 217.879580] pc : mtk_wed_setup_tc_block_cb+0x4/0x38
<7>[ 217.884449] lr : tc_setup_cb_reoffload+0x30/0x134
<7>[ 217.889140] sp : ffffffc0814c3360
<7>[ 217.892438] x29: ffffffc0814c3360 x28: ffffffc080b48000 x27: 0000000000000000
<7>[ 217.899554] x26: ffffff800616b000 x25: 0000000000000000 x24: 0000000000000000
<7>[ 217.906669] x23: ffffff8006454840 x22: ffffff800616b000 x21: ffffff800a6fb3ec
<7>[ 217.913784] x20: 0000000000000000 x19: ffffff800cabb280 x18: 0000000000000028
<7>[ 217.920900] x17: 0000000000000000 x16: 0000000000001978 x15: 0000000000000a30
<7>[ 217.928014] x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000002
<7>[ 217.935130] x11: 0000000000000303 x10: 00000000000008a0 x9 : ffffffc0814c35f0
<7>[ 217.942246] x8 : 0000000000000001 x7 : ffffff800a6fb3ec x6 : ffffff8006454840
<7>[ 217.949361] x5 : ffffffc0814c3408 x4 : 9ae1ed4c37e60091 x3 : ffffff8001350e40
<7>[ 217.956477] x2 : ffffff8006454840 x1 : ffffffc0814c3408 x0 : 0000000000000005
<7>[ 217.963593] Call trace:
<7>[ 217.966025] mtk_wed_setup_tc_block_cb+0x4/0x38
<7>[ 217.970539] 0xffffffc078e054ac
<7>[ 217.973700] tcf_block_playback_offloads+0x70/0x1e8
<7>[ 217.978562] tcf_block_unbind+0x6c/0xc8
<7>[ 217.982384] tcf_block_setup+0x38/0x1e4
<7>[ 217.986205] tcf_block_offload_cmd.isra.0+0xdc/0x128
<7>[ 217.991152] tcf_block_offload_unbind+0x50/0x8c
<7>[ 217.995667] __tcf_block_put+0x88/0x17c
<7>[ 217.999488] tcf_block_put_ext+0x4c/0x60
<7>[ 218.003395] 0xffffffc078de49ac
<7>[ 218.006532] __qdisc_destroy+0x40/0xa0
<7>[ 218.010266] qdisc_put+0x54/0x6c
<7>[ 218.013479] dev_shutdown+0x90/0x108
<7>[ 218.017038] unregister_netdevice_many_notify+0x1cc/0x788
<7>[ 218.022422] unregister_netdevice_queue+0xa4/0xb0
<7>[ 218.027111] cfg80211_shutdown_all_interfaces+0x32c/0x37c [cfg80211]
<7>[ 218.033465] cfg80211_unregister_wdev+0x10/0x18 [cfg80211]
<7>[ 218.038946] ieee80211_if_remove+0x6c/0x110 [mac80211]
<7>[ 218.044102] ieee80211_channel_switch_disconnect+0x1cfc/0x1d08 [mac80211]
<7>[ 218.050891] cfg80211_remove_virtual_intf+0x5c/0x68 [cfg80211]
<7>[ 218.056720] cfg80211_check_station_change+0x31ac/0x32c4 [cfg80211]
<7>[ 218.062981] genl_family_rcv_msg_doit+0xa8/0x108
<7>[ 218.067584] genl_rcv_msg+0x1b0/0x244
<7>[ 218.071231] netlink_rcv_skb+0x54/0x11c
<7>[ 218.075051] genl_rcv+0x34/0x48
<7>[ 218.078178] netlink_unicast+0x1e0/0x2c8
<7>[ 218.082085] netlink_sendmsg+0x198/0x3c4
<7>[ 218.085992] ____sys_sendmsg+0x1bc/0x26c
<7>[ 218.089905] ___sys_sendmsg+0x78/0xb8
<7>[ 218.093552] __sys_sendmsg+0x44/0x98
<7>[ 218.097111] __arm64_sys_sendmsg+0x20/0x28
<7>[ 218.101192] invoke_syscall.constprop.0+0x4c/0xe0
<7>[ 218.105882] do_el0_svc+0x3c/0xbc
<7>[ 218.109182] el0_svc+0x18/0x4c
<7>[ 218.112225] el0t_64_sync_handler+0x118/0x124
<7>[ 218.116567] el0t_64_sync+0x150/0x154
<0>[ 218.120218] Code: b9401fe0 a8c27bfd d65f03c0 a9401043 (f9407882)
<4>[ 218.126291] ---[ end trace 0000000000000000 ]---
<3>[ 218.134291] pstore: backend (ramoops) writing error (-28)
<0>[ 218.139676] Kernel panic - not syncing: Oops: Fatal exception
<2>[ 218.145402] SMP: stopping secondary CPUs
<0>[ 218.149310] Kernel Offset: disabled
<0>[ 218.152782] CPU features: 0x0,00000000,00000000,1000400b
<0>[ 218.158075] Memory Limit: none
Was there another patch that I missed?
Update: Grabbing this patch to include in my build now... https://github.com/rany2/openwrt/commit/18cc739263004d4846991c9afbc6ba45c39293a1
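The `Mem abort info` block in the oops above can be cross-checked by decoding the ESR value by hand. This is a minimal sketch, assuming the AArch64 ESR layout (EC in bits 31..26, IL in bit 25, ISS in bits 24..0, and for data aborts the fault status code in ISS bits 5..0 with WnR in bit 6); `decode_esr` is a hypothetical helper name.

```python
def decode_esr(esr: int) -> dict:
    """Split an AArch64 ESR value into the fields printed in the oops."""
    iss = esr & 0x1FFFFFF               # instruction specific syndrome, bits 24..0
    return {
        "ec": (esr >> 26) & 0x3F,       # exception class, bits 31..26
        "il": (esr >> 25) & 0x1,        # 1 means a 32-bit instruction
        "iss": iss,
        "fsc": iss & 0x3F,              # fault status code (data aborts)
        "wnr": (iss >> 6) & 0x1,        # 1 means the access was a write
    }


fields = decode_esr(0x96000004)         # ESR value from the oops above
print({name: hex(value) for name, value in fields.items()})
```

For `0x96000004` this yields EC `0x25`, FSC `0x04` and WnR `0`, which agrees with the `DABT (current EL)`, `level 0 translation fault` and `WnR = 0` lines in the log: a read from an invalid kernel-side pointer.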
@rany2 Hmm, I'm very new to patching. I decided to use quilt now, but I get this error from the patch:
Applying patch 9004-wifi-mt76-mt7915-do-not-use-event-format-to-get-.patch
patching file mt76_connac_mcu.h
Hunk #1 FAILED at 1216.
1 out of 1 hunk FAILED -- rejects in file mt76_connac_mcu.h
patching file mt7915/init.c
Hunk #1 succeeded at 515 (offset 21 lines).
patching file mt7915/mac.c
Hunk #1 succeeded at 1146 (offset -64 lines).
Hunk #2 succeeded at 1256 (offset -23 lines).
patching file mt7915/mcu.c
Hunk #1 succeeded at 3086 (offset -261 lines).
patching file mt7915/mcu.h
Hunk #1 succeeded at 163 (offset -100 lines).
patching file mt7915/mt7915.h
Hunk #1 succeeded at 495 with fuzz 1 (offset -175 lines).
patching file mt7915/regs.h
Hunk #1 succeeded at 311 (offset -12 lines).
Hunk #2 succeeded at 415 (offset -12 lines).
Hunk #3 succeeded at 564 (offset -19 lines).
Hunk #4 succeeded at 579 (offset -19 lines).
Patch 9004-wifi-mt76-mt7915-do-not-use-event-format-to-get-.patch does not apply (enforce with -f)
make[2]: *** [Makefile:636: /home/xize/openwrt-flint2-testing/build_dir/target-aarch64_cortex-a53_musl/linux-mediatek_filogic/mt76-2024.04.03~1e336a85/.quilt_checked] Error 1
Some guidance would be excellent. Do I need to apply this patch somewhere else, such as in mt76 itself? Or maybe it fails because I use kernel 6.6 in my builds.
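In output like the above, only the hunks marked `FAILED` need manual attention; hunks that succeeded with an offset or fuzz were still applied. A minimal sketch for pulling the failures out of such output follows. It assumes the `patching file ...` / `Hunk #N FAILED at L.` format shown here, and `failed_hunks` is a hypothetical helper name.

```python
import re

FILE_RE = re.compile(r"^patching file (\S+)")
HUNK_RE = re.compile(r"^Hunk #(\d+) (FAILED|succeeded) at (\d+)")


def failed_hunks(patch_output: str) -> dict:
    """Map each file to a list of (hunk number, line) pairs that failed."""
    failures = {}
    current_file = None
    for line in patch_output.splitlines():
        file_match = FILE_RE.match(line)
        if file_match:
            current_file = file_match.group(1)
            continue
        hunk_match = HUNK_RE.match(line)
        if hunk_match and current_file and hunk_match.group(2) == "FAILED":
            failures.setdefault(current_file, []).append(
                (int(hunk_match.group(1)), int(hunk_match.group(3)))
            )
    return failures


# Excerpt of the quilt output above.
sample = """\
patching file mt76_connac_mcu.h
Hunk #1 FAILED at 1216.
1 out of 1 hunk FAILED -- rejects in file mt76_connac_mcu.h
patching file mt7915/init.c
Hunk #1 succeeded at 515 (offset 21 lines).
patching file mt7915/mt7915.h
Hunk #1 succeeded at 495 with fuzz 1 (offset -175 lines).
"""

print(failed_hunks(sample))
```

Here the only reject is hunk 1 of `mt76_connac_mcu.h`, so that is the one file where the patch context differs from the tree being built.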
@xize This patch should apply cleanly to the regular mt76 repository. It didn't apply cleanly for you because of other patches in my repo that were built on top of this:
```diff
From e5b4c0323eb5575a4531d3967d12fb3ba6d835ea Mon Sep 17 00:00:00 2001
From: rany
```
Was there another patch that I missed?
Update: Grabbing this patch to include in my build now... rany2/openwrt@18cc739
I had to rebuild the patch to get it to apply cleanly in my build, but ended up with a clean build and so far things are looking a lot better. I'm not getting any `00005aed` or `000026ed` timeouts now with `multicast_to_unicast_all` enabled.
Will continue to let this cook and monitor it for a while to see if things hold up.
Sheesh... spoke too soon. Just hit this crash:
Wed May 1 13:56:36 2024 kern.err kernel: [ 9466.996682] mt798x-wmac 18000000.wifi: Message 000026ed (seq 14) timeout
Wed May 1 13:56:56 2024 kern.err kernel: [ 9487.444333] mt798x-wmac 18000000.wifi: Message 00002ced (seq 15) timeout
Wed May 1 13:57:16 2024 kern.err kernel: [ 9507.912542] mt798x-wmac 18000000.wifi: Message 000800c4 (seq 1) timeout
...
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.337131] ------------[ cut here ]------------
Wed May 1 13:58:03 2024 kern.warn kernel: [ 9554.341756] WARNING: CPU: 0 PID: 9731 at kthread_park+0x9c/0xb0
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.347667] Modules linked in: nft_fib_inet nf_flow_table_inet iptable_nat xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt7915e(O) mt76_connac_lib(O) mt76(O) mac80211(O) iptable_mangle iptable_filter ipt_REJECT ip_tables cfg80211(O) xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG x_tables tcp_bbr nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c compat(O) cls_flower act_vlan crypto_safexcel cls_bpf act_bpf sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact sha512_arm64 sha1_ce sha1_generic seqiv md5 geniv des_generic libdes authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.347844] gpio_button_hotplug(O) usbcore usb_common aquantia
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.442932] CPU: 0 PID: 9731 Comm: kworker/u8:0 Tainted: G O 6.6.28 #0
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.450826] Hardware name: GL.iNet GL-MT6000 (DT)
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.455513] Workqueue: mt76 mt7915_mac_reset_work [mt7915e]
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.461098] pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.468038] pc : kthread_park+0x9c/0xb0
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.471862] lr : mt7915_mac_reset_work+0x128/0xd28 [mt7915e]
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.477511] sp : ffffffc082d03ca0
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.480809] x29: ffffffc082d03ca0 x28: 0000000000000000 x27: ffffff800640fa20
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.487927] x26: ffffff800707a680 x25: ffffff800707a000 x24: ffffff8006402000
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.495044] x23: ffffffffffff25e0 x22: ffffff8000011000 x21: ffffff80008bc400
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.502161] x20: ffffff8001014f80 x19: ffffff8000d82e00 x18: 0000000000000000
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.509277] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.516392] x14: 0000000000000000 x13: 0000000000000020 x12: 0101010101010101
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.523508] x11: 0000000000000040 x10: ffffffc080b57470 x9 : ffffffc080b57468
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.530623] x8 : ffffffffffff6400 x7 : 0000000000000000 x6 : 0000000000000000
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.537739] x5 : ffffff8000401918 x4 : ffffff8000401980 x3 : 0000000000000000
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.544855] x2 : 0000000000000001 x1 : ffffffc080b57488 x0 : 0000000000000004
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.551970] Call trace:
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.554403] kthread_park+0x9c/0xb0
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.557881] mt7915_mac_reset_work+0x128/0xd28 [mt7915e]
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.563189] process_one_work+0x154/0x2a0
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.567185] worker_thread+0x2ac/0x48c
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.570919] kthread+0xdc/0xe8
Wed May 1 13:58:03 2024 kern.debug kernel: [ 9554.573961] ret_from_fork+0x10/0x20
Wed May 1 13:58:03 2024 kern.warn kernel: [ 9554.577522] ---[ end trace 0000000000000000 ]---
To be clear, I am building with the following:
bridger + WED enabled
multicast_to_unicast_all enabled
[ 11.705644] platform 15010000.wed: MTK WED WO Firmware Version: DEV_000000, Build Time: 20240411200554
[ 12.405383] mt798x-wmac 18000000.wifi: WM Firmware Version: ____000000, Build Time: 20240411200401
[ 12.489804] mt798x-wmac 18000000.wifi: WA Firmware Version: DEV_000000, Build Time: 20240411200542
I can confirm as well that mine crashed too, though it took a long time.
I use kernel 6.6.29 with the patch from @lukasz1992 and @rany2.
wireless crash:
___ieee80211 crash:
followed by WARNING: CPU: 1 PID: 17561 at kthread_park+0x9c/0xb0:
After this, my whole wired segment also stopped working; at least LuCI would no longer load, but I was still able to SSH in. When I ran reboot, the network became inaccessible and I had to power-cycle the router to get it working again. I think the ieee80211 crash might be an interesting one?
It took a pretty long time for these crashes to happen, and they occurred in exactly this order while I was playing GTA Online on my Ayaneo Geek 1S (Intel AX210), which makes heavy use of UDP streams; I had already been in game for 2+ hours.
Sorry, I have no other ideas than checking older versions (like 23.05 or mediatek-oem)
hi @Fail-Safe,
I have some news: a few days ago I found an option in my Windows settings for the AX210 driver called 'Transmit Power'. I don't know exactly what it does, since I only know this setting from routers; maybe it is some kind of flag that advertises something different to the AP to get priority over other devices?
By default it was set to Highest; I changed it to Lowest.
As a result the crashes stopped, and it has been stable for 3 days now. I was also crashing with igmpproxy and avahi off, but I'm not entirely sure allmulticast was off too on my multi-PSK phys; I read in the log that it left allmulticast mode, so I assume it was.
this is the screenshot of the settings:
I find it interesting that this option changed the crashing behaviour. I still seem to get disconnected sometimes, but all other wireless devices stay connected.
I'm not sure whether commits 513c131c6309712a51502870b041f45b4bd6a6d4 and 14d5ee9f336923cf693ebf56d75bee41782f8112 also fix it; my testing was done before these 2 commits.
@xize I was experiencing something similar with multi psk, it seems the allmulticast mode was also active.
@rany2
I'm also experiencing crashes with multi-psk on MT7981.
When using an OpenWrt snapshot without any patches to the mt76 driver, the chip completely restarts on its own and the wifi network reappears after a couple of seconds. All clients, including ones connected via the main PSK, get disconnected.
Then I tried the rany2/openwrt@18cc739 patch: the 0x5a messages stop appearing, but the chip still hangs, the driver shows a 0x26 timeout and restarts.
I then tried to compile the rany2/openwrt fork and since it applies a bunch of patches, when the chip hangs, it manages to recover without disconnecting clients, but shows the following:
[ 447.275349] mt798x-wmac 18000000.wifi: send message 000130ed timeout, try again(1).
[ 447.283349] mt798x-wmac 18000000.wifi:
[ 447.283349] phy0 L1 SER recovery completed.
[ 447.821897] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000004
[ 447.828811] mt798x-wmac 18000000.wifi:
[ 447.828811] phy0 L1 SER recovery start.
[ 447.837695] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000008
[ 447.854270] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000010
[ 447.861219] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000020
[ 447.868360] mt798x-wmac 18000000.wifi:
[ 447.868360] phy0 L1 SER recovery completed.
I'm assuming that 0x000130ed is message type 0x30 MCU_EXT_CMD_GET_TX_STAT.
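For anyone who wants to check that kind of reading against their own logs, here is a minimal sketch that splits the 8-digit word from the timeout lines into fields. The bit layout used here (low byte = command ID, next byte = extended command ID, bit 16 = query flag, bit 19 = command addressed to the WA firmware) is an assumption based on a reading of mt76's connac MCU header, so please verify it against the driver source you are building. The name `decode_mcu_message` is made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class McuMessage:
    raw: int        # the full 32-bit word as printed in the log
    cmd_id: int     # bits 0-7 (assumed: 0xed marks an extended command)
    ext_id: int     # bits 8-15 (assumed: extended command ID)
    is_query: bool  # bit 16 (assumed: query flag)
    to_wa: bool     # bit 19 (assumed: command goes to the WA firmware)


def decode_mcu_message(word: str) -> McuMessage:
    """Split a hex word such as '000130ed' into its assumed fields."""
    raw = int(word, 16)
    return McuMessage(
        raw=raw,
        cmd_id=raw & 0xFF,
        ext_id=(raw >> 8) & 0xFF,
        is_query=bool(raw & (1 << 16)),
        to_wa=bool(raw & (1 << 19)),
    )


if __name__ == "__main__":
    # Words taken from the logs in this thread.
    for word in ("000130ed", "000026ed", "00005aed", "000800c4"):
        m = decode_mcu_message(word)
        print(
            f"{word}: cmd=0x{m.cmd_id:02x} ext=0x{m.ext_id:02x} "
            f"query={m.is_query} wa={m.to_wa}"
        )
```

Under that assumed layout, 000130ed decodes to an extended command with ext ID 0x30 and the query bit set, which matches the 0x30 reading above; which command name 0x30 maps to still has to be checked in the header.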
The same setup works on MT7613, MT7612, MT7615, MT7603, client optimized MT7921k (n, ac, ax), and appears to not hang even on MT7975 (Asus RT-AX53U) even though it uses the same mt7915e module.
So I'm assuming that this is a firmware bug, so I tried all five firmware versions published on mtk-feeds, and it's similar with all, but the crashes don't happen as often with the latest firmware.
If possible, can someone explain to me what's the difference between stations connected to the main AP interface vs ones connected to AP_VLAN interface? The keys are different, but why would it cause it to crash?
I wonder if it might be some type of overheating issue, where the chip first starts timing out as a form of TX throttling(?) and then crashes when it gets pushed even further.
That raises the question: is multicast one of the factors that pushes the chip?
I noticed that when I changed the TX power setting on my Windows device (AX210), the only device on my network that was subject to the crashing (yes, I have 10+ devices), the crashes and seq timeout messages stopped.
I have been playing for 3 days, in sessions longer than 3 hours, and have not seen it reappear.
I have no clue how to check or confirm this, though. It would be nice if someone could tell me some commands whose output I can share, because this is an interesting observation.
I wonder if it might be some type of overheating issue, where the chip first starts timing out as a form of TX throttling(?) and then crashes when it gets pushed even further.
IMO, this shouldn't be the case, as transferring 100+ GB with a non-WDS station doesn't cause any crashes for me, and there shouldn't be any reason for thermal issues to have anything to do with VLAN stations.
That raises the question: is multicast one of the factors that pushes the chip?
AFAIK, multicast and broadcast is handled completely differently - these packets are sent using the lowest allowed rate so every station can receive them, changing them to unicast should improve compatibility with buggy firmware, but it doesn't for some reason.
I noticed that when I changed the TX power setting on my Windows device (AX210), the only device on my network that was subject to the crashing (yes, I have 10+ devices), the crashes and seq timeout messages stopped.
TX power didn't have anything to do with it in my case; setting TX power to 1 dBm or 22 dBm didn't change anything.
@Fail-Safe @zekica https://github.com/blocktrron/mt76/commit/7447213e9e655f7bab6f45e54053747c2f1104e4 what about applying this patch?
@lukasz1992 Thanks for making us aware of that patch! I did apply it and re-enabled multicast_to_unicast_all. I'm not seeing a full-on crash as of yet, but I'm seeing timeout messages showing up:
[ 318.691399] mt798x-wmac 18000000.wifi: Message 000800c4 (seq 3) timeout
[ 1040.369616] mt798x-wmac 18000000.wifi: Message 000026ed (seq 6) timeout
[ 1189.706201] mt798x-wmac 18000000.wifi: Message 000026ed (seq 7) timeout
[ 1210.164932] mt798x-wmac 18000000.wifi: Message 00002ced (seq 8) timeout
[ 1230.622232] mt798x-wmac 18000000.wifi: Message 00005aed (seq 9) timeout
[ 1251.080719] mt798x-wmac 18000000.wifi: Message 000026ed (seq 10) timeout
[ 1271.548743] mt798x-wmac 18000000.wifi: Message 00002ced (seq 11) timeout
[ 1291.996372] mt798x-wmac 18000000.wifi: Message 00005aed (seq 12) timeout
[ 1312.455490] mt798x-wmac 18000000.wifi: Message 000026ed (seq 13) timeout
[ 1332.912846] mt798x-wmac 18000000.wifi: Message 00002ced (seq 14) timeout
[ 1353.370877] mt798x-wmac 18000000.wifi: Message 00005aed (seq 15) timeout
[ 1373.830077] mt798x-wmac 18000000.wifi: Message 000026ed (seq 1) timeout
[ 1394.286904] mt798x-wmac 18000000.wifi: Message 000026ed (seq 10) timeout
[ 1414.755320] mt798x-wmac 18000000.wifi: Message 000800c4 (seq 11) timeout
[ 1435.203704] mt798x-wmac 18000000.wifi: Message 00002ced (seq 12) timeout
[ 1455.661331] mt798x-wmac 18000000.wifi: Message 00005aed (seq 13) timeout
[ 1476.120504] mt798x-wmac 18000000.wifi: Message 000026ed (seq 14) timeout
[ 1563.182051] mt798x-wmac 18000000.wifi: Message 00005aed (seq 14) timeout
[ 1865.864221] mt798x-wmac 18000000.wifi: Message 00005aed (seq 15) timeout
Oooof. Here we go ☹️
...
[ 2897.289155] mt798x-wmac 18000000.wifi: Message 00005aed (seq 12) timeout
[ 2899.943303] mt798x-wmac 18000000.wifi: Message 00005aed (seq 15) timeout
[ 3154.323249] mt798x-wmac 18000000.wifi: Message 000026ed (seq 1) timeout
[ 4056.053187] mt798x-wmac 18000000.wifi: Message 00005aed (seq 9) timeout
[ 4622.974265] mt798x-wmac 18000000.wifi: Message 00005aed (seq 7) timeout
[ 4836.364011] ------------[ cut here ]------------
[ 4836.368632] WARNING: CPU: 2 PID: 8668 at ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[ 4836.377372] Modules linked in: nft_fib_inet nf_flow_table_inet iptable_nat xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt7915e(O) mt76_connac_lib(O) mt76(O) mac80211(O) iptable_mangle iptable_filter ipt_REJECT ip_tables cfg80211(O) xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG x_tables tcp_bbr nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c compat(O) cls_flower act_vlan crypto_safexcel cls_bpf act_bpf sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact sha512_arm64 sha1_ce sha1_generic seqiv md5 geniv des_generic libdes authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd
[ 4836.377546] gpio_button_hotplug(O) usbcore usb_common aquantia
[ 4836.472635] CPU: 2 PID: 8668 Comm: kworker/u8:4 Tainted: G O 6.6.30 #0
[ 4836.480531] Hardware name: GL.iNet GL-MT6000 (DT)
[ 4836.485217] Workqueue: phy1 ieee80211_ba_session_work [mac80211]
[ 4836.491257] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 4836.498198] pc : ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[ 4836.504813] lr : ___ieee80211_stop_tx_ba_session+0x1d4/0x2f4 [mac80211]
[ 4836.511428] sp : ffffffc081273c80
[ 4836.514725] x29: ffffffc081273c80 x28: 0000000000000001 x27: ffffff800a623f00
[ 4836.521842] x26: ffffff800b6dc3b8 x25: ffffff80091e08a0 x24: ffffff80091e08a0
[ 4836.528959] x23: ffffffc078fbecc8 x22: ffffff80060080e8 x21: 0000000000000001
[ 4836.536074] x20: ffffff800a623f00 x19: ffffff800b6dc000 x18: 0000000000000000
[ 4836.543190] x17: 0000000000000100 x16: 001c000800000000 x15: 000102050028000d
[ 4836.550306] x14: 0000000000000000 x13: 0000000000000028 x12: 0000000000000002
[ 4836.557422] x11: 0000000000000040 x10: ffffffc080b67470 x9 : ffffffc080b67468
[ 4836.564537] x8 : ffffff8000401020 x7 : 0000000000000000 x6 : 0000000d573ff195
[ 4836.571652] x5 : 0000000001000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 4836.578767] x2 : 0000000000000001 x1 : 0000000000000002 x0 : 00000000fffffff4
[ 4836.585883] Call trace:
[ 4836.588314] ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[ 4836.594583] ieee80211_ba_session_work+0x418/0x444 [mac80211]
[ 4836.600333] process_one_work+0x154/0x2a0
[ 4836.604333] worker_thread+0x2ac/0x48c
[ 4836.608067] kthread+0xdc/0xe8
[ 4836.611110] ret_from_fork+0x10/0x20
[ 4836.614678] ---[ end trace 0000000000000000 ]---
[ 5044.308637] mt798x-wmac 18000000.wifi: Message 00002ced (seq 10) timeout
[ 5048.806140] mt798x-wmac 18000000.wifi: Message 00005aed (seq 7) timeout
[ 5201.019771] mt798x-wmac 18000000.wifi: Message 00005aed (seq 12) timeout
[ 5217.597460] mt798x-wmac 18000000.wifi: Message 00005aed (seq 14) timeout
[ 5320.095313] mt798x-wmac 18000000.wifi: Message 00005aed (seq 13) timeout
[ 5433.228725] mt798x-wmac 18000000.wifi: Message 000025ed (seq 4) timeout
[ 5437.221749] mt798x-wmac 18000000.wifi: Message 00005aed (seq 2) timeout
[ 5624.001329] ------------[ cut here ]------------
[ 5624.005946] WARNING: CPU: 0 PID: 9449 at ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[ 5624.014675] Modules linked in: nft_fib_inet nf_flow_table_inet iptable_nat xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt7915e(O) mt76_connac_lib(O) mt76(O) mac80211(O) iptable_mangle iptable_filter ipt_REJECT ip_tables cfg80211(O) xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG x_tables tcp_bbr nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c compat(O) cls_flower act_vlan crypto_safexcel cls_bpf act_bpf sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact sha512_arm64 sha1_ce sha1_generic seqiv md5 geniv des_generic libdes authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd
[ 5624.014840] gpio_button_hotplug(O) usbcore usb_common aquantia
[ 5624.109924] CPU: 0 PID: 9449 Comm: kworker/u8:6 Tainted: G W O 6.6.30 #0
[ 5624.117817] Hardware name: GL.iNet GL-MT6000 (DT)
[ 5624.122504] Workqueue: phy1 ieee80211_ba_session_work [mac80211]
[ 5624.128517] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 5624.135458] pc : ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[ 5624.142073] lr : ___ieee80211_stop_tx_ba_session+0x1d4/0x2f4 [mac80211]
[ 5624.148688] sp : ffffffc0811dbc80
[ 5624.151986] x29: ffffffc0811dbc80 x28: 0000000000000001 x27: ffffff8009901cc0
[ 5624.159102] x26: ffffff80009803b8 x25: ffffff80091e08a0 x24: ffffff80091e08a0
[ 5624.166218] x23: ffffffc078fbecc8 x22: ffffff80009840e8 x21: 0000000000000001
[ 5624.173335] x20: ffffff8009901cc0 x19: ffffff8000980000 x18: 0000000000000070
[ 5624.180450] x17: ffffffbfbf247000 x16: ffffffc080000000 x15: 00005aa8af65510d
[ 5624.187565] x14: 00005aa8af65510d x13: 0000000000000001 x12: 0000000000000002
[ 5624.194681] x11: 0000000000000040 x10: ffffffc080b67470 x9 : ffffffc080b67468
[ 5624.201796] x8 : ffffff8000401020 x7 : 0000000000000000 x6 : 0000000d573ff195
[ 5624.208911] x5 : 0000000001000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 5624.216027] x2 : 0000000000000001 x1 : 0000000000000002 x0 : 00000000fffffff4
[ 5624.223142] Call trace:
[ 5624.225575] ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[ 5624.231843] ieee80211_ba_session_work+0x418/0x444 [mac80211]
[ 5624.237592] process_one_work+0x154/0x2a0
[ 5624.241590] worker_thread+0x2ac/0x48c
[ 5624.245325] kthread+0xdc/0xe8
[ 5624.248367] ret_from_fork+0x10/0x20
[ 5624.251928] ---[ end trace 0000000000000000 ]---
[ 5680.260541] phy1-ap1: HW problem - can not stop rx aggregation for 20:69:80:xx:xx:xx tid 6
[ 5985.397401] mt798x-wmac 18000000.wifi: Message 00005aed (seq 8) timeout
[ 6852.354358] mt798x-wmac 18000000.wifi: Message 00005aed (seq 8) timeout
[ 7939.418743] mt798x-wmac 18000000.wifi: Message 00005aed (seq 10) timeout
[ 8721.653210] ------------[ cut here ]------------
[ 8721.657829] WARNING: CPU: 1 PID: 10406 at ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[ 8721.666644] Modules linked in: nft_fib_inet nf_flow_table_inet iptable_nat xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt7915e(O) mt76_connac_lib(O) mt76(O) mac80211(O) iptable_mangle iptable_filter ipt_REJECT ip_tables cfg80211(O) xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG x_tables tcp_bbr nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c compat(O) cls_flower act_vlan crypto_safexcel cls_bpf act_bpf sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact sha512_arm64 sha1_ce sha1_generic seqiv md5 geniv des_generic libdes authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd
[ 8721.666817] gpio_button_hotplug(O) usbcore usb_common aquantia
[ 8721.761905] CPU: 1 PID: 10406 Comm: kworker/u8:4 Tainted: G W O 6.6.30 #0
[ 8721.769888] Hardware name: GL.iNet GL-MT6000 (DT)
[ 8721.774575] Workqueue: phy1 ieee80211_ba_session_work [mac80211]
[ 8721.780620] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 8721.787561] pc : ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[ 8721.794177] lr : ___ieee80211_stop_tx_ba_session+0x1d4/0x2f4 [mac80211]
[ 8721.800791] sp : ffffffc0811f3c80
[ 8721.804089] x29: ffffffc0811f3c80 x28: 0000000000000001 x27: ffffff8010154c00
[ 8721.811206] x26: ffffff80011743b8 x25: ffffff80091e08a0 x24: ffffff80091e08a0
[ 8721.818322] x23: ffffffc078fbecc8 x22: ffffff80011700e8 x21: 0000000000000001
[ 8721.825438] x20: ffffff8010154c00 x19: ffffff8001174000 x18: ffffffffffffc8f8
[ 8721.832553] x17: ffffffffffffc800 x16: 0000000000006838 x15: 00000000000040f8
[ 8721.839669] x14: 0000000100010400 x13: 0000000000000000 x12: 0000000000000002
[ 8721.846784] x11: 0000000000000040 x10: ffffffc080b67470 x9 : ffffffc080b67468
[ 8721.853900] x8 : ffffff8000401020 x7 : 0000000000000000 x6 : 0000000d573ff195
[ 8721.861015] x5 : 0000000001000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 8721.868131] x2 : 0000000000000001 x1 : 0000000000000002 x0 : 00000000fffffff4
[ 8721.875247] Call trace:
[ 8721.877680] ___ieee80211_stop_tx_ba_session+0x2b4/0x2f4 [mac80211]
[ 8721.883949] ieee80211_ba_session_work+0x418/0x444 [mac80211]
[ 8721.889698] process_one_work+0x154/0x2a0
[ 8721.893697] worker_thread+0x2ac/0x48c
[ 8721.897431] kthread+0xdc/0xe8
[ 8721.900474] ret_from_fork+0x10/0x20
[ 8721.904036] ---[ end trace 0000000000000000 ]---
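With this many timeout lines, a quick tally per message word makes it easier to compare runs (for example before and after a firmware or patch change). The following is a minimal sketch that only assumes the `Message XXXXXXXX (seq N) timeout` line format shown in the logs above; the function name `tally_timeouts` and the embedded sample are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Matches the driver's timeout lines as they appear in the logs above.
TIMEOUT_RE = re.compile(r"Message ([0-9a-fA-F]{8}) \(seq (\d+)\) timeout")


def tally_timeouts(log_text: str) -> Counter:
    """Count timeout lines per 8-digit message word."""
    return Counter(m.group(1).lower() for m in TIMEOUT_RE.finditer(log_text))


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
[ 318.691399] mt798x-wmac 18000000.wifi: Message 000800c4 (seq 3) timeout
[ 1040.369616] mt798x-wmac 18000000.wifi: Message 000026ed (seq 6) timeout
[ 1189.706201] mt798x-wmac 18000000.wifi: Message 000026ed (seq 7) timeout
[ 1210.164932] mt798x-wmac 18000000.wifi: Message 00002ced (seq 8) timeout
[ 1230.622232] mt798x-wmac 18000000.wifi: Message 00005aed (seq 9) timeout
"""

if __name__ == "__main__":
    for word, count in tally_timeouts(SAMPLE).most_common():
        print(word, count)
```

To use it on a real router log, save the output of `dmesg` or `logread` to a file and pass its contents to `tally_timeouts`.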
Maybe this issue is a duplicate of https://github.com/openwrt/mt76/issues/690 ?
Maybe related, but I wouldn't say it's duplicate: the message shown is the same, but the underlying cause is probably not the same. #690 happens on MT7915 (MT7905+MT7975) PCIe device while this issue is for MT7981/MT7986 SoC.
The firmware running on the MCU is not the same. It's just that messages 0x26 and 0x5a are most often being exchanged with both.
[ 54.874089] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000004
[ 54.880981] mt798x-wmac 18000000.wifi:
[ 54.880981] phy0 L1 SER recovery start.
[ 54.888622] mt798x-wmac 18000000.wifi: Message 000025ed (seq 11) timeout
[ 54.889422] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000008
[ 55.131284] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000010
[ 55.138214] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000020
[ 55.395327] mt798x-wmac 18000000.wifi: send message 000025ed timeout, try again(1).
[ 55.404744] mt798x-wmac 18000000.wifi:
[ 55.404744] phy0 L1 SER recovery completed.
If your scenario involves IGMP/MLD multicast, use the following checks:
Check the IGMP snooping status with: cat /sys/class/net/br-lan/bridge/multicast_snooping
Check IGMP snooping multicast-to-unicast with: cat /sys/class/net/br-lan/brif/phyx-apx/multicast_to_unicast
Linux Upstream Commit: https://github.com/torvalds/linux/commit/6db6f0eae6052b70885562e1733896647ec1d807
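A small sketch of the same two checks as a script, which may be handy when several AP interfaces need to be inspected. It reads the sysfs paths quoted above; `br-lan` comes from those paths, while the port name `phy0-ap0` and the helper names `read_flag` and `multicast_status` are only assumptions for this example. It also assumes both files contain `0` or `1`.

```python
from pathlib import Path
from typing import Dict, Optional


def read_flag(path: Path) -> Optional[bool]:
    """Return True/False for a sysfs 0/1 file, or None if it cannot be read."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return None


def multicast_status(
    bridge: str, port: str, sysfs_root: Path = Path("/sys/class/net")
) -> Dict[str, Optional[bool]]:
    """Collect bridge-level snooping and per-port multicast-to-unicast state."""
    base = sysfs_root / bridge
    return {
        "multicast_snooping": read_flag(base / "bridge" / "multicast_snooping"),
        "multicast_to_unicast": read_flag(
            base / "brif" / port / "multicast_to_unicast"
        ),
    }


if __name__ == "__main__":
    # "phy0-ap0" is an example port name; substitute your own AP interface.
    print(multicast_status("br-lan", "phy0-ap0"))
```

A value of `None` means the file was missing or unreadable, for example because the interface is not a member of that bridge.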
We haven't tested the MAC80211 multicast-to-unicast feature. If it leads to a firmware hang or system error recovery (SER) log messages, that might be due to unexpected 802.11 unicast frames after the conversion.
@evelyn3648 hi, can you also take a look at the ap_vlan issue #881? I have narrowed it down to the firmware crashing when sending software GTK-encrypted packets (multicast/broadcast packets sent to an ap_vlan interface) while the receive queue is full.
[ 2490.845304] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000004
[ 2490.852186] mt798x-wmac 18000000.wifi:
[ 2490.852186] phy0 L1 SER recovery start.
[ 2490.852186] mt798x-wmac 18000000.wifi: Message 0000aded (seq 14) timeout
[ 2490.852192] mt798x-wmac 18000000.wifi: send message 0000aded timeout, try again(1).
[ 2490.860618] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000008
[ 2490.866527] mt798x-wmac 18000000.wifi: Message 0000aded (seq 15) timeout
[ 2490.882935] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000010
[ 2490.894714] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000020
[ 2491.325319] mt798x-wmac 18000000.wifi: send message 0000aded timeout, try again(2).
[ 2491.334460] mt798x-wmac 18000000.wifi:
[ 2491.334460] phy0 L1 SER recovery completed.
[73644.244424] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000004
[73644.251314] mt798x-wmac 18000000.wifi:
[73644.251314] phy0 L1 SER recovery start.
[73644.258995] mt798x-wmac 18000000.wifi: Message 00005aed (seq 3) timeout
[73644.259748] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000008
[73644.281249] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000010
[73644.288185] mt798x-wmac 18000000.wifi: phy0 SER recovery state: 0x00000020
[73644.761290] mt798x-wmac 18000000.wifi: send message 00005aed timeout, try again(1).
[73644.771618] mt798x-wmac 18000000.wifi:
[73644.771618] phy0 L1 SER recovery completed.
@Fail-Safe @Headcrabed @lukasz1992 I have an interesting observation to share regarding this:
The crash is caused by unicast AP-to-station (TX) packets with an IP packet length of 482 bytes or less. Packets of 483 bytes or more never cause a crash.
This only happens for packets sent via ieee80211_subif_start_xmit or ieee80211_convert_to_unicast. Packets for stations in the main AP are sent via ieee80211_8023_xmit, not sure why.
Disabling multicast-to-unicast disables this path in mac80211 and works around the problem, but it is the only path available for ap_vlan (my issue #881).
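For anyone trying to reproduce the 482/483-byte threshold with a UDP test stream, the payload size follows from simple header arithmetic. This sketch assumes that "IP packet length" means the IPv4 total length, that the IPv4 header carries no options (20 bytes), and that the transport is UDP (8-byte header); none of that is confirmed in the report above, and the helper names are made up for illustration.

```python
IPV4_HEADER_LEN = 20    # assumption: IPv4 header without options
UDP_HEADER_LEN = 8      # fixed UDP header size
CRASH_MAX_IP_LEN = 482  # largest IP length reported above to trigger the crash


def udp_payload_for_ip_len(ip_len: int) -> int:
    """UDP payload size that yields the given IPv4 total length."""
    payload = ip_len - IPV4_HEADER_LEN - UDP_HEADER_LEN
    if payload < 0:
        raise ValueError("IP length is smaller than the IPv4 + UDP headers")
    return payload


def in_reported_crash_range(ip_len: int) -> bool:
    """True if the length falls in the range reported to trigger the crash."""
    return ip_len <= CRASH_MAX_IP_LEN


if __name__ == "__main__":
    for ip_len in (CRASH_MAX_IP_LEN, CRASH_MAX_IP_LEN + 1):
        print(
            f"IP length {ip_len}: UDP payload {udp_payload_for_ip_len(ip_len)}, "
            f"in reported crash range: {in_reported_crash_range(ip_len)}"
        )
```

Under these assumptions, a UDP payload of 454 bytes sits right at the reported boundary and 455 bytes is just above it.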
I found a workaround for my issue and wrote on #881
What surfaces this underlying issue there is that mac80211 doesn't replace the default ieee80211_dataif_ops with the offloaded ieee80211_dataif_8023_ops on ap_vlan interfaces.
The issue here is the underlying one and still has to be investigated.
A workaround here may be to force those converted unicast packets to be transmitted via ieee80211_8023_xmit somehow.
Does the crash also happen on my version with some patches: https://github.com/lukasz1992/openwrt/tree/v23.05.3-lukasz1992 ?
Mediatek just dropped new firmware: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/0fdbc0e6d84bbc0216da2842a494bdf01f745c6c
The release notes claim "Fix MAC80211 multicast-to-unicast issue"
Glad to see that the newest firmware has already been added to OpenWrt's mt76 repo.
Mediatek just dropped new firmware: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/0fdbc0e6d84bbc0216da2842a494bdf01f745c6c
The release notes claim "Fix MAC80211 multicast-to-unicast issue"
I've been testing the new firmware for a few hours now and no crashes! This is looking good!
More verbose updates here: https://forum.openwrt.org/t/mt6000-custom-build-with-luci-and-some-optimization-kernel-6-6-x/185241/947?u=_failsafe
Over 1 day and 4 hours of uptime with no crashes to be seen. (!!!) I'd say we can finally put this issue to bed with the fix being the updated firmware as released here: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/0fdbc0e6d84bbc0216da2842a494bdf01f745c6c
Thanks to the Mediatek devs who figured this one out!
Several OpenWrt users have reported an issue with having multicast_to_unicast_all set on mt798x hardware. The issue presents as:
I was able to narrow down the issue to the multicast_to_unicast_all setting here: https://forum.openwrt.org/t/mt798x-wmac-18000000-wifi-message-xxxxxxxx-seq-5-timeout/175163/6?u=_failsafe
Several users have confirmed unsetting (disabling) this option avoids the crash for them as well.
This happens to me on snapshot build r25580-85ad6b9569. Hoping others can chime in here with any other particulars that might help narrow this down further.