coolsnowwolf / lede

Lean's LEDE source

[Redmi AX6000] Network drops temporarily under high-bandwidth throughput when packet steering is enabled #10241

Open xsm1997 opened 1 year ago

xsm1997 commented 1 year ago

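Bug/issue report template (delete it if you are submitting a suggestion)

1. About the issue you are submitting

Q: Have you searched the existing issues? (mark with "x")

On the Redmi AX6000, with packet steering enabled, high-bandwidth throughput causes a temporary loss of network connectivity.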

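2. Detailed description

(1) The specific problem

With packet steering enabled, or with RPS enabled on eth0 alone, high-bandwidth throughput triggers a watchdog timeout on the Redmi AX6000. The NIC then goes offline briefly and comes back up a few seconds later, causing a temporary loss of connectivity.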

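The test environment is an xray server on the internal network, reached through a transparent proxy provided by a plugin. The problem occurs very easily, but only during multi-threaded downloads (more than 8 threads), when CPU load is almost saturated. With a single-threaded download it almost never occurs. Disabling packet steering, or disabling RPS on eth0 alone, makes the problem go away, but throughput is then lower than with RPS enabled (with RPS enabled, a multi-threaded download reaches ~110 MB/s; with RPS disabled, a multi-threaded download reaches ~90 MB/s and a single-threaded download ~110 MB/s).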
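For anyone trying to reproduce the RPS on/off comparison above, here is a minimal sketch of inspecting and changing RPS for one interface through sysfs. It assumes RPS is controlled through the standard `/sys/class/net/<dev>/queues/rx-*/rps_cpus` files; the function names and the `SYSFS` override are made up for illustration, and the four-core mask in the usage note is an assumption about this router.

```shell
#!/bin/sh
# Sketch: inspect or change RPS on one interface via sysfs.
# SYSFS can be overridden so the functions can be tried on a scratch directory.
SYSFS="${SYSFS:-/sys}"

# Print the hex CPU mask covering the first N CPUs.
rps_mask() {
    printf '%x\n' $(( (1 << $1) - 1 ))
}

# Show the current rps_cpus mask of every RX queue of a device.
rps_show() {
    for f in "$SYSFS/class/net/$1"/queues/rx-*/rps_cpus; do
        [ -e "$f" ] && printf '%s: %s\n' "$f" "$(cat "$f")"
    done
    return 0
}

# Write a mask to every RX queue of a device; a mask of 0 disables RPS.
rps_set() {
    for f in "$SYSFS/class/net/$1"/queues/rx-*/rps_cpus; do
        [ -e "$f" ] && echo "$2" > "$f"
    done
    return 0
}
```

As root on the router, `rps_show eth0` lists the current masks, `rps_set eth0 0` turns RPS off for eth0, and `rps_set eth0 "$(rps_mask 4)"` spreads receive processing over four cores. The firmware's own packet steering script may overwrite these values when the interface is reconfigured.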

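Could this be a hardware defect specific to my unit? (I previously bought an RK3399 development board whose onboard wireless NIC, attached through USB 2.0, had a similar problem: it dropped as soon as the rate went up, showing up as a USB disconnect. Resoldering the wireless NIC chip finally fixed it.) Update: a friend's Redmi AX6000 also reproduces the problem, so an isolated hardware fault is now very unlikely.

I have tested this repo and x-wrt; both show similar behavior.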

(2) Router model and firmware version

Redmi AX6000, commit 3929a9889f0ec47c462e81cc846c539b45130772

(3) Detailed logs

Sat Oct  8 18:23:31 2022 kern.warn kernel: [ 1293.765000] ------------[ cut here ]------------
Sat Oct  8 18:23:31 2022 kern.info kernel: [ 1293.769622] NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out
Sat Oct  8 18:23:31 2022 kern.warn kernel: [ 1293.776595] WARNING: CPU: 2 PID: 0 at dev_watchdog+0x30c/0x314
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1293.782421] Modules linked in: xt_FULLCONENAT pppoe ppp_async wireguard pppox ppp_mppe ppp_generic mt7915e mt76_connac_lib mt76 mac80211 libchacha20poly1305 ipt_REJECT chacha_neon cfg80211 xt_time xt_tcpudp xt_state xt_socket xt_recent xt_quota xt_pkttype xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_iprange xt_helper xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_cgroup xt_addrtype xt_TPROXY xt_TCPMSS xt_REDIRECT xt_MASQUERADE xt_LOG xt_FLOWOFFLOAD xt_CT ts_fsm ts_bm tcp_bbr slhc poly1305_neon nf_tproxy_ipv6 nf_tproxy_ipv4 nf_socket_ipv6 nf_socket_ipv4 nf_reject_ipv4 nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_pptp nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_log_syslog nf_flow_table nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_netlink nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack_broadcast ts_kmp nf_conntrack_amanda nf_conncount macvlan libcurve25519_generic libchacha iptable_raw
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1293.782563]  iptable_nat iptable_mangle iptable_filter ip_tables hwmon crc_ccitt compat cls_flower asn1_decoder act_vlan crypto_safexcel cls_bpf act_bpf sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_tcindex cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact cryptodev xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_NPT ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 ip6_udp_tunnel udp_tunnel sit tunnel4 ip_tunnel tun zram zsmalloc crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha1_generic seqiv md5 des_generic libdes authenc arc4 leds_gpio gpio_button_hotplug
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1293.954475] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.15.72 #0
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1293.960463] Hardware name: Xiaomi Redmi Router AX6000 (DT)
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1293.965928] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1293.972868] pc : dev_watchdog+0x30c/0x314
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1293.976868] lr : dev_watchdog+0x30c/0x314
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1293.980861] sp : ffffffc008c1bda0
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1293.984160] x29: ffffffc008c1bda0 x28: 0000000000000140 x27: 00000000ffffffff
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1293.991276] x26: 0000000000000000 x25: 0000000000000002 x24: 0000000000000000
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1293.998392] x23: ffffff8000852480 x22: 0000000000000001 x21: ffffffc008ae6000
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.005508] x20: ffffff8000852000 x19: 0000000000000000 x18: 0000000000000116
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.012624] x17: ffffffc0173e8000 x16: ffffffc008c1c000 x15: ffffffc008afa220
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.019740] x14: 0000000000000342 x13: 0000000000000116 x12: ffffffc008c1bac8
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.026855] x11: ffffffc008b52220 x10: 00000000fffff000 x9 : ffffffc008b52220
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.033971] x8 : 0000000000000000 x7 : ffffffc008afa220 x6 : 0000000000000001
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.041086] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.048201] x2 : ffffff801feb6080 x1 : ffffffc0173e8000 x0 : 000000000000003f
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.055317] Call trace:
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.057750]  dev_watchdog+0x30c/0x314
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.061398]  call_timer_fn.constprop.0+0x24/0x80
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.066002]  __run_timers.part.0+0x20c/0x28c
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.070254]  run_timer_softirq+0x3c/0x74
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.074161]  _stext+0x124/0x2a0
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.077288]  __irq_exit_rcu+0xe0/0x100
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.081025]  irq_exit+0x10/0x20
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.084150]  handle_domain_irq+0x64/0x90
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.088057]  gic_handle_irq+0x54/0x130
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.091793]  call_on_irq_stack+0x28/0x54
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.095700]  do_interrupt_handler+0x54/0x60
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.099866]  el1_interrupt+0x30/0x50
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.103427]  el1h_64_irq_handler+0x18/0x24
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.107507]  el1h_64_irq+0x74/0x78
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.110893]  arch_cpu_idle+0x18/0x2c
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.114453]  do_idle+0xc4/0x144
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.117580]  cpu_startup_entry+0x24/0x60
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.121486]  secondary_start_kernel+0x134/0x144
Sat Oct  8 18:23:31 2022 kern.debug kernel: [ 1294.126000]  __secondary_switched+0x90/0x94
Sat Oct  8 18:23:31 2022 kern.warn kernel: [ 1294.130167] ---[ end trace b21df8e2122796f1 ]---
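
To check how often the timeout in the log above recurs, a small filter along these lines could be used. It is only a sketch: the function name `watchdog_events` is made up here, and it assumes the message keeps the exact `NETDEV WATCHDOG: <dev> (<driver>): transmit queue <n> timed out` wording shown in this log.

```shell
#!/bin/sh
# Sketch: reduce kernel log text on stdin to one line per watchdog timeout,
# printed as "<device> <driver> queue=<n>".
watchdog_events() {
    sed -n 's/.*NETDEV WATCHDOG: \([^ ]*\) (\([^)]*\)): transmit queue \([0-9]*\) timed out.*/\1 \2 queue=\3/p'
}
```

On the router this could be run as `logread | watchdog_events | sort | uniq -c` to count timeouts per device and queue.
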
coolsnowwolf commented 1 year ago

You could try testing with fullcone disabled.

xsm1997 commented 1 year ago

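You could try testing with fullcone disabled.

Same result with fullconenat disabled. I am testing on the internal network, LAN to LAN, without going through the WAN, so it should be unrelated to NAT.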

xsm1997 commented 1 year ago

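Disabling TSO on eth0 appears to fix the problem.

Command: ethtool -K eth0 tso off (requires the ethtool package).

In my tests, disabling TSO on the NIC does not affect hwnat: a direct gigabit connection still runs at almost 0% CPU usage.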
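
A short sketch of applying the command above and then confirming the result. It relies on `ethtool -k <dev>` printing a `tcp-segmentation-offload: on|off` line; the function names `tso_state` and `disable_tso` are made up for illustration.

```shell
#!/bin/sh
# Sketch: read the TSO state from `ethtool -k` output supplied on stdin.
# Prints "on" or "off"; a trailing "[fixed]" marker is ignored.
tso_state() {
    awk '$1 == "tcp-segmentation-offload:" { print $2; exit }'
}

# Turn TSO off on a device, then print the state the driver reports afterwards.
disable_tso() {
    dev="$1"
    if ! command -v ethtool >/dev/null 2>&1; then
        echo "ethtool is not installed" >&2
        return 1
    fi
    ethtool -K "$dev" tso off || return 1
    ethtool -k "$dev" | tso_state
}
```

Running `disable_tso eth0` as root should print `off` if the driver accepted the change. The setting does not survive a reboot unless it is reapplied at startup.
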

xsm1997 commented 1 year ago

@coolsnowwolf Would you consider disabling TSO for this model in the repo by hand until the NIC driver is fixed?

coolsnowwolf commented 1 year ago

We can write a workaround.

xiangfeidexiaohuo commented 1 year ago

Can the firmware you compiled be upgraded online?

iseeyou commented 1 year ago

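Can the firmware you compiled be upgraded online?

Mine can't. Every time I have to flash back to the stock firmware and go through the whole process again, which is a hassle.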

iseeyou commented 1 year ago

With my build, enabling software flow offloading and hardware flow offloading in Turbo ACC causes reboots. With them disabled it runs stably.

ramondelee commented 1 year ago

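I removed wireless and see no reboots. My symptom is that the PPPoE connection drops after about 5 days and needs several automatic reconnect attempts before it succeeds.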

yjd commented 1 year ago

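I built firmware for the k2p and flashed two units without problems, with TSO left on by default. Today I picked up another unit and flashed the same firmware: the speedtest download test was normal, but the connection dropped the moment the upload test started, and the router could only be recovered by power-cycling it. I found this thread, disabled TSO, and that fixed it. Thanks! To disable it automatically at boot, see https://forum.openwrt.org/t/how-to-make-ethtool-setting-persistent-on-br-lan/6433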
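
One possible way to reapply the workaround at every boot is to add the command to `/etc/rc.local`. This is not necessarily what the linked forum thread recommends. The sketch below assumes an rc.local-style file that may end in `exit 0` and that the affected interface is eth0; the helper name `add_to_rc_local` is made up.

```shell
#!/bin/sh
# Sketch: add the TSO workaround to an rc.local-style file exactly once,
# placing it before the first "exit 0" line if there is one.
CMD='ethtool -K eth0 tso off'

add_to_rc_local() {
    rc="$1"
    # Already present: nothing to do.
    grep -qxF "$CMD" "$rc" 2>/dev/null && return 0
    tmp="$rc.tmp.$$"
    if grep -qx 'exit 0' "$rc" 2>/dev/null; then
        # Insert the command just before the first "exit 0" line.
        awk -v cmd="$CMD" '!seen && $0 == "exit 0" { print cmd; seen = 1 } { print }' "$rc" > "$tmp"
    else
        # No "exit 0" line (or no file yet): append the command.
        { [ -f "$rc" ] && cat "$rc"; echo "$CMD"; } > "$tmp"
    fi
    # Copy the content back so the original file keeps its permissions.
    cat "$tmp" > "$rc"
    rm -f "$tmp"
}
```

On the router this would be called as `add_to_rc_local /etc/rc.local`. Calling it again leaves the file unchanged. The ethtool package has to be part of the image for the command to work at boot.
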