sbwml / luci-app-mosdns

A DNS forwarder - OpenWrt 🎁 MosDNS v5 is Ready! 🎉
https://github.com/IrineSistiana/mosdns
1.2k stars 232 forks

Starting too early causes specific devices to crash and reboot #253

Closed xlighting2017 closed 1 month ago

xlighting2017 commented 1 month ago

The START priority of mosdns is currently 51:

luci-app-mosdns/root/etc/init.d/mosdns

START=51
USE_PROCD=1

i.e. /etc/rc.d/S51mosdns
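To spell out the relationship between the two: the boot-order symlink under /etc/rc.d is named S, then the START value, then the script name. Below is a minimal sketch of that mapping. `rc_link_name` is a hypothetical helper written for illustration, not part of OpenWrt, and the demo runs against a throwaway file rather than the real init script.

```shell
#!/bin/sh
# Hypothetical helper (not an OpenWrt command): print the rc.d symlink
# name that corresponds to the START value found in an init script.
rc_link_name() {
    script="$1"
    # Take the first line of the form START=<number>
    start=$(sed -n 's/^START=\([0-9][0-9]*\).*/\1/p' "$script" | head -n 1)
    [ -n "$start" ] || return 1
    printf 'S%s%s\n' "$start" "$(basename "$script")"
}

# Demo on a scratch copy so nothing under /etc is touched
tmp=$(mktemp -d)
printf '#!/bin/sh /etc/rc.common\nSTART=51\nUSE_PROCD=1\n' > "$tmp/mosdns"
rc_link_name "$tmp/mosdns"   # prints S51mosdns
rm -rf "$tmp"
```

Scripts in /etc/rc.d run in lexical order at boot, so a lower number means an earlier start.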

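On some devices (ipq60xx, JDCloud Athena / 京东云雅典娜), starting mosdns too early causes a crash and reboot (the board I have reproduces it 100% of the time). When it triggers, the following log is printed and a kernel panic follows: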

[   17.632185] ath11k_pci 0000:01:00.0: failed to vdev 0 create peer for AP: -110
[   23.008175] ath11k_pci 0000:01:00.0: Timeout in receiving vdev delete response
[   26.001854] qca-nss 39000000.nss: Configuring additional NSS pbufs
[   26.010360] qca-nss 39000000.nss: Additional pbufs of size 3100672 got added to NSS
[   26.042684] Unable to handle kernel read from unreadable memory at virtual address 0000000000000000
[   26.042739] Mem abort info:
[   26.050622]   ESR = 0x0000000096000005
[   26.053365]   EC = 0x25: DABT (current EL), IL = 32 bits
[   26.057178]   SET = 0, FnV = 0
[   26.062669]   EA = 0, S1PTW = 0
[   26.065508]   FSC = 0x05: level 1 translation fault
[   26.068564] Data abort info:
[   26.073392]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[   26.076524]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[   26.081820]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[   26.086950] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000058d95000
[   26.092318] [0000000000000000] pgd=08000000567a0003, p4d=08000000567a0003, pud=08000000567a0003, pmd=0000000000000000
[   26.098673] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
[   26.109231] Modules linked in: ecm(O) jitterentropy_rng ath11k_pci(O) ath11k_ahb(O) ath11k(O) xt_DSCP wireguard nft_redir nft_nat nft_masq nft_fullcone(O) nft_flow_offload nft_fib_inet nft_ct nft_chain_nat nf_nat nf_flow_table_inet nf_flow_table nf_conntrack_netlink nf_conntrack mac80211(O) libchacha20poly1305 ipt_REJECT chacha_neon cfg80211(O) xt_time xt_tcpudp xt_tcpmss xt_statistic xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_comment xt_TCPMSS xt_LOG xt_HL xt_CLASSIFY sch_cake qrtr_smd qrtr_mhi qrtr qmi_helpers(O) ppp_mppe ppp_async poly1305_neon nft_tproxy nft_socket nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_quota nft_queue nft_numgen nft_log nft_limit nft_hash nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_compat nfnetlink_queue nf_tproxy_ipv6 nf_tproxy_ipv4 nf_tables nf_socket_ipv6 nf_socket_ipv4 nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv4 mhi macvlan libcurve25519_generic libcrc32c libchacha l2tp_ppp iptable_mangle iptable_filter ipt_ECN ip_tables compat(O)
[   26.109421]  sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact qca_nss_tunipip6(O) qca_nss_tun6rd(O) qca_nss_wifi_meshmgr(O) qca_nss_vxlanmgr(O) qca_nss_pptp(O) pptp qca_nss_pppoe(O) pppoe pppox qca_nss_map_t(O) qca_nss_lag_mgr(O) qca_nss_l2tpv2(O) ppp_generic slhc qca_nss_gre(O) qca_nss_bridge_mgr(O) qca_nss_vlan(O) xt_set x_tables ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ipmac ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink qca_mcs(O) bonding ip6_gre ip_gre gre ifb nat46(O) nf_defrag_ipv6 sit qca_nss_drv(O) ip6_tunnel tunnel6 tunnel4 udp_diag tcp_diag raw_diag inet_diag tun zram zsmalloc vxlan crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha512_generic sha512_arm64 sha1_ce seqiv sha3_generic
[   26.183182]  drbg michael_mic hmac geniv cmac arc4 leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom qca_nss_dp(O) qca_ssdk(O) gpio_button_hotplug(O) f2fs ext4 mbcache jbd2 aquantia hwmon crc_ccitt crc32c_generic crc32_generic
[   26.295199] CPU: 0 PID: 4061 Comm: dnsmasq Tainted: G           O       6.6.47 #0
[   26.315830] Hardware name: JDCloud AX6600 (DT)
[   26.323289] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   26.327634] pc : ath11k_peer_map_v2_event+0x40/0x434 [ath11k]
[   26.334490] lr : ath11k_dp_htt_htc_t2h_msg_handler+0x94/0x998 [ath11k]
[   26.340394] sp : ffffffc080003c10
[   26.346812] x29: ffffffc080003c10 x28: 0000000000000020 x27: 0000000000000001
[   26.350204] x26: 0000000000000000 x25: ffffff80072e35b0 x24: 000000000061e3fd
[   26.357322] x23: 0000000000000000 x22: 000000000001001e x21: ffffffc080003d70
[   26.364440] x20: 0000000000000000 x19: ffffff800460a200 x18: 0000000000000000
[   26.371559] x17: ffffffbfbf2df000 x16: ffffffc080000000 x15: 0000000000000000
[   26.378676] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[   26.385794] x11: ffffff8006ae801c x10: 0000000000000049 x9 : ffffffc0808a8850
[   26.392912] x8 : ffffffc08137c000 x7 : 0000000000000940 x6 : 0000000000000000
[   26.400029] x5 : 0000000000000061 x4 : 0000000000000061 x3 : 0000000000009b48
[   26.407147] x2 : 0000000000005dc0 x1 : 0000000000000000 x0 : 0000000000000061
[   26.414266] Call trace:
[   26.421375]  ath11k_peer_map_v2_event+0x40/0x434 [ath11k]
[   26.423638]  ath11k_dp_htt_htc_t2h_msg_handler+0x94/0x998 [ath11k]
[   26.429193]  ath11k_htc_rx_completion_handler+0x398/0x53c [ath11k]
[   26.435270]  ath11k_ce_per_engine_service+0x210/0x2f0 [ath11k]
[   26.441432]  ath11k_pcic_ext_irq_enable+0x1ac/0x298 [ath11k]
[   26.447248]  tasklet_action_common.isra.0+0x110/0x148
[   26.453063]  tasklet_action+0x24/0x30
[   26.458009]  handle_softirqs+0xfc/0x230
[   26.461655]  __do_softirq+0x14/0x20
[   26.465299]  ____do_softirq+0x10/0x1c
[   26.468773]  call_on_irq_stack+0x24/0x4c
[   26.472593]  do_softirq_own_stack+0x1c/0x28
[   26.476586]  irq_exit_rcu+0x90/0xc8
[   26.480491]  el0_interrupt+0x48/0xb0
[   26.483963]  __el0_irq_handler_common+0x18/0x24
[   26.487784]  el0t_64_irq_handler+0x10/0x1c
[   26.492036]  el0t_64_irq+0x178/0x17c
[   26.496206] Code: 12001cda a90573fb 12003c5b d28bb802 (f9400299) 
[   26.499941] ---[ end trace 0000000000000000 ]---
[   26.505928] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[   26.510619] SMP: stopping secondary CPUs
[   26.517301] Kernel Offset: disabled
[   26.521376] CPU features: 0x0,00000000,10000000,0000400b
[   26.524592] Memory Limit: none
[   27.130168] Rebooting in 3 seconds..

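My guess is that packets are being sent before the driver has finished initializing, or that there is some race condition with other services started at boot.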

After changing the START of mosdns to 99 (in practice: mv /etc/rc.d/S51mosdns /etc/rc.d/S99mosdns), the problem goes away: no more kernel panic or reboot.
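For anyone who wants to repeat that workaround with a safety check, here is a rough sketch. `move_start_link` and the `RC_DIR` override are hypothetical names introduced for this example; the demo runs in a scratch directory, not in the real /etc/rc.d.

```shell
#!/bin/sh
# Hypothetical helper: rename an rc.d start symlink from one START
# value to another. RC_DIR defaults to /etc/rc.d and can be overridden
# for a dry run.
move_start_link() {
    name="$1"; old="$2"; new="$3"
    dir="${RC_DIR:-/etc/rc.d}"
    # Refuse to continue if the expected symlink is not there
    [ -L "$dir/S${old}${name}" ] || {
        echo "no S${old}${name} in $dir" >&2
        return 1
    }
    mv "$dir/S${old}${name}" "$dir/S${new}${name}"
}

# Dry run in a scratch directory
RC_DIR=$(mktemp -d)
ln -s ../init.d/mosdns "$RC_DIR/S51mosdns"
move_start_link mosdns 51 99 && ls "$RC_DIR"   # prints S99mosdns
rm -rf "$RC_DIR"
unset RC_DIR
```

Note that this only renames the symlink. The START line in /etc/init.d/mosdns is unchanged, so the rename may not survive a package upgrade or the service being re-enabled.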

So my question: is START=51 set this early for a particular reason (a special requirement, history, or some design consideration), or can it be changed to 99?

sbwml commented 1 month ago

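The highest START value I can accept for this is S75. Try whether 75 works for you. If it still doesn't, there is no point in changing the init.d script here, and you will have to modify the files in your own firmware instead.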

xlighting2017 commented 1 month ago

It also boots normally with 75. If the default could be set to 75, I wouldn't need to maintain a fork. Thanks!

sbwml commented 1 month ago

Then 75 it is.
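With that agreed, the two header lines quoted at the top of the issue would presumably end up as below. This is a sketch that assumes the fix changes only the START value; the actual commit is linked in the closing comment and may differ.

```shell
# Assumed result of the agreed change: only START differs from the
# lines quoted in the original report.
START=75
USE_PROCD=1
```

The resulting boot symlink would then be /etc/rc.d/S75mosdns instead of /etc/rc.d/S51mosdns.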

xlighting2017 commented 1 month ago

closed in https://github.com/sbwml/luci-app-mosdns/commit/b7d0bec38ff301df4c397485d3d7731943ebc912