anaelorlinski / OpenWrt-NanoPi-R2S-R4S-Builds

OpenWRT Builds for NanoPi R2S & R4S from official Openwrt source code with minimal set of patches
MIT License

Kernel OOPS in clk_change_rate #19

Closed by koolkhel 2 years ago

koolkhel commented 2 years ago

Hello!

I'm getting fairly regular kernel oopses on a NanoPi R4S, after which all disk I/O stops:

[22511.247464] Unable to handle kernel paging request at virtual address fbff0000f2150160
[22511.248157] Mem abort info:
[22511.248403]   ESR = 0x96000004
[22511.248672]   EC = 0x25: DABT (current EL), IL = 32 bits
[22511.249136]   SET = 0, FnV = 0
[22511.249403]   EA = 0, S1PTW = 0
[22511.249678] Data abort info:
[22511.249962]   ISV = 0, ISS = 0x00000004
[22511.250299]   CM = 0, WnR = 0
[22511.250561] [fbff0000f2150160] address between user and kernel address ranges
[22511.251185] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[22511.251672] Modules linked in: pppoe ppp_async wireguard snd_usb_audio pppox ppp_generic nf_flow_table_ipv6 nf_flow_table_ipv4 nf_flow_table_inet libchacha20poly1305 libblake2s ipt_REJECT ebtable_nat ebtable_filter ebtable_broute chacha_neon zstd xt_u32 xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_socket xt_recent xt_quota xt_policy xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length2 xt_length xt_ipv4options xt_iprange xt_iface xt_hl xt_helper xt_esp xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connlabel xt_connbytes xt_condition xt_comment xt_cluster xt_addrtype xt_TPROXY xt_TCPMSS xt_REDIRECT xt_PROTO xt_NFQUEUE xt_NFLOG xt_NETMAP xt_MASQUERADE xt_LOGMARK xt_LOG xt_IPMARK xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CT xt_CLASSIFY xfrm_interface usbnet snd_usbmidi_lib snd_usb_caiaq slhc sch_cake rockchipdrm rc_core r8168 r8152 poly1305_neon phy_rockchip_inno_hdmi nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject_bridge nft_reject nft_redir
[22511.251739]  nft_quota nft_queue nft_objref nft_numgen nft_nat nft_meta_bridge nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_ct nft_counter nft_chain_nat nfnetlink_queue nfnetlink_log nf_tproxy_ipv6 nf_tproxy_ipv4 nf_tables_set nf_tables nf_socket_ipv6 nf_socket_ipv4 nf_reject_ipv4 nf_nat_ftp nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_conntrack_netlink nf_conntrack_ftp nf_conncount macvlan lzo libcurve25519_generic libchacha libblake2s_generic iptable_raw iptable_nat iptable_mangle iptable_filter ipt_rpfilter ipt_ah ipt_ECN ipt_CLUSTERIP ip6table_raw ip6t_rpfilter ip_tables ebtables ebt_vlan ebt_stp ebt_redirect ebt_pkttype ebt_mark_m ebt_mark ebt_limit ebt_among ebt_802_3 dw_mipi_dsi dw_hdmi_cec dw_hdmi drm_kms_helper crc_ccitt compat_xtables cec br_netfilter arptable_filter arpt_mangle arp_tables act_connmark sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_tcindex cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact configs cryptodev
[22511.259374]  xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_NPT ip6t_rt ip6t_mh ip6t_ipv6header ip6t_hbh ip6t_frag ip6t_eui64 ip6t_ah nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 ifb dummy ip6_vti ip_vti ipcomp6 xfrm6_tunnel esp6 ah6 xfrm4_tunnel ipcomp esp4 ah4 ip6_tunnel tunnel6 tunnel4 ip_tunnel veth tun snd_rawmidi snd_seq_device snd_pcm_oss snd_pcm snd_timer snd_mixer_oss snd_hwdep snd_compress snd soundcore xfrm_user xfrm_ipcomp af_key xfrm_algo vfat fat cifs dm_mirror dm_region_hash dm_log dm_crypt dm_mod dax nls_utf8 nls_cp437 vxlan udp_tunnel ip6_udp_tunnel crypto_user algif_rng algif_aead seqiv md4
[22511.266977]  ghash_generic gcm echainiv des_generic libdes deflate ctr ccm authenc arc4 crypto_acompress sysimgblt sysfillrect syscopyarea fb_sys_fops cfbimgblt cfbfillrect cfbcopyarea fb font drm drm_panel_orientation_quirks dwc2 fsl_mph_dr_of ehci_fsl gpio_button_hotplug btrfs xor zstd_decompress zstd_compress xor_neon raid6_pq lzo_decompress lzo_compress udc_core cbc encrypted_keys trusted tpm
[22511.277614] CPU: 4 PID: 152 Comm: sugov:4 Not tainted 5.4.179 #0
[22511.278137] Hardware name: FriendlyElec NanoPi R4S (DT)
[22511.278593] pstate: a0000005 (NzCv daif -PAN -UAO)
[22511.279021] pc : clk_change_rate+0xd8/0x2b0
[22511.279387] lr : clk_change_rate+0xf0/0x2b0
[22511.279752] sp : ffff8000110c3b50
[22511.280042] x29: ffff8000110c3b50 x28: ffff0000f0f631e0
[22511.280506] x27: ffff0000f0f5ec00 x26: ffff800010ca0f78
[22511.280970] x25: 0000000018519600 x24: 0000000000000000
[22511.281433] x23: ffff0000f6fbe098 x22: 0000000018519600
[22511.281897] x21: 000000006b49d200 x20: fbff0000f21501b8
[22511.282361] x19: ffff0000f2199700 x18: 0000000000000000
[22511.282824] x17: 0000000000000000 x16: 0000000000000000
[22511.283287] x15: 0000000000000000 x14: 0000000000000000
[22511.283751] x13: 0000000000000001 x12: 0000000000000001
[22511.284214] x11: 0000000000000001 x10: 00000000000008c0
[22511.284677] x9 : ffff8000110c3640 x8 : ffff0000f0f34a20
[22511.285140] x7 : 0000000000000000 x6 : 0000000000006808
[22511.285604] x5 : 0000000000000005 x4 : 0000000000000028
[22511.286067] x3 : 0000000000000000 x2 : 0000000000000009
[22511.286530] x1 : 0000000000000008 x0 : fbff0000f2150100
[22511.286994] Call trace:
[22511.287212]  clk_change_rate+0xd8/0x2b0
[22511.287547]  clk_change_rate+0xf0/0x2b0
[22511.287882]  clk_change_rate+0xf0/0x2b0
[22511.288218]  clk_change_rate+0xf0/0x2b0
[22511.288555]  clk_core_set_rate_nolock+0x13c/0x220
[22511.288966]  clk_set_rate+0x34/0x14c
[22511.289282]  dev_pm_opp_set_rate+0x2b0/0x550
[22511.289656]  set_target+0x3c/0x80
[22511.289947]  __cpufreq_driver_target+0x258/0x540
[22511.290352]  sugov_work+0x50/0x6c
[22511.290644]  kthread_worker_fn+0x9c/0x17c
[22511.290995]  kthread+0x14c/0x150
[22511.291279]  ret_from_fork+0x10/0x24
[22511.291597] Code: f102e000 54000061 1400000a 54000120 (f9403001)
[22511.292130] ---[ end trace 915af716f8f8ad2f ]---

After that, commands like "touch 1" hang indefinitely, and only a hard reboot helps.

I'm using:

root@OpenWrt:~# cat /etc/openwrt_version
r16485-59e7ae8d65
root@OpenWrt:~# cat /etc/openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='21.02-SNAPSHOT'
DISTRIB_REVISION='r16485-59e7ae8d65'
DISTRIB_TARGET='rockchip/armv8'
DISTRIB_ARCH='aarch64_generic'
DISTRIB_TAINTS='no-all'
DISTRIB_DESCRIPTION='AO Build@2022.02.14'

Could this be a power issue? The oops is in the clk_change_rate function every time, which seems suspicious.
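
A note on the trace above: the faulting address fbff0000f2150160 is x0 (fbff0000f2150100) plus 0x60, while other pointers in the register dump, such as x19, sit in the ffff0000... range. The sketch below XORs x0 against a guessed intended value. The value ffff0000f2150100 is an assumption that the log does not confirm, and xor64 is a hypothetical helper name.

```shell
# xor64 A B: print the XOR of two 16-digit hex values (no 0x prefix).
# It works on 32-bit halves, so it does not depend on 64-bit shell arithmetic.
xor64() {
    a=$1
    b=$2
    a_hi=${a%????????}; a_lo=${a#????????}   # first 8 / last 8 hex digits
    b_hi=${b%????????}; b_lo=${b#????????}
    printf '%08x%08x\n' $(( 0x$a_hi ^ 0x$b_hi )) $(( 0x$a_lo ^ 0x$b_lo ))
}

# x0 from the oops versus the assumed intended pointer (a guess, see above).
xor64 fbff0000f2150100 ffff0000f2150100   # prints 0400000000000000

# Offset between the faulting address and x0 (low 32 bits are enough here).
printf '0x%x\n' $(( 0xf2150160 - 0xf2150100 ))   # prints 0x60
```

Under that assumption the two values differ in a single bit (bit 58). That would be consistent with a one-bit corruption of a stored pointer, but it does not by itself prove a power or memory problem.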

anaelorlinski commented 2 years ago

In my experience, you can detect insufficient power by running coremark 2 or 3 times in a row. If the power adapter is too weak, the R4S will reboot.
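
The repeated-run check described above could be scripted roughly as follows. This is only a sketch: run_repeated is a hypothetical helper name, and it assumes the benchmark is installed and available on the PATH as `coremark`.

```shell
# run_repeated COUNT CMD [ARGS...]: run CMD COUNT times back to back,
# stopping early if one run fails.
run_repeated() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        echo "run $i of $count"
        "$@" || return 1
        i=$((i + 1))
    done
}

# Example (assumes a coremark binary is installed):
# run_repeated 3 coremark
```

If the board reboots partway through, the loop never reaches its last "run N of N" line, which is the symptom to look for.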

koolkhel commented 2 years ago

coremark did not produce an oops. The oops still happens sporadically, after a few hours of operation. I've enabled panic on oops, which helps, but the device still sometimes becomes unresponsive. The network LEDs remain lit and only a full power cycle helps.
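
For reference, panic on oops is usually enabled through the standard Linux sysctls below. This is a sketch of one possible setup, not necessarily the configuration used here: placing the lines in /etc/sysctl.conf and the 10-second reboot delay are both assumptions.

```
# Turn any kernel oops into a panic.
kernel.panic_on_oops=1
# Reboot automatically 10 seconds after a panic (illustrative value).
kernel.panic=10
```

With both settings the board restarts after an oops instead of staying up in a half-working state, although a hard hang that never reaches the panic path is not covered.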

koolkhel commented 2 years ago

I tried a current OpenWrt snapshot, which now apparently supports the NanoPi R4S. Uptime is already 60 hours with no issues.

anaelorlinski commented 2 years ago

Good to hear that snapshots work well in your case. I would be curious to know the cause of the oops, but on my side my R4S has an uptime of more than 15 days running my builds as a router.