skiffos / SkiffOS

Any Linux distribution, anywhere.
https://skiffos.com
MIT License

bananapi/m2ultra: kernel bug: stmmac panic on shutdown #307

Closed by paralin 9 months ago

paralin commented 9 months ago

https://lore.kernel.org/netdev/ZcoL0MseDC69s2_P@torres.zugschlus.de/#R

Error on shutdown:

[  449.023771] rcu: INFO: rcu_sched self-detected stall on CPU
[  449.029390] rcu:     0-....: (21004 ticks this GP) idle=0fdc/1/0x40000002 softirq=12003/12003 fqs=10318
[  449.038534] rcu:     (t=21056 jiffies g=16277 q=223 ncpus=4)
[  449.043941] CPU: 0 PID: 4462 Comm: ip Tainted: G         C         6.7.4 #1
[  449.050907] Hardware name: Allwinner sun8i Family
[  449.055612] PC is at stmmac_get_stats64+0x30/0x198
[  449.060426] LR is at dev_get_stats+0x3c/0x160
[  449.064790] pc : [<c06b9924>]    lr : [<c07bf7a8>]    psr: 200f0013
[  449.071057] sp : f1e6d9b8  ip : c3ca478c  fp : c23e0000
[  449.076285] r10: 00000000  r9 : c3ca4598  r8 : 00000000
[  449.081512] r7 : 00000001  r6 : 00000000  r5 : c23e3000  r4 : 00000001
[  449.088040] r3 : 000063eb  r2 : c23e2e08  r1 : c3ca46c4  r0 : c23e0000
[  449.094570] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  449.101708] Control: 10c5387d  Table: 429cc06a  DAC: 00000051
[  449.107464]  stmmac_get_stats64 from dev_get_stats+0x3c/0x160
[  449.113226]  dev_get_stats from rtnl_fill_stats+0x30/0x118
[  449.118723]  rtnl_fill_stats from rtnl_fill_ifinfo+0x720/0x135c
[  449.124656]  rtnl_fill_ifinfo from rtnl_dump_ifinfo+0x330/0x6a8
[  449.130588]  rtnl_dump_ifinfo from netlink_dump+0x16c/0x350
[  449.136178]  netlink_dump from __netlink_dump_start+0x1bc/0x280
[  449.142114]  __netlink_dump_start from rtnetlink_rcv_msg+0xf4/0x2f0
[  449.148394]  rtnetlink_rcv_msg from netlink_rcv_skb+0xb8/0x118
[  449.154241]  netlink_rcv_skb from netlink_unicast+0x1fc/0x2d8
[  449.160003]  netlink_unicast from netlink_sendmsg+0x1c8/0x440
[  449.165766]  netlink_sendmsg from sock_write_iter+0xa0/0x10c
[  449.171444]  sock_write_iter from vfs_write+0x338/0x398
[  449.176684]  vfs_write from ksys_write+0xbc/0xf0
[  449.181311]  ksys_write from ret_fast_syscall+0x0/0x54

Appears to be a kernel bug that has been present since 6.6.x (we are currently on 6.7.5).

paralin commented 9 months ago

Fix: https://lore.kernel.org/lkml/Zb8QwgYSh9sX8zBi@xhacker/T/

Merged into 6.8-rc4 already: https://github.com/torvalds/linux/commit/38cc3c6dcc09dc3a1800b5ec22aef643ca11eab8

Will cherry-pick the patch.
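
For context: the stall inside stmmac_get_stats64 is consistent with a reader spinning on a sequence counter that was left at an odd value. The fix is titled "net: stmmac: protect updates of 64-bit statistics counters", which points at unsynchronized writers on a 32-bit platform. The sketch below is a toy Python model of that failure mode, not the driver code. SeqCounter, try_read, serialized_writers and racing_writers are hypothetical names, the interleaving is scripted by hand rather than produced by real threads, and the retry limit exists only so the example terminates.

```python
class SeqCounter:
    """Toy model of a statistic guarded by a sequence counter."""

    def __init__(self):
        self.seq = 0    # even: stable, odd: a write is in progress
        self.value = 0  # stands in for a 64-bit statistics counter


def try_read(counter, max_retries=1000):
    """Reader loop: retry while a write is in progress or seq changed.

    Returns the snapshot, or None if every attempt failed. A reader
    without the retry limit would simply keep spinning.
    """
    for _ in range(max_retries):
        start = counter.seq
        if start & 1:
            continue  # writer in progress, try again
        snapshot = counter.value
        if counter.seq == start:
            return snapshot
    return None


def serialized_writers(counter):
    """Two writers that exclude each other: seq ends up even."""
    for _ in range(2):
        counter.seq += 1    # begin update
        counter.value += 1
        counter.seq += 1    # end update


def racing_writers(counter):
    """Two writers whose non-atomic 'begin' increments overlap."""
    a = counter.seq       # writer A loads seq
    b = counter.seq       # writer B loads the same seq
    counter.seq = a + 1   # A stores
    counter.seq = b + 1   # B stores the same value: one increment is lost
    counter.value += 2    # both writers update the statistic
    counter.seq += 1      # A ends its update
    counter.seq += 1      # B ends its update, leaving seq odd


ok = SeqCounter()
serialized_writers(ok)
print("serialized:", ok.seq, try_read(ok))

bad = SeqCounter()
racing_writers(bad)
print("racing:", bad.seq, try_read(bad))
```

In the racing case one "begin" increment is lost, so the counter stays odd after both writers finish and the reader can never take a consistent snapshot. That would match a CPU stuck in stmmac_get_stats64 until the RCU stall detector fires, assuming the driver's statistics follow this pattern.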

paralin commented 9 months ago

https://lore.kernel.org/netdev/20240219204421.2f6019c1@meshulam.tesarici.cz/

If you're running a 6.7 stable kernel, my patch has just been added to the 6.7-stable tree.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-6.7/net-stmmac-protect-updates-of-64-bit-statistics-counters.patch

However, lockdep has reported an issue with it:

https://lore.kernel.org/lkml/ea1567d9-ce66-45e6-8168-ac40a47d1821@roeck-us.net/

This new report has not yet been properly understood, but FWIW I've been running stable with my patch for over a month now.

Petr T