linux-sunxi / meta-sunxi

Official sunxi OpenEmbedded layer for Allwinner-based boards.
MIT License

Detected stalls on CPUs/tasks on Orange PI R1 with Allwinner H3 sun8i Family #365

Closed by b0ned1ger 1 year ago

b0ned1ger commented 1 year ago

Using linux/linux-mainline_5.15.35.bb with an unmodified defconfig, I periodically get dumps like this on the serial console:

[48533.084229] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[48533.090203] rcu:     3-...!: (0 ticks this GP) idle=45a/0/0x0 softirq=9273/9273 fqs=0  (false positive?)
[48533.099452]  (detected by 1, t=2103 jiffies, g=32685, q=49)
[48533.105041] Sending NMI from CPU 1 to CPUs 3:
[48533.109417] NMI backtrace for cpu 3
[48533.109431] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G           O      5.15.35 #1
[48533.109446] Hardware name: Allwinner sun8i Family
[48533.109453] PC is at arch_cpu_idle+0x38/0x3c
[48533.109478] LR is at arch_cpu_idle+0x34/0x3c
[48533.109492] pc : [<c010722c>]    lr : [<c0107228>]    psr: 60000013
[48533.109502] sp : c105ffa8  ip : c0c06014  fp : 00000000
[48533.109511] r10: c105e000  r9 : c0b5dc28  r8 : 00000000
[48533.109519] r7 : c0c05f90  r6 : c0c05f4c  r5 : 00000003  r4 : c105e000
[48533.109530] r3 : c0116780  r2 : 00000001  r1 : 00000000  r0 : 0ebfc45e
[48533.109540] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[48533.109554] Control: 10c5387d  Table: 4968006a  DAC: 00000051
[48533.109561] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G           O      5.15.35 #1
[48533.109574] Hardware name: Allwinner sun8i Family
[48533.109588] [<c010d7c0>] (unwind_backtrace) from [<c0109d9c>] (show_stack+0x10/0x14)
[48533.109618] [<c0109d9c>] (show_stack) from [<c080ab28>] (dump_stack_lvl+0x40/0x4c)
[48533.109644] [<c080ab28>] (dump_stack_lvl) from [<c0478d8c>] (nmi_cpu_backtrace+0xc4/0x110)
[48533.109670] [<c0478d8c>] (nmi_cpu_backtrace) from [<c010bf3c>] (do_handle_IPI+0x5c/0x12c)
[48533.109693] [<c010bf3c>] (do_handle_IPI) from [<c010c024>] (ipi_handler+0x18/0x20)
[48533.109717] [<c010c024>] (ipi_handler) from [<c01854c0>] (handle_percpu_devid_irq+0x78/0x13c)
[48533.109748] [<c01854c0>] (handle_percpu_devid_irq) from [<c017f20c>] (handle_domain_irq+0x5c/0x78)
[48533.109778] [<c017f20c>] (handle_domain_irq) from [<c048b8d8>] (gic_handle_irq+0x7c/0x90)
[48533.109802] [<c048b8d8>] (gic_handle_irq) from [<c0100b7c>] (__irq_svc+0x5c/0x78)
[48533.109824] Exception stack(0xc105ff58 to 0xc105ffa0)
[48533.109837] ff40:                                                       0ebfc45e 00000000
[48533.109853] ff60: 00000001 c0116780 c105e000 00000003 c0c05f4c c0c05f90 00000000 c0b5dc28
[48533.109870] ff80: c105e000 00000000 c0c06014 c105ffa8 c0107228 c010722c 60000013 ffffffff
[48533.109880] [<c0100b7c>] (__irq_svc) from [<c010722c>] (arch_cpu_idle+0x38/0x3c)
[48533.109907] [<c010722c>] (arch_cpu_idle) from [<c0157d48>] (do_idle+0x22c/0x2dc)
[48533.109939] [<c0157d48>] (do_idle) from [<c0158104>] (cpu_startup_entry+0x18/0x1c)
[48533.109969] [<c0158104>] (cpu_startup_entry) from [<40101530>] (0x40101530)
[48533.110416] rcu: rcu_sched kthread timer wakeup didn't happen for 2103 jiffies! g32685 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x42
[48533.332383] rcu:     Possible timer handling issue on cpu=2 timer-softirq=2762
[48533.339357] rcu: rcu_sched kthread starved for 2127 jiffies! g32685 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=2
[48533.349639] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[48533.359145] rcu: RCU grace-period kthread stack dump:
[48533.364457] task:rcu_sched       state:I stack:    0 pid:   12 ppid:     2 flags:0x00000000
[48533.372829] [<c080f2c4>] (__schedule) from [<c080f7e0>] (schedule+0x60/0xdc)
[48533.379915] [<c080f7e0>] (schedule) from [<c08149e0>] (schedule_timeout+0x84/0xf4)
[48533.387518] [<c08149e0>] (schedule_timeout) from [<c0191084>] (rcu_gp_fqs_loop+0x100/0x348)
[48533.395907] [<c0191084>] (rcu_gp_fqs_loop) from [<c0192ed8>] (rcu_gp_kthread+0xc8/0x14c)
[48533.404035] [<c0192ed8>] (rcu_gp_kthread) from [<c01494a4>] (kthread+0x140/0x160)
[48533.411555] [<c01494a4>] (kthread) from [<c0100130>] (ret_from_fork+0x14/0x24)
[48533.418806] Exception stack(0xc105bfb0 to 0xc105bff8)
[48533.423872] bfa0:                                     00000000 00000000 00000000 00000000
[48533.432070] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[48533.440265] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[48533.446890] rcu: Stack dump where RCU GP kthread last ran:
[48533.452385] Sending NMI from CPU 1 to CPUs 2:
[48533.456756] NMI backtrace for cpu 2
[48533.456769] CPU: 2 PID: 92 Comm: sugov:0 Tainted: G           O      5.15.35 #1
[48533.456786] Hardware name: Allwinner sun8i Family
[48533.456792] PC is at arch_counter_get_cntpct+0x4/0xc
[48533.456815] LR is at arch_timer_read_counter_long+0x14/0x18
[48533.456830] pc : [<c06259f4>]    lr : [<c010d124>]    psr: a0000013
[48533.456839] sp : c1ac5d38  ip : c1174200  fp : 0ccccb60
[48533.456848] r10: 0013d620  r9 : c0a4a408  r8 : 00000000
[48533.456857] r7 : c1ac4000  r6 : 6dbafac9  r5 : 00005dbf  r4 : c0d004b8
[48533.456867] r3 : c06259f0  r2 : 00030d40  r1 : 0000010f  r0 : 6dbb41c6
[48533.456878] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[48533.456891] Control: 10c5387d  Table: 4967c06a  DAC: 00000051
[48533.456898] CPU: 2 PID: 92 Comm: sugov:0 Tainted: G           O      5.15.35 #1
[48533.456912] Hardware name: Allwinner sun8i Family
[48533.456920] [<c010d7c0>] (unwind_backtrace) from [<c0109d9c>] (show_stack+0x10/0x14)
[48533.456947] [<c0109d9c>] (show_stack) from [<c080ab28>] (dump_stack_lvl+0x40/0x4c)
[48533.456972] [<c080ab28>] (dump_stack_lvl) from [<c0478d8c>] (nmi_cpu_backtrace+0xc4/0x110)
[48533.456997] [<c0478d8c>] (nmi_cpu_backtrace) from [<c010bf3c>] (do_handle_IPI+0x5c/0x12c)
[48533.457020] [<c010bf3c>] (do_handle_IPI) from [<c010c024>] (ipi_handler+0x18/0x20)
[48533.457043] [<c010c024>] (ipi_handler) from [<c01854c0>] (handle_percpu_devid_irq+0x78/0x13c)
[48533.457073] [<c01854c0>] (handle_percpu_devid_irq) from [<c017f20c>] (handle_domain_irq+0x5c/0x78)
[48533.457102] [<c017f20c>] (handle_domain_irq) from [<c048b8d8>] (gic_handle_irq+0x7c/0x90)
[48533.457126] [<c048b8d8>] (gic_handle_irq) from [<c0100b7c>] (__irq_svc+0x5c/0x78)
[48533.457147] Exception stack(0xc1ac5ce8 to 0xc1ac5d30)
[48533.457163] 5ce0:                   6dbb41c6 0000010f 00030d40 c06259f0 c0d004b8 00005dbf
[48533.457180] 5d00: 6dbafac9 c1ac4000 00000000 c0a4a408 0013d620 0ccccb60 c1174200 c1ac5d38
[48533.457193] 5d20: c010d124 c06259f4 a0000013 ffffffff
[48533.457202] [<c0100b7c>] (__irq_svc) from [<c06259f4>] (arch_counter_get_cntpct+0x4/0xc)
[48533.457231] [<c06259f4>] (arch_counter_get_cntpct) from [<c010d124>] (arch_timer_read_counter_long+0x14/0x18)
[48533.457261] [<c010d124>] (arch_timer_read_counter_long) from [<c046e658>] (__timer_delay+0x38/0x60)
[48533.457292] [<c046e658>] (__timer_delay) from [<c04c9b60>] (_regulator_do_set_voltage+0x160/0x468)
[48533.457324] [<c04c9b60>] (_regulator_do_set_voltage) from [<c04cda3c>] (regulator_set_voltage_rdev+0x98/0x254)
[48533.457350] [<c04cda3c>] (regulator_set_voltage_rdev) from [<c04cb138>] (regulator_do_balance_voltage+0x330/0x4b0)
[48533.457377] [<c04cb138>] (regulator_do_balance_voltage) from [<c04cd960>] (regulator_set_voltage_unlocked+0xd8/0x11c)
[48533.457403] [<c04cd960>] (regulator_set_voltage_unlocked) from [<c04cdc40>] (regulator_set_voltage+0x48/0x7c)
[48533.457430] [<c04cdc40>] (regulator_set_voltage) from [<c05fd670>] (_set_opp_voltage+0x30/0x8c)
[48533.457459] [<c05fd670>] (_set_opp_voltage) from [<c06009dc>] (_set_opp+0x1f8/0x554)
[48533.457488] [<c06009dc>] (_set_opp) from [<c0600e28>] (dev_pm_opp_set_rate+0xf0/0x214)
[48533.457516] [<c0600e28>] (dev_pm_opp_set_rate) from [<c0604cb8>] (__cpufreq_driver_target+0x16c/0x23c)
[48533.457544] [<c0604cb8>] (__cpufreq_driver_target) from [<c016dcd0>] (sugov_work+0x48/0x54)
[48533.457569] [<c016dcd0>] (sugov_work) from [<c0147e10>] (kthread_worker_fn+0x9c/0x1f4)
[48533.457592] [<c0147e10>] (kthread_worker_fn) from [<c01494a4>] (kthread+0x140/0x160)
[48533.457615] [<c01494a4>] (kthread) from [<c0100130>] (ret_from_fork+0x14/0x24)
[48533.457636] Exception stack(0xc1ac5fb0 to 0xc1ac5ff8)
[48533.457649] 5fa0:                                     00000000 00000000 00000000 00000000
[48533.457664] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[48533.457678] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000

Is this a known issue? If so, is there a fix for it?

nandra commented 1 year ago

@b0ned1ger I'm running a few boards with 5.15.35 but haven't experienced this issue. Did you try another kernel? Are you on the kirkstone branch? Thanks.

nandra commented 1 year ago

@b0ned1ger any update from you? Can we close this, please? Thanks.

b0ned1ger commented 1 year ago

Hey, sorry for the delay. I can only reproduce this when booting this kernel with an initramfs. Booting from the SD card does not seem to have this problem. This might be my bad.
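
For anyone comparing the initramfs and SD card boots, here is a rough sketch of a helper that counts RCU stall reports in a captured serial log. It is not part of meta-sunxi or the kernel. The function name `summarize_stalls` and the regular expressions are my own assumptions, based only on the line format in the dump above, and other kernel versions may format these lines differently.

```python
import re

# Header line of a stall report, e.g.
# "[48533.084229] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:"
STALL_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+rcu: INFO: (\w+) detected stalls on CPUs/tasks:"
)
# Per-CPU line that follows the header, e.g.
# "[48533.090203] rcu:     3-...!: (0 ticks this GP) idle=..."
CPU_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+rcu:\s+(\d+)-\S*:\s+\(")


def summarize_stalls(log_text):
    """Return a list of (timestamp, rcu_flavor, [stalled CPU numbers])."""
    reports = []
    current = None
    for line in log_text.splitlines():
        header = STALL_RE.match(line)
        if header:
            # Start a new report; CPU lines seen later are attached to it.
            current = (float(header.group(1)), header.group(2), [])
            reports.append(current)
            continue
        cpu = CPU_RE.match(line)
        if cpu and current is not None:
            current[2].append(int(cpu.group(1)))
    return reports
```

Running it over one log per boot method (for example `summarize_stalls(open("serial.log").read())`, where the file name is hypothetical) makes it easy to see whether stalls show up only in the initramfs case.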