Open adamliyi opened 1 year ago
With PR https://github.com/AmpereComputing/ampere-lts-kernel/pull/180, 'NMI IPI' is supported in the 5.15 kernel.
Kernel config:
set CONFIG_ARM64_PSEUDO_NMI=y
add irqchip.gicv3_pseudo_nmi=1 to the kernel cmdline
set CONFIG_LKDTM=y <------------ to trigger soft lockup for testing
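The prerequisites above can be checked before running the test. This is a minimal sketch, not part of the PR: `check_pseudo_nmi` is a hypothetical helper, and the two file arguments (a kernel config file and a file holding the kernel command line) are assumptions about where that information lives on the test machine.

```shell
# Sketch: verify the pseudo-NMI prerequisites listed in this issue.
# check_pseudo_nmi is a hypothetical helper name, not an existing tool.
# $1: kernel config file (assumed location, e.g. /boot/config-<release>)
# $2: file containing the kernel command line (e.g. /proc/cmdline)
check_pseudo_nmi() {
    config_file=$1
    cmdline_file=$2
    status=0

    # Both options must be built in (=y), as listed in the issue.
    for opt in CONFIG_ARM64_PSEUDO_NMI CONFIG_LKDTM; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "ok: ${opt}=y"
        else
            echo "missing: ${opt}=y"
            status=1
        fi
    done

    # Split the command line into one parameter per line and look for
    # an exact match, so that e.g. "...=10" is not accepted.
    if tr ' ' '\n' < "$cmdline_file" | grep -qxF 'irqchip.gicv3_pseudo_nmi=1'; then
        echo "ok: irqchip.gicv3_pseudo_nmi=1"
    else
        echo "missing: irqchip.gicv3_pseudo_nmi=1"
        status=1
    fi

    return "$status"
}
```

On many distributions this could be called as `check_pseudo_nmi "/boot/config-$(uname -r)" /proc/cmdline`; the config file path varies by distribution and is an assumption here.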
Enable 'all cpu backtrace' on soft lockup:
# cat /proc/sys/kernel/softlockup_all_cpu_backtrace
0
# echo 1 > /proc/sys/kernel/softlockup_all_cpu_backtrace
# cat /proc/sys/kernel/softlockup_all_cpu_backtrace
1
# echo SOFTLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT <-------- Trigger soft lockup
When the soft lockup is triggered, an 'NMI backtrace' is dumped for all CPUs (160 cores on Altra 2P):
[ 409.605281] x2 : b73f909504dff200 x1 : 0000000000000000 x0 : 0000ffffacb10738
[ 409.605291] NMI backtrace for cpu 37
[ 409.605312] CPU: 37 PID: 55495 Comm: grep Tainted: G L 5.15.23+ #3
[ 409.605326] Hardware name: WIWYNN Mt.Jade Server System B81.030Z1.0007/Mt.Jade Motherboard, BIOS 2.10.20220531 (SCP: 2.10.20220531) 2022/05/31
[ 409.605335] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 409.605341] pc : inode_permission+0x34/0x1d0
[ 409.605347] lr : link_path_walk+0x10c/0x3a0
[ 409.605349] sp : ffff800063283ac0
[ 409.605351] pmr_save: 000000e0
[ 409.605352] x29: ffff800063283ac0 x28: 0000000000000000 x27: 0000000000000001
[ 409.605356] x26: 2f2f2f2f2f2f2f2f x25: d0d0d0d0d0d0d0d0 x24: fefefefefefefeff
[ 409.605358] x23: ffff800063283c88 x22: 0000000000000000 x21: ffffb3bc7f3f7cd8
[ 409.605361] x20: 0000000000000081 x19: ffff07ff85d16280 x18: 0000000000000000
[ 409.605363] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 409.605366] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 409.605369] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffb3bc7e1d3574
[ 409.605372] x8 : 000000008d962615 x7 : 3662696c2f727375 x6 : 000000000003ffff
[ 409.605374] x5 : ffff8000086f3000 x4 : 0000000000000012 x3 : ffff07ff83f09ba0
[ 409.605377] x2 : 0000000000000081 x1 : ffff07ff85d16280 x0 : 000000000000000b
[ 409.605380] Call trace:
[ 409.605381] inode_permission+0x34/0x1d0
[ 409.605384] link_path_walk+0x10c/0x3a0
[ 409.605386] path_openat+0x1a4/0xef8
[ 409.605388] do_filp_open+0x94/0x110
[ 409.605390] do_sys_openat2+0x204/0x2e8
[ 409.605393] do_sys_open+0x84/0xa8
[ 409.605396] __arm64_sys_openat+0x2c/0x38
[ 409.605399] invoke_syscall+0x7c/0x100
[ 409.605402] el0_svc_common.constprop.3+0x170/0x1a0
[ 409.605404] do_el0_svc+0x68/0x80
[ 409.605407] el0_svc+0x68/0xd8
[ 409.605410] el0t_64_sync_handler+0x40/0xb8
[ 409.605413] el0t_64_sync+0x180/0x184
[ 409.605422] NMI backtrace for cpu 44
[ 409.605427] CPU: 44 PID: 40189 Comm: genksyms Tainted: G L 5.15.23+ #3
[ 409.605452] Hardware name: WIWYNN Mt.Jade Server System B81.030Z1.0007/Mt.Jade Motherboard, BIOS 2.10.20220531 (SCP: 2.10.20220531) 2022/05/31
[ 409.605462] pstate: 80001000 (Nzcv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
[ 409.605468] pc : 0000000000403988
[ 409.605468] lr : 0000000000403984
[ 409.605470] sp : 0000fffffa7a36b0
[ 409.605471] pmr_save: 000000e0
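With 160 CPUs the dump is long, so it can help to check a saved copy of the log for CPUs that never printed an "NMI backtrace for cpu N" line. The following is a minimal sketch under stated assumptions: `missing_nmi_cpus` is a hypothetical helper, the log is assumed to have been saved to a file (for example from `dmesg`), and the CPU count is passed in by the caller.

```shell
# Sketch: print the CPU numbers (0..ncpus-1) that have no
# "NMI backtrace for cpu N" line in a saved kernel log.
# missing_nmi_cpus is a hypothetical helper name, not an existing tool.
# $1: saved log file, $2: number of CPUs expected (e.g. 160)
missing_nmi_cpus() {
    log_file=$1
    ncpus=$2
    awk -v n="$ncpus" '
        # Record every CPU number that appears in a backtrace header.
        match($0, /NMI backtrace for cpu [0-9]+/) {
            s = substr($0, RSTART, RLENGTH)
            sub(/.*cpu /, "", s)
            seen[s + 0] = 1
        }
        # Report CPUs in the expected range that were never seen.
        END {
            for (c = 0; c < n; c++)
                if (!(c in seen)) print c
        }
    ' "$log_file"
}
```

For example, `dmesg > lockup.log; missing_nmi_cpus lockup.log 160` would print nothing if all 160 CPUs reported. The file name `lockup.log` is illustrative only.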
According to https://patchwork.kernel.org/project/linux-arm-kernel/cover/1604317487-14543-1-git-send-email-sumit.garg@linaro.org/ , it looks like the above patch may not be merged upstream in the near future?
[Marc Zyngier](https://patchwork.kernel.org/project/linux-arm-kernel/list/?submitter=187307) Jan. 5, 2021, 10:43 a.m. UTC | [#2](https://patchwork.kernel.org/comment/23872547/)
On 2021-01-05 10:34, Sumit Garg wrote:
> Do you have any further feedback on this patch-set?
None at the moment. We have tons of issues to solve with the arm64
interrupt entry code vs instrumentation at the moment, so it is
pretty much at the bottom of the priority list for now.
In doc: https://www.kernel.org/doc/Documentation/admin-guide/sysctl/kernel.rst
We might need to apply the NMI IPI patches to make this happen:
https://patchwork.kernel.org/project/linux-arm-kernel/cover/20190506082542.11357-1-liwei391@huawei.com/ : [0/3] arm64: Add support for on-demand backtrace by NMI-like IPI
https://lore.kernel.org/linux-arm-kernel/CAFA6WYO0+LQ=mB1spCstt0cNZ0G+sZu_+Wrv6BKSeXqF5SRq4A@mail.gmail.com/T/ : arm64: Add framework to turn an IPI as NMI
Review and backport these patches, then test. Target the 5.15 kernel first.