When crash is used to parse a ramdump (Qcom phone device) rather than a vmcore, the start command looks like:

crash vmlinux --kaslr=xxx DDRCS0_0.BIN@0x0000000080000000,... --machdep vabits_actual=39

The bt command then shows misleading backtrace information:

crash> bt 16930
PID: 16930 TASK: ffffff89b3eada00 CPU: 2 COMMAND: "Firebase Backgr"
0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
1 [ffffffc034c43850] __kvmnvhe$d.2314 at 6be732e004cf05a0
2 [ffffffc034c438b0] __kvmnvhe$d.2314 at 86c54c6004ceff80
3 [ffffffc034c43950] __kvmnvhe$d.2314 at 55d6f96003a7b120
4 [ffffffc034c439f0] __kvmnvhe$d.2314 at 9ccec46003a80a64
5 [ffffffc034c43ac0] __kvmnvhe$d.2314 at 8cf41e6003a945c4
6 [ffffffc034c43b10] __kvmnvhe$d.2314 at a8f181e00372c818
7 [ffffffc034c43b40] __kvmnvhe$d.2314 at 6dedde600372c0d0
8 [ffffffc034c43b90] __kvmnvhe$d.2314 at 62cc07e00373d0ac
9 [ffffffc034c43c00] __kvmnvhe$d.2314 at 72fb1de00373bedc
...
PC: 00000073f5294840 LR: 00000070d8f39ba4 SP: 00000070d4afd5d0
X29: 00000070d4afd600 X28: b4000071efcda7f0 X27: 00000070d4afe000
X26: 0000000000000000 X25: 00000070d9616000 X24: 0000000000000000
X23: 0000000000000000 X22: 0000000000000000 X21: 0000000000000000
X20: b40000728fd27520 X19: b40000728fd27550 X18: 000000702daba000
X17: 00000073f5294820 X16: 00000070d940f9d8 X15: 00000000000000bf
X14: 0000000000000000 X13: 00000070d8ad2fac X12: b40000718fce5040
X11: 0000000000000000 X10: 0000000000000070 X9: 0000000000000001
X8: 0000000000000062 X7: 0000000000000020 X6: 0000000000000000
X5: 0000000000000000 X4: 0000000000000000 X3: 0000000000000000
X2: 0000000000000002 X1: 0000000000000080 X0: b40000728fd27550
ORIG_X0: b40000728fd27550 SYSCALLNO: ffffffff PSTATE: 40001000

The raw stack data below shows that each saved lr (at fp+8) holds a pointer whose upper bits have already been replaced by a PAC prefix:

crash> bt -f
PID: 16930 TASK: ffffff89b3eada00 CPU: 2 COMMAND: "Firebase Backgr"
0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
1 [ffffffc034c43850] __kvmnvhe$d.2314 at 6be732e004cf05a0
...
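
As a rough illustration of why these values cannot be resolved to symbols, the sketch below checks whether bits 63..vabits of an address are all ones, which is what a kernel text address looks like when vabits_actual=39. The helper name is_canonical_kernel_address is hypothetical and not part of crash; the two sample values are copied from frames 0 and 1 above.

```python
VA_BITS = 39  # assumption: taken from "--machdep vabits_actual=39" in the start command


def is_canonical_kernel_address(addr: int, va_bits: int = VA_BITS) -> bool:
    """Return True when bits 63..va_bits of addr are all ones.

    Hypothetical helper for illustration only; crash does not expose this name.
    """
    upper = addr >> va_bits                      # keep only bits 63..va_bits
    all_ones = (1 << (64 - va_bits)) - 1         # 25 one-bits when va_bits is 39
    return upper == all_ones


plain_pc = 0xffffffe0036832d4    # frame 0, resolved correctly as __switch_to
signed_lr = 0x6be732e004cf05a0   # frame 1, shown with a bogus symbol

print(is_canonical_kernel_address(plain_pc))
print(is_canonical_kernel_address(signed_lr))
```

The low 39 bits of the signed value still carry the real address; only the upper bits differ, which is consistent with a PAC having been written into them.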

So we check CONFIG_ARM64_PTR_AUTH and CONFIG_ARM64_PTR_AUTH_KERNEL to confirm whether the PAC mechanism is enabled on this ramdump, then use vabits to work out which pointer bits carry the PAC. With the fix, bt shows the correct backtrace:

crash> bt 16930
PID: 16930 TASK: ffffff89b3eada00 CPU: 2 COMMAND: "Firebase Backgr"
0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
1 [ffffffc034c43850] __schedule at ffffffe004cf05a0
2 [ffffffc034c438b0] preempt_schedule_common at ffffffe004ceff80
3 [ffffffc034c43950] unmap_page_range at ffffffe003a7b120
4 [ffffffc034c439f0] unmap_vmas at ffffffe003a80a64
5 [ffffffc034c43ac0] exit_mmap at ffffffe003a945c4
6 [ffffffc034c43b10] __mmput at ffffffe00372c818
7 [ffffffc034c43b40] mmput at ffffffe00372c0d0
8 [ffffffc034c43b90] exit_mm at ffffffe00373d0ac
9 [ffffffc034c43c00] do_exit at ffffffe00373bedc

Use GENMASK to build the PAC mask from vabits and restore the PAC-signed pointers. The related GKI commit is here: https://lore.kernel.org/all/20230412160134.306148-4-mark.rutland@arm.com/
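
The arithmetic behind the fix can be sketched as follows. This is a minimal illustration, not the actual crash code: genmask mirrors the usual GENMASK(high, low) bit-mask definition, strip_kernel_pac is a hypothetical helper, and the sketch assumes that ORing the mask into the pointer (setting bits 63..vabits to one) is how a kernel pointer is restored. The input values are the signed addresses from the misleading backtrace above.

```python
def genmask(high: int, low: int) -> int:
    """Bit mask with bits low..high (inclusive) set, like GENMASK(high, low)."""
    return ((1 << (high - low + 1)) - 1) << low


def strip_kernel_pac(ptr: int, va_bits: int) -> int:
    """Hypothetical helper: force bits 63..va_bits of a kernel pointer to one.

    Assumption: the PAC occupies only bits 63..va_bits, so the low va_bits
    bits still hold the real address.
    """
    return ptr | genmask(63, va_bits)


VA_BITS = 39  # assumption: from "--machdep vabits_actual=39"

print(hex(genmask(63, VA_BITS)))                       # 0xffffff8000000000
print(hex(strip_kernel_pac(0x6be732e004cf05a0, VA_BITS)))  # 0xffffffe004cf05a0
print(hex(strip_kernel_pac(0x86c54c6004ceff80, VA_BITS)))  # 0xffffffe004ceff80
```

The two restored values match the __schedule and preempt_schedule_common addresses in the corrected backtrace.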