crash-utility / crash

Linux kernel crash utility
https://crash-utility.github.io

[ARM64] failed to parse ramdump with crash 8.0.5 #189

Open byron-wang opened 2 months ago

byron-wang commented 2 months ago

~/temp/0827-2$ ./crash SYS_COREDUMP vmlinux

crash 8.0.5
Copyright (C) 2002-2024 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2024 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
Copyright (C) 2015, 2021 VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-linux-gnu --target=aarch64-elf-linux".
Type "show configuration" for configuration details.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help". Type "apropos word" to search for commands related to "word"...

crash: invalid kernel virtual address: ffffffda1ba01cc0 type: "kernel_config_data"
WARNING: cannot read kernel_config_data
crash: invalid kernel virtual address: ffffffda1c98aeb0 type: "possible"
WARNING: cannot read cpu_possible_map
crash: invalid kernel virtual address: ffffffda1c98aea8 type: "present"
WARNING: cannot read cpu_present_map
crash: invalid kernel virtual address: ffffffda1c98aea0 type: "online"
WARNING: cannot read cpu_online_map
crash: invalid kernel virtual address: ffffffda1c98aeb8 type: "active"
WARNING: cannot read cpu_active_map
crash: invalid kernel virtual address: ffffffda1cb9aa48 type: "shadow_timekeeper xtime_sec"
crash: invalid kernel virtual address: ffffffda1cb13ea0 type: "init_uts_ns"
WARNING: invalid linux_banner pointer: ffffffda1bbd3328
crash: vmlinux and SYS_COREDUMP do not match!

I then modified arm64.c as shown below, and crash loaded the dump normally. Please help check this change.

static int
arm64_set_va_bits_by_tcr(void)
{
    ulong value;

    if (arm64_get_vmcoreinfo(&value, "NUMBER(TCR_EL1_T1SZ)", NUM_DEC) ||     // from NUM_HEX to NUM_DEC
        arm64_get_vmcoreinfo(&value, "NUMBER(tcr_el1_t1sz)", NUM_DEC)) {     // from NUM_HEX to NUM_DEC

Thanks.

kylee0215 commented 2 months ago

Hi, in kernel v6.11-rc6, the kernel writes TCR_EL1_T1SZ to vmcoreinfo as hex in the following function [1].

void arch_crash_save_vmcoreinfo(void)
{
    VMCOREINFO_NUMBER(VA_BITS);
    /* Please note VMCOREINFO_NUMBER() uses "%d", not "%x" */
    vmcoreinfo_append_str("NUMBER(MODULES_VADDR)=0x%lx\n", MODULES_VADDR);
    vmcoreinfo_append_str("NUMBER(MODULES_END)=0x%lx\n", MODULES_END);
    vmcoreinfo_append_str("NUMBER(VMALLOC_END)=0x%lx\n", VMALLOC_END);
    vmcoreinfo_append_str("NUMBER(VMEMMAP_START)=0x%lx\n", VMEMMAP_START);
    vmcoreinfo_append_str("NUMBER(VMEMMAP_END)=0x%lx\n", VMEMMAP_END);
    vmcoreinfo_append_str("NUMBER(kimage_voffset)=0x%llx\n",
                        kimage_voffset);
    vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n",
                        PHYS_OFFSET);
    vmcoreinfo_append_str("NUMBER(TCR_EL1_T1SZ)=0x%llx\n",
                        get_tcr_el1_t1sz());

You can refer to this discussion[2] as well.

[1] https://elixir.bootlin.com/linux/v6.11-rc6/source/arch/arm64/kernel/vmcore_info.c#L33
[2] https://lists.crash-utility.osci.io/archives/list/devel@lists.crash-utility.osci.io/thread/LUCYZ5JBGEH2R422JFPJC5IUPMDRGCRS/
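
To make the difference between the two output styles in the function quoted above concrete, here is a minimal sketch of how a value of 25 is rendered by a "%d"-style format versus a "0x%llx"-style format. The helper names (format_number_decimal, format_number_hex, has_hex_prefix) are hypothetical and exist only for illustration; this is not the kernel's vmcoreinfo code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: render a NUMBER() entry in decimal,
 * the way a "%d"-style format would. */
int format_number_decimal(char *buf, size_t len, const char *name,
                          long long value)
{
    assert(buf != NULL && name != NULL);
    return snprintf(buf, len, "NUMBER(%s)=%lld\n", name, value);
}

/* Hypothetical helper: render a NUMBER() entry in hex with a "0x"
 * prefix, mirroring the "0x%llx" format strings quoted above. */
int format_number_hex(char *buf, size_t len, const char *name,
                      unsigned long long value)
{
    assert(buf != NULL && name != NULL);
    return snprintf(buf, len, "NUMBER(%s)=0x%llx\n", name, value);
}

/* Hypothetical helper: report whether the value after '=' starts
 * with "0x". */
int has_hex_prefix(const char *line)
{
    const char *eq;

    assert(line != NULL);
    eq = strchr(line, '=');
    return eq != NULL && strncmp(eq + 1, "0x", 2) == 0;
}
```

With these helpers, the value 25 comes out as NUMBER(TCR_EL1_T1SZ)=25 in the decimal form and as NUMBER(TCR_EL1_T1SZ)=0x19 in the hex form, so the presence or absence of the prefix tells a reader which format produced the entry.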

byron-wang commented 2 months ago

Hi,

Thanks for your reply.

I retrieved the vmcoreinfo from the ramdump; the relevant entries are listed below:

NUMBER(VA_BITS)=39
NUMBER(PHYS_OFFSET)=0x40000000
NUMBER(TCR_EL1_T1SZ)=25
NUMBER(KERNELPACMASK)=0x0

The "25" has no "0x" prefix, so could it be taken as decimal during parsing, or should it have been written as hexadecimal with the "0x" prefix here?

Thanks.
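
The effect of the base on this entry can be worked through with a small sketch. It assumes that NUM_HEX causes the text to be read as base 16 and NUM_DEC as base 10, and it relies on the ARMv8 rule that the TTBR1 virtual address size is 64 - T1SZ. The helper names parse_t1sz and va_bits_from_t1sz are hypothetical and are not part of crash.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse the text after '=' in the given base.
 * Base 0 lets strtoul() pick hex for a "0x" prefix and decimal
 * for a plain number such as "25". */
unsigned long parse_t1sz(const char *text, int base)
{
    assert(text != NULL);
    return strtoul(text, NULL, base);
}

/* ARMv8: number of virtual address bits for the TTBR1 region
 * is 64 - TCR_EL1.T1SZ. */
unsigned long va_bits_from_t1sz(unsigned long t1sz)
{
    return 64 - t1sz;
}
```

Read as base 16, "25" becomes 37 and yields 27 VA bits; read as base 10 it stays 25 and yields 39, which agrees with NUMBER(VA_BITS)=39 in the same dump. Parsing with base 0 would accept both "25" and "0x19" as 25, which is one possible way to tolerate either output format.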